Actor-critic-based algorithms