tf2rl.policies package

Submodules

tf2rl.policies.tfp_categorical_actor module

class tf2rl.policies.tfp_categorical_actor.CategoricalActor(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

__init__(state_shape, action_dim, units=(256, 256), hidden_activation='relu', name='CategoricalActor')
compute_prob(states)
call(states, test=False)

Calls the model on new inputs.

In this case, call just reapplies all ops in the graph to the new inputs (i.e., it builds a new computational graph from the provided inputs).

Parameters
  • inputs – A tensor or list of tensors (the states argument in the signature above).

  • training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.

  • mask – A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns

A tensor if there is a single output, or a list of tensors if there is more than one output.
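The signature above takes a test flag whose meaning this page does not document. As a minimal sketch, assuming test=True means "act greedily" and test=False means "sample from the categorical distribution", action selection could look like the following. The function select_action and its arguments are hypothetical and are not part of tf2rl; the sketch uses only the Python standard library.

```python
import random


def select_action(probs, test=False, rng=random):
    """Pick an action index from a categorical distribution.

    probs: list of non-negative floats that sum to 1.
    test:  ASSUMED semantics: True selects the most probable action,
           False draws a random action according to probs.
    rng:   random module or random.Random instance (hypothetical parameter).
    """
    indices = range(len(probs))
    if test:
        # Deterministic: index of the largest probability.
        return max(indices, key=lambda i: probs[i])
    # Stochastic: draw one index weighted by probs.
    return rng.choices(indices, weights=probs, k=1)[0]


probs = [0.1, 0.7, 0.2]
greedy = select_action(probs, test=True)
sampled = select_action(probs, test=False, rng=random.Random(0))
```

Passing a seeded random.Random instance keeps the stochastic branch reproducible, which is convenient when comparing evaluation runs.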

compute_entropy(states)
compute_log_probs(states, actions)

Computes the log probabilities of state-action pairs.

Parameters
  • states – tf.Tensor. Input tensors fed to the neural network.

  • actions – tf.Tensor. Action indices, not one-hot vectors; they are converted to one-hot vectors inside this function.

Returns

Log probabilities
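The one-hot conversion described above can be illustrated without TensorFlow. The following is a rough sketch in plain Python, not the library's implementation: one_hot, categorical_log_probs, and the eps stabilizer are hypothetical names, and the action probabilities are assumed to come from the network's output for each state.

```python
import math


def one_hot(index, depth):
    """Return a one-hot list of length depth with a 1.0 at index."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def categorical_log_probs(probs_batch, actions, eps=1e-8):
    """Log probability of each integer action under its distribution.

    probs_batch: list of per-state probability lists (assumed network output).
    actions:     list of integer action indices, NOT one-hot vectors.
    eps:         small constant to avoid log(0) (an assumption of this sketch).
    """
    log_probs = []
    for probs, action in zip(probs_batch, actions):
        mask = one_hot(action, len(probs))
        # Multiplying by the one-hot mask keeps only log p(action).
        log_probs.append(
            sum(m * math.log(p + eps) for m, p in zip(mask, probs))
        )
    return log_probs


probs_batch = [[0.1, 0.7, 0.2], [0.5, 0.25, 0.25]]
actions = [1, 0]
log_probs = categorical_log_probs(probs_batch, actions)
```

Each entry of log_probs is the logarithm of the probability assigned to the chosen action, so the result has one value per state-action pair.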

class tf2rl.policies.tfp_categorical_actor.CategoricalActorCritic(*args, **kwargs)

Bases: tf2rl.policies.tfp_categorical_actor.CategoricalActor

__init__(*args, **kwargs)
call(states, test=False)

Calls the model on new inputs.

In this case, call just reapplies all ops in the graph to the new inputs (i.e., it builds a new computational graph from the provided inputs).

Parameters
  • inputs – A tensor or list of tensors (the states argument in the signature above).

  • training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.

  • mask – A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns

A tensor if there is a single output, or a list of tensors if there is more than one output.

tf2rl.policies.tfp_gaussian_actor module

class tf2rl.policies.tfp_gaussian_actor.GaussianActor(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

LOG_STD_CAP_MAX = 2
LOG_STD_CAP_MIN = -20
EPS = 1e-06
__init__(state_shape, action_dim, max_action, units=(256, 256), hidden_activation='relu', state_independent_std=False, squash=False, name='gaussian_policy')
call(states, test=False)

Computes actions and the log probabilities of the selected actions.
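This page lists the class constants and the squash, max_action, and test arguments but does not say how they interact. The sketch below shows one common way such a policy is computed. It is an assumption, not the confirmed tf2rl implementation: the log standard deviation is clipped to [LOG_STD_CAP_MIN, LOG_STD_CAP_MAX], an action is sampled (or the mean is taken when test=True), and, if squash is set, a tanh correction using EPS is applied to the log probability. The function gaussian_action and its parameters mean, log_std, and rng are hypothetical.

```python
import math
import random

# Values copied from the class attributes listed above.
LOG_STD_CAP_MAX = 2
LOG_STD_CAP_MIN = -20
EPS = 1e-06


def gaussian_action(mean, log_std, max_action=1.0, squash=False,
                    test=False, rng=random):
    """Sample an action from a diagonal Gaussian and return its log prob.

    mean, log_std: per-dimension lists (assumed network outputs).
    test:          ASSUMED to mean "use the mean instead of sampling".
    """
    # Clip the log standard deviation to the capped range.
    log_std = [min(max(ls, LOG_STD_CAP_MIN), LOG_STD_CAP_MAX)
               for ls in log_std]
    std = [math.exp(ls) for ls in log_std]

    if test:
        raw = list(mean)
    else:
        raw = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]

    # Log density of a diagonal Gaussian, summed over action dimensions.
    log_prob = sum(
        -0.5 * ((a - m) / s) ** 2 - ls - 0.5 * math.log(2.0 * math.pi)
        for a, m, s, ls in zip(raw, mean, std, log_std)
    )

    actions = raw
    if squash:
        # tanh squashing with the change-of-variables correction;
        # EPS keeps the logarithm finite when tanh saturates.
        actions = [math.tanh(a) for a in raw]
        log_prob -= sum(math.log(1.0 - a * a + EPS) for a in actions)

    return [max_action * a for a in actions], log_prob


actions, log_prob = gaussian_action(
    [0.0, 0.0], [5.0, -30.0], max_action=2.0, test=True
)
```

In this usage example the requested log standard deviations 5.0 and -30.0 fall outside the caps, so the sketch evaluates the density with 2 and -20 instead.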

compute_log_probs(states, actions)
compute_entropy(states)

Module contents