DdpgMlpActor¶
- class torchrl.modules.DdpgMlpActor(action_dim: int, mlp_net_kwargs: dict | None = None, device: DEVICE_TYPING | None = None)[source]¶
DDPG Actor class.
Presented in "CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING", https://arxiv.org/pdf/1509.02971.pdf
The DDPG Actor takes an observation vector as input and returns an action from it. It is trained to maximize the value returned by the DDPG Q-value network.
- Parameters:
action_dim (int) – length of the action vector.
mlp_net_kwargs (dict, optional) –
kwargs for the MLP. Defaults to
>>> {
...     'in_features': None,
...     'out_features': action_dim,
...     'depth': 2,
...     'num_cells': [400, 300],
...     'activation_class': nn.ELU,
...     'bias_last_layer': True,
... }
device (torch.device, optional) – device to create the module on.
Examples
>>> import torch
>>> from torchrl.modules import DdpgMlpActor
>>> actor = DdpgMlpActor(action_dim=4)
>>> print(actor)
DdpgMlpActor(
  (mlp): MLP(
    (0): LazyLinear(in_features=0, out_features=400, bias=True)
    (1): ELU(alpha=1.0)
    (2): Linear(in_features=400, out_features=300, bias=True)
    (3): ELU(alpha=1.0)
    (4): Linear(in_features=300, out_features=4, bias=True)
  )
)
>>> obs = torch.zeros(10, 6)
>>> action = actor(obs)
>>> print(action.shape)
torch.Size([10, 4])
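The training objective mentioned above (the actor is updated to maximize the Q-value) can be sketched with plain `torch.nn` modules. This is an illustrative assumption-laden example, not torchrl's own loss implementation: the actor below mirrors the default 400/300 ELU architecture, and `qnet` is a hypothetical stand-in Q-value network that scores concatenated (observation, action) pairs.

```python
import torch
from torch import nn

# Stand-in actor mirroring DdpgMlpActor's default architecture:
# two hidden layers of 400 and 300 units with ELU activations,
# mapping a 6-dim observation to a 4-dim action.
actor = nn.Sequential(
    nn.Linear(6, 400), nn.ELU(),
    nn.Linear(400, 300), nn.ELU(),
    nn.Linear(300, 4),
)

# Hypothetical Q-value network: scores a concatenated (obs, action)
# pair with a single scalar per sample. torchrl's own Q-network
# classes differ; this is only a sketch of the objective.
qnet = nn.Sequential(
    nn.Linear(6 + 4, 400), nn.ELU(),
    nn.Linear(400, 1),
)

obs = torch.randn(10, 6)
action = actor(obs)

# DDPG actor objective: maximize Q(s, pi(s)), i.e. minimize its
# negation. Gradients flow through the action back into the actor.
actor_loss = -qnet(torch.cat([obs, action], dim=-1)).mean()

optim = torch.optim.Adam(actor.parameters(), lr=1e-4)
optim.zero_grad()
actor_loss.backward()
optim.step()

print(action.shape)  # torch.Size([10, 4])
```

In practice the Q-network's parameters are frozen (or its optimizer simply not stepped) during the actor update, so only the policy moves toward higher-value actions.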