MarlGroupMapType
class torchrl.envs.MarlGroupMapType(value, names=None, *, module=None, qualname=None, type=None, start=1)[source]
Marl group map type.
As a feature of torchrl's multi-agent support, you can control how agents are grouped in your environment. Agents can be grouped together (their tensors stacked) to leverage vectorization when passing them through the same neural network, or split into separate groups when they are heterogeneous or should be processed by different neural networks. To group them, you simply pass a group_map at environment construction time; alternatively, you can pick one of the premade grouping strategies from this class, as sketched below.
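As a minimal sketch of both options, assuming a VMAS backend and that VmasEnv accepts a group_map argument (the case in recent torchrl releases); the scenario name, number of environments, and agent names are illustrative:

from torchrl.envs import MarlGroupMapType
from torchrl.envs.libs.vmas import VmasEnv  # assumes the vmas package is installed

# Option 1: a premade strategy -- stack all agents under a single "agents" key.
env = VmasEnv(
    scenario="balance",   # illustrative scenario
    num_envs=32,
    group_map=MarlGroupMapType.ALL_IN_ONE_GROUP,
)

# Option 2: an explicit mapping from group name to the agent names it contains,
# e.g. two heterogeneous teams handled by different networks. The names listed
# must match exactly the agent names reported by the backend.
custom_map = {
    "team_a": ["agent_0", "agent_1"],
    "team_b": ["agent_2", "agent_3"],
}
env = VmasEnv(scenario="balance", num_envs=32, group_map=custom_map)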
When group_map=MarlGroupMapType.ALL_IN_ONE_GROUP and the agents are ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts coming in and out of your environment will look like:

>>> print(env.rand_action(env.reset()))
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([4, 9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([4, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([4, 3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([4]))},
    batch_size=torch.Size([]))
>>> print(env.group_map)
{"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}
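If you need the concrete mapping rather than the enum value (for example to build per-group modules yourself), the strategy can be expanded into a dict; a short sketch, assuming MarlGroupMapType exposes a get_group_map(agent_names) helper as in recent torchrl versions, with the four agent names from the example above:

from torchrl.envs import MarlGroupMapType

agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]

# Assumption: get_group_map expands the strategy into an explicit
# {group_name: [agent_names]} dict.
group_map = MarlGroupMapType.ALL_IN_ONE_GROUP.get_group_map(agent_names)
print(group_map)
# expected: {"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}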
When group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT and the agents are ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts coming in and out of your environment will look like:

>>> print(env.rand_action(env.reset()))
TensorDict(
    fields={
        agent_0: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_1: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_2: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_3: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([]))},
    batch_size=torch.Size([]))
>>> print(env.group_map)
{"agent_0": ["agent_0"], "agent_1": ["agent_1"], "agent_2": ["agent_2"], "agent_3": ["agent_3"]}
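The group names become the nested keys of the tensordict, so downstream code reads each group with tuple indexing; a minimal sketch, assuming env was built with one of the two strategies shown above:

td = env.rand_action(env.reset())

# If env was built with ALL_IN_ONE_GROUP: a single stacked "agents" entry,
# with the agent dimension leading (shape [4, 9] in the example above).
stacked_actions = td["agents", "action"]

# If env was built with ONE_GROUP_PER_AGENT: one entry per agent name,
# with no stacked agent dimension (shape [9] per agent in the example above).
agent_0_action = td["agent_0", "action"]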