How to add noise when applying MADDPG to the particle env

Question

According to OpenAI's paper (https://arxiv.org/pdf/1706.02275.pdf), we should add noise to the policy to ensure exploration. The example code adds noise like this:

u = torch.rand_like(model_out)

policy = F.softmax(model_out - torch.log(-torch.log(u)), dim=-1)

This works well with the simple_spread env, but when I simply add scaled Gaussian noise to model_out instead, convergence takes much longer. How does this work?

Tags: pytorch, reinforcement-learning, multi-agent

Answer


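Adding noise affects backpropagation through that layer. For quantization layers, for example, people usually use a straight-through estimator (if I remember correctly, it basically passes the gradient through as if the layer were the identity, often combined with clipping).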
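As a rough illustration of the straight-through idea in its simplest form, the sketch below rounds values in the forward pass and hands the incoming gradient back unchanged in the backward pass. The class name `RoundSTE` is hypothetical and does not come from the original answer; real quantization layers often also clip the gradient.

```python
import torch


class RoundSTE(torch.autograd.Function):
    """Hypothetical example: round in forward, identity gradient in backward."""

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: pretend the rounding was the identity function.
        return grad_output


x = torch.tensor([0.2, 1.7, -0.6], requires_grad=True)
y = RoundSTE.apply(x)
y.sum().backward()  # x.grad is all ones even though round() is flat almost everywhere
```

Without the custom backward, the gradient of `torch.round` would be zero and nothing upstream of the layer would learn.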

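Note that if you clip gradients, the clipping should be done where the gradient passes through the noise layer. This paper shows how to clip gradients and then add noise. See Section 3.1, "Differentially Private SGD Algorithm", Algorithm 1 on page 3: https://arxiv.org/pdf/1607.00133.pdf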
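The clip-then-add-noise order can be sketched as follows. This is a simplification and not a faithful implementation of Algorithm 1: it clips the gradient of the whole batch rather than each per-example gradient, and `clip_norm` and `noise_std` are assumed values chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

clip_norm = 1.0   # assumed clipping threshold
noise_std = 0.1   # assumed noise scale, relative to clip_norm

x, target = torch.randn(8, 4), torch.randn(8, 2)
loss = nn.functional.mse_loss(model(x), target)

optimizer.zero_grad()
loss.backward()

# 1) Clip the total gradient norm first.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip_norm)
clipped_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))

# 2) Then add Gaussian noise to the already clipped gradients.
for p in model.parameters():
    p.grad.add_(torch.randn_like(p.grad) * noise_std * clip_norm)

optimizer.step()
```

Clipping before adding the noise keeps the noise magnitude meaningful relative to a bounded gradient; clipping afterwards would rescale the noise as well.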

But to answer your question, here is code you can use to add the noise:

import torch
import torch.nn as nn


class GaussianNoise(nn.Module):
    """Gaussian noise regularizer.

    Args:
        sigma (float, optional): relative standard deviation used to generate the
            noise. Relative means that it will be multiplied by the magnitude of
            the value you are adding the noise to. This means that sigma can be
            the same regardless of the scale of the vector.
        is_relative_detach (bool, optional): whether to detach the variable before
            computing the scale of the noise. If `False` then the scale of the noise
            won't be seen as a constant but something to optimize: this will bias the
            network to generate vectors with smaller values.
    """

    def __init__(self, sigma=0.1, is_relative_detach=True):
        super().__init__()
        self.sigma = sigma
        self.is_relative_detach = is_relative_detach

    def forward(self, x):
        # Noise is only applied in training mode; after model.eval() this is a no-op.
        if self.training and self.sigma != 0:
            # The scale is proportional to x; detaching keeps it out of the autograd graph.
            scale = self.sigma * x.detach() if self.is_relative_detach else self.sigma * x
            # randn_like draws N(0, 1) noise with the same shape, dtype and device as x.
            sampled_noise = torch.randn_like(x) * scale
            x = x + sampled_noise
        return x
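To see what `is_relative_detach` changes in practice, the short check below applies the same relative-noise formula as a plain function and compares the gradients. It is only a sketch; `noisy` is a hypothetical helper and not part of the class above.

```python
import torch

torch.manual_seed(0)
sigma = 0.1


def noisy(x, detach):
    """Hypothetical helper: x plus N(0, 1) noise scaled by sigma * x."""
    scale = sigma * (x.detach() if detach else x)
    return x + torch.randn_like(x) * scale


x1 = torch.tensor([1.0, 10.0, 100.0], requires_grad=True)
noisy(x1, detach=True).sum().backward()
# With detach, the noise is a constant for autograd, so the gradient is exactly 1.

x2 = torch.tensor([1.0, 10.0, 100.0], requires_grad=True)
noisy(x2, detach=False).sum().backward()
# Without detach, the gradient is 1 + sigma * eps, so the sampled noise leaks into it.
```

In the non-detached case the optimizer can reduce the noise by shrinking `x` itself, which is the bias toward smaller values that the docstring describes.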

*You may also want to scroll through this thread: https://discuss.pytorch.org/t/how-to-add-noise-to-mnist-dataset-when-using-pytorch/59745/17
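As for why the snippet in the question behaves differently from plain Gaussian noise: with uniform `u`, the term `-torch.log(-torch.log(u))` is a Gumbel(0, 1) sample, and the argmax of logits plus Gumbel noise is distributed exactly as a draw from `softmax(logits)` (the Gumbel-max trick). The softmax in the snippet is a differentiable relaxation of that argmax. This is a standard result rather than something stated in the answer above; the check below is a sketch with made-up logits that tests the sampling claim empirically.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.tensor([1.0, 0.0, -1.0])  # made-up example logits
n = 200_000

u = torch.rand(n, 3).clamp_min(1e-10)    # avoid log(0)
gumbel = -torch.log(-torch.log(u))       # Gumbel(0, 1) noise

# Gumbel-max: the argmax of (logits + gumbel) is a sample from softmax(logits).
samples = torch.argmax(logits + gumbel, dim=-1)
empirical = torch.bincount(samples, minlength=3).float() / n
expected = F.softmax(logits, dim=-1)

# Relaxed version, as in the question's snippet: each row is a soft one-hot vector.
relaxed = F.softmax(logits + gumbel, dim=-1)
```

So the Gumbel noise explores by sampling actions in proportion to the policy's own probabilities, whereas Gaussian noise added to the logits perturbs them in a way that does not correspond to sampling from the policy. That may be one reason the Gaussian variant converges more slowly, though this has not been verified here.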

