What is the alternative to the numpy.random.choice statement for tensors?

Problem description

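I am trying to implement a learning algorithm that uses the following code block to select an action according to a probability distribution, using np.random.choice: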

states.append(state)
probs = self.policy.forward(Variable(torch.from_numpy(state).float().unsqueeze(0)))
highest_prob_action = np.random.choice(self.num_actions, p=np.squeeze(probs.numpy()))
log_prob = torch.log(probs.squeeze(0)[highest_prob_action])
action_l.append(highest_prob_action)

I tried using torch.multinomial instead, as follows:

 highest_prob_action = np.random.choice(torch.multinomial(probs, self.num_actions).squeeze(0))

But it raises an error:

    130             states.append(state)
    131             probs = self.policy.forward(Variable(torch.from_numpy(state).float().unsqueeze(0)))
--> 132             highest_prob_action = np.random.choice(torch.multinomial(probs, self.num_actions).squeeze(0))
    133 #             highest_prob_action = np.random.choice(self.num_actions, p=np.squeeze(probs.numpy()))
    134             log_prob = torch.log(probs.squeeze(0)[highest_prob_action])

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Tags: python, numpy, pytorch

Solution
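
One possible approach, offered as a sketch rather than a definitive answer: torch.multinomial already draws indices according to the given probabilities, so it does not need to be wrapped in np.random.choice. With num_samples=1 it returns a single sampled action per row. In the attempt above, torch.multinomial(probs, self.num_actions) samples without replacement by default, so when every probability is nonzero it returns a permutation of all actions, and np.random.choice then picks uniformly from that permutation, which ignores probs. The RuntimeError itself is raised when probs contains inf, nan, or a negative entry; the cause is not visible from the snippet, so the policy output is worth checking. The helper name sample_action and the assumed shape (1, num_actions) are illustrative and do not come from the question.

```python
import torch


def sample_action(probs: torch.Tensor):
    """Sample one action index from `probs` and return it with its log-probability.

    Assumption: `probs` has shape (1, num_actions) and each row sums to 1,
    as the output of a softmax policy head would.
    """
    # Fail early with a clear message instead of the RuntimeError from multinomial.
    if not torch.isfinite(probs).all() or (probs < 0).any():
        raise ValueError("probs contains inf, nan, or negative entries")

    # Draw a single index according to the probabilities; result has shape (1, 1).
    action = torch.multinomial(probs, num_samples=1)

    # Pick the probability of the sampled action and take its log (0-dim tensor).
    log_prob = torch.log(probs.gather(1, action)).squeeze()
    return action.item(), log_prob


torch.manual_seed(0)
probs = torch.softmax(torch.randn(1, 4), dim=1)  # stand-in for the policy output
action, log_prob = sample_action(probs)
print(action, log_prob)
```

torch.distributions.Categorical(probs=probs) offers the same behavior through its sample() and log_prob() methods, which may read more naturally in a policy-gradient loop.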

