python - AttributeError: 'Tensor' object has no attribute 'append'
Problem description
I can't figure out why this code doesn't work. When I put the rewards into a list, I get an error message telling me the dimensions are incorrect. I don't know what to do.
I'm implementing a deep Q-network for reinforcement learning. r is a 2-D numpy array holding 1 divided by the distance between stops, so that closer stops earn higher rewards.
No matter what I try, I can't get the rewards to work. I'm new to TensorFlow, so this may just be down to my inexperience with TensorFlow placeholders, feed dicts, and the like.
Thanks in advance for any help.
observations = tf.placeholder('float32', shape=[None, num_stops])
# game states: r[stop], r[next_stop], r[third_stop]
actions = tf.placeholder('int32', shape=[None])
rewards = tf.placeholder('float32', shape=[None])  # +1, -1 with discounts

Y = tf.layers.dense(observations, 200, activation=tf.nn.relu)
Ylogits = tf.layers.dense(Y, num_stops)
sample_op = tf.random.categorical(logits=Ylogits, num_samples=1)

cross_entropies = tf.losses.softmax_cross_entropy(onehot_labels=tf.one_hot(actions, num_stops),
                                                  logits=Ylogits)
loss = tf.reduce_sum(rewards * cross_entropies)
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001, decay=.99)
train_op = optimizer.minimize(loss)

visited_stops = []
steps = 0
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Start at a random stop, initialize done to false
    current_stop = random.randint(0, len(r) - 1)
    done = False

    # reset everything
    while not done:  # play a game in x steps
        observations_list = []
        actions_list = []
        rewards_list = []

        # List all stops and their scores
        observation = r[current_stop]

        # Add the stop to the list of visited stops if it isn't already there
        if current_stop not in visited_stops:
            visited_stops.append(current_stop)

        # decide where to go
        action = sess.run(sample_op, feed_dict={observations: [observation]})

        # play it, output next state, reward if we got a point, and whether the game is over
        #game_state, reward, done, info = pong_sim.step(action)
        new_stop = int(action)
        reward = r[current_stop][action]

        if len(visited_stops) == num_stops:
            done = True
        if steps >= BATCH_SIZE:
            done = True

        steps += 1
        observations_list.append(observation)
        actions_list.append(action)
        rewards.append(reward)
        #rewards_list = np.reshape(rewards, [-1, 25])
        current_stop = new_stop

    #processed_rewards = discount_rewards(rewards, args.gamma)
    #processed_rewards = normalize_rewards(rewards, args.gamma)
    print(rewards)
    sess.run(train_op, feed_dict={observations: [observations_list],
                                  actions: [actions_list],
                                  rewards: [rewards_list]})
Solution
The line rewards.append(reward) causes the error, because your rewards variable is a Tensor — you defined it as rewards = tf.placeholder('float32', shape=[None]) — and you cannot append values to a Tensor like that. You probably meant to call rewards_list.append(reward).

Also, you are initializing the variables

observations_list = []
actions_list = []
rewards_list = []

inside the loop, so on every iteration the old values are overwritten with empty lists. You probably want those three lines before the while not done: line.
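Both fixes can be sketched together in plain Python, with no TensorFlow required; the stand-in values below are hypothetical placeholders for the real r lookups and sess.run calls in the question:

```python
# Corrected accumulation pattern: the lists are created ONCE, before the
# loop, and each step appends to the Python list rewards_list -- never to
# the rewards placeholder, which is a Tensor and has no .append() method.
observations_list = []
actions_list = []
rewards_list = []

done = False
step = 0
while not done:
    observation = [0.0] * 4    # stand-in for r[current_stop]
    action = step % 4          # stand-in for sess.run(sample_op, ...)
    reward = 1.0 / (step + 1)  # stand-in for r[current_stop][action]

    observations_list.append(observation)
    actions_list.append(action)
    rewards_list.append(reward)  # append to the list, not the placeholder

    step += 1
    done = step >= 5  # stand-in for the visited-stops / BATCH_SIZE checks

# After the loop the lists hold one entry per step and can be fed to
# sess.run(train_op, feed_dict={...}) as a single batch.
print(len(rewards_list))  # → 5
```

Because the lists now survive across iterations, the final feed_dict receives the whole episode rather than only the last step.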