TensorFlow operations on non-zero vectors

Problem description

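I have spent about two hours on this and cannot find a solution. The closest thing to what I need is probably a boolean mask, but I am still missing the next step.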

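My neural network was not learning, so I started inspecting every step it performs, and sure enough I found a problem: because my input layer is sparse, the bias term gets added in too many places. The one peculiarity of my setup is that the last time matrix is a zero matrix. I will show a screenshot of my notebook first and then the code.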

Screenshot: (image not available)

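I do not want the bias term to be added at positions where the whole time slice is a zero matrix. Perhaps I could apply the operation to a matrix filtered with a boolean mask?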

Here is the code:

import tensorflow as tf
import numpy as np

dim = 4
# batch x time x events x dim
tensor = np.random.rand(1, 3, 4, dim)
zeros_last_time = np.zeros((4, dim))
tensor[0][2] = zeros_last_time

# dtype is used by the variables below, so it has to be defined first
dtype = tf.float64
input_layer = tf.placeholder(dtype, shape=(None, None, 4, dim))

# These are supposed to perform operations on the non-zero times
Wn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(dim,), mean=0, stddev=0.01),
    name="Wn")
bn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(1,), mean=0, stddev=0.01),
    name="bn")

# this is the op I want to be performed only on non-zero times
op = tf.einsum('bted,d->bte', input_layer, Wn) + bn

s = tf.Session()
glob_vars = tf.global_variables_initializer()
s.run(glob_vars)

# first let's see what the bias term is
s.run(bn, feed_dict={input_layer: tensor})

s.run(op, feed_dict={input_layer: tensor})
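
To make the symptom concrete, here is a minimal NumPy sketch of the same arithmetic. NumPy only stands in for the TensorFlow graph here; the seed and the names `rng` and `out` are assumptions made for illustration, and `Wn` and `bn` are plain arrays rather than trainable variables. It shows that every event in the zeroed time step comes out equal to the bias.

```python
import numpy as np

dim = 4
rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# batch x time x events x dim, with the last time step zeroed out
tensor = rng.random((1, 3, 4, dim))
tensor[0, 2] = 0.0

# Stand-ins for the trainable Wn and bn (the values are arbitrary)
Wn = rng.normal(0.0, 0.01, size=(dim,))
bn = rng.normal(0.0, 0.01, size=(1,))

# Same contraction as the tf.einsum call, followed by the unconditional bias
out = np.einsum('bted,d->bte', tensor, Wn) + bn

# The zeroed time step contributes nothing from Wn, so only the bias remains
print(out[0, 2])
print(bn)
```

Since the weighted sum over an all-zero vector is zero, the output at those positions carries the bias and nothing else, which is the unwanted contribution described above.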

Edit: I now believe tf.where is what I need.

Tags: python, tensorflow

Solution


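A good solution may be to use tf.where to build a mask that is zero wherever the input is zero (along the last dimension) and one everywhere else. Once we have this mask, we can multiply it by the bias to get the result. Here is my solution: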

import tensorflow as tf
import numpy as np

dim = 4
# batch x time x events x dim
tensor = np.random.rand(1, 3, 4, dim)
zeros_last_time = np.zeros((4, dim))
tensor[0][2] = zeros_last_time
dtype = tf.float64
input_layer = tf.placeholder(tf.float64, shape=(None, None, 4, dim))

# These are supposed to perform operations on the non-zero times
Wn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(dim,), mean=0, stddev=0.01),
    name="Wn")
bn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(1,), mean=0, stddev=0.01),
    name="bn")

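# True for every event vector whose entries (last axis) are all zero
is_zero = tf.reduce_all(
    tf.equal(input_layer, tf.zeros_like(input_layer)), axis=-1)
# 0 where the event vector is all zeros, 1 elsewhere; shape: batch x time x events
mask = tf.where(is_zero,
                tf.zeros_like(is_zero, dtype=dtype),
                tf.ones_like(is_zero, dtype=dtype))
bias = bn * mask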

# this is the op I want to be performed only on non-zero times
op = tf.einsum('bted,d->bte', input_layer, Wn) + bias

s = tf.Session()
glob_vars = tf.global_variables_initializer()
s.run(glob_vars)

# first let's see what the bias term is
print(s.run(bn, feed_dict={input_layer: tensor}))
print(s.run(op, feed_dict={input_layer: tensor}))
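
As a sanity check of the masking idea, the following NumPy sketch reproduces the same arithmetic outside of TensorFlow. It is only an illustration under assumptions: the seed and the names `rng`, `nonzero`, `masked_bias` and `out_masked` are hypothetical, and `Wn` and `bn` are plain arrays instead of trainable variables.

```python
import numpy as np

dim = 4
rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# batch x time x events x dim, with the last time step zeroed out
tensor = rng.random((1, 3, 4, dim))
tensor[0, 2] = 0.0

# Stand-ins for the trainable Wn and bn (the values are arbitrary)
Wn = rng.normal(0.0, 0.01, size=(dim,))
bn = rng.normal(0.0, 0.01, size=(1,))

# True for event vectors with at least one non-zero entry
# shape: batch x time x events
nonzero = np.any(tensor != 0, axis=-1)

# The bias survives only where the mask is True
masked_bias = bn * nonzero.astype(tensor.dtype)

# Same contraction as the tf.einsum call, now with the masked bias
out_masked = np.einsum('bted,d->bte', tensor, Wn) + masked_bias

print(nonzero[0])       # mask per time step and event
print(out_masked[0, 2]) # the zeroed time step
```

Note that the mask relies on an exact comparison with zero, so it assumes the padded positions are exactly 0 and that no real event vector happens to be all zeros.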
