python - How do I get the gradient of the loss with respect to the model predictions in TensorFlow?
Problem description
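I want to compute the error gradient dJ/dprediction, where J is the cost function. In the function train_step() below, you can see that the gradients are computed with respect to the model weights.
When I try to compute the gradient as gradients = tape.gradient(loss, predictions), it returns None, which would mean that my loss function does not depend on the predictions.
How is that possible?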
import tensorflow as tf
from tensorflow.keras import layers, losses, metrics, models
from tensorflow.keras.optimizers import Adam

class SimpleModel(models.Model):
    def __init__(self, nb_classes, X_dim: int, batch_size: int):
        super().__init__()
        self.model_input_layer = layers.InputLayer(input_shape=(X_dim,), batch_size=batch_size)
        self.d1 = layers.Dense(64, name="d1")
        self.a1 = layers.Activation("relu", name="a1")
        self.d2 = layers.Dense(32, name="d2")
        self.a2 = layers.Activation("relu", name="a2")
        self.d3 = layers.Dense(nb_classes, name="d3")
        self.a3 = layers.Activation("softmax", name="a3")
        # Outputs of each layer, stored on every forward pass
        self.model_input = None
        self.d1_output = None
        self.a1_output = None
        self.d2_output = None
        self.a2_output = None
        self.d3_output = None
        self.a3_output = None

    def call(self, inputs, training=None, mask=None):
        self.model_input = self.model_input_layer(inputs)
        self.d1_output = self.d1(self.model_input)
        self.a1_output = self.a1(self.d1_output)
        self.d2_output = self.d2(self.a1_output)
        self.a2_output = self.a2(self.d2_output)
        self.d3_output = self.d3(self.a2_output)
        self.a3_output = self.a3(self.d3_output)
        return self.a3_output

# NB_CLASSES, X_DIM and BATCH_SIZE are assumed to be defined elsewhere
model = SimpleModel(NB_CLASSES, X_DIM, BATCH_SIZE)
model.build((BATCH_SIZE, X_DIM))
optimizer = Adam()
loss_object = losses.CategoricalCrossentropy()
train_loss = metrics.Mean(name='train_loss')
test_loss = metrics.Mean(name='test_loss')

@tf.function
def train_step(X, y):
    with tf.GradientTape() as tape:
        predictions = model(X)
        loss = loss_object(y, predictions)
    # Gradients of the loss with respect to the model weights
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    train_loss(loss)
Solution
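The problem is that, by default, GradientTape automatically watches only trainable variables, not other tensors. You therefore need to tell it explicitly to watch the tensor you are interested in. Try this: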
predictions = model(X)  # if you also need gradients for model variables, move this back into the tape context
with tf.GradientTape() as tape:
    tape.watch(predictions)
    loss = loss_object(y, predictions)
gradients = tape.gradient(loss, [predictions])
Note the use of the watch method to track an arbitrary tensor. This should no longer return None.
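To make the effect of watch concrete, here is a minimal sketch that runs in eager mode without the model above. The toy tensors, the hand-written cross_entropy helper and the weight matrix W are illustrative assumptions, not part of the original question. The loss is written with plain TensorFlow ops so that its gradient with respect to the predictions can be checked by hand: for mean categorical cross-entropy over N samples it is -y / (N * p).

```python
import tensorflow as tf

# Hypothetical toy data: 2 samples, 3 classes (one-hot labels).
y_true = tf.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
predictions = tf.constant([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]])

def cross_entropy(y, p):
    # Mean categorical cross-entropy, written with plain TF ops (assumed helper).
    return tf.reduce_mean(-tf.reduce_sum(y * tf.math.log(p), axis=-1))

# 1) Without watch: a plain tensor is not tracked, so the result is None.
with tf.GradientTape() as tape:
    loss = cross_entropy(y_true, predictions)
unwatched_grad = tape.gradient(loss, predictions)

# 2) With watch: the tape records operations on `predictions`.
with tf.GradientTape() as tape:
    tape.watch(predictions)
    loss = cross_entropy(y_true, predictions)
watched_grad = tape.gradient(loss, predictions)

# 3) Both gradients from one tape: the forward pass is moved back inside
#    the tape context, as the comment in the answer suggests.
X = tf.ones((2, 4))
W = tf.Variable(tf.zeros((4, 3)))  # hypothetical single-layer "model"
with tf.GradientTape() as tape:
    preds = tf.nn.softmax(tf.matmul(X, W))
    tape.watch(preds)
    loss = cross_entropy(y_true, preds)
grad_preds, grad_W = tape.gradient(loss, [preds, W])

print(unwatched_grad)
print(watched_grad.numpy())
print(grad_preds.shape, grad_W.shape)
```

In case 2 the gradient matches the closed form: -1 at the position of the true class in each row (1 / (2 * 0.5)) and 0 elsewhere. Case 3 requests both sources in a single tape.gradient call, so a non-persistent tape is enough.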