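tensorflow - TF2: Computing gradients in a Keras callback in non-eager mode
Question
TF version: 2.2.0-rc3 (in Colab)
I am using the following code (from "tf.keras get computed gradient during training") in a callback to compute the gradients of all parameters in the model.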
def on_train_begin(self, logs=None):
    # Functions return weights of each layer
    self.layerweights = []
    for lndx, l in enumerate(self.model.layers):
        if hasattr(l, 'kernel'):
            self.layerweights.append(l.kernel)
    input_tensors = [self.model.inputs[0],
                     self.model.sample_weights[0],
                     self.model.targets[0],
                     K.learning_phase()]
    # Get gradients of all the relevant layers at once
    grads = self.model.optimizer.get_gradients(self.model.total_loss, self.layerweights)
    self.get_gradients = K.function(inputs=input_tensors, outputs=grads)
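However, when I run it, I get the following error.
AttributeError: 'Model' object has no attribute 'sample_weights'
The same error also occurs for model.targets.
How can I get the gradients inside a callback?
In eager mode, the solution from "Get Gradients with Keras Tensorflow 2.0" works. However, I want to do this in non-eager mode.
Solution
Here is end-to-end code that captures the gradients using the Keras backend. The gradient-capturing function is called from a callback passed to model.fit, so the gradients are captured at the end of every epoch. The code is compatible with both TensorFlow 1.x and TensorFlow 2.x, and I have run it in Colab. To run it under TensorFlow 1.x, replace the first statement of the program with %tensorflow_version 1.x and restart the runtime.
Capturing the gradients of the model: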
# Importing dependency
%tensorflow_version 2.x
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras import datasets
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
import numpy as np
import tensorflow as tf
tf.keras.backend.clear_session() # For easy reset of notebook state.
tf.compat.v1.disable_eager_execution()
# Import Data
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Build Model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(10))
# Model Summary
model.summary()
# Model Compile
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
# Collected gradients: one list of per-weight arrays for each epoch
epoch_gradient = []
# Define the gradient function
def get_gradient_func(model):
    grads = K.gradients(model.total_loss, model.trainable_weights)
    inputs = model._feed_inputs + model._feed_targets + model._feed_sample_weights
    func = K.function(inputs, grads)
    return func
# Define the required callback
class GradientCalcCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # self.model is the model being trained by model.fit
        get_gradient = get_gradient_func(self.model)
        grads = get_gradient([train_images, train_labels, np.ones(len(train_labels))])
        epoch_gradient.append(grads)
epoch = 4
model.fit(train_images, train_labels, epochs=epoch, validation_data=(test_images, test_labels), callbacks=[GradientCalcCallback()])
# Convert to a 2-dimensional array of shape (epochs, gradients)
gradient = np.asarray(epoch_gradient)
print("Total number of epochs run:", epoch)
print("Gradient Array has the shape:", gradient.shape)
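The callback above feeds all 50000 training images to the gradient function in a single call, which can be slow or run out of memory. A minimal sketch of one way around that, assuming the loss is a per-sample mean (so the full-data gradient equals the size-weighted average of the per-batch gradients): the helper name batched_gradients and its batch_size parameter are hypothetical, and grad_fn is assumed to behave like the K.function returned by get_gradient_func.

```python
import numpy as np

def batched_gradients(grad_fn, x, y, batch_size=1024):
    """Average per-batch gradients, weighting each batch by its share of the data.

    grad_fn is assumed to take [inputs, targets, sample_weights] and return
    a list of arrays, like the K.function built in get_gradient_func.
    """
    n = len(x)
    totals = None
    for start in range(0, n, batch_size):
        xb = x[start:start + batch_size]
        yb = y[start:start + batch_size]
        grads = grad_fn([xb, yb, np.ones(len(xb))])
        weight = len(xb) / n  # the last batch may be smaller than the others
        scaled = [weight * np.asarray(g) for g in grads]
        if totals is None:
            totals = scaled
        else:
            totals = [t + s for t, s in zip(totals, scaled)]
    return totals
```

Inside on_epoch_end this would be called as batched_gradients(get_gradient, train_images, train_labels) in place of the single get_gradient call.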
Output of the script:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0
_________________________________________________________________
dense (Dense)                (None, 64)                65600
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
Train on 50000 samples, validate on 10000 samples
Epoch 1/4
50000/50000 [==============================] - 73s 1ms/sample - loss: 1.8199 - accuracy: 0.3834 - val_loss: 1.4791 - val_accuracy: 0.4548
Epoch 2/4
50000/50000 [==============================] - 357s 7ms/sample - loss: 1.3590 - accuracy: 0.5124 - val_loss: 1.2661 - val_accuracy: 0.5520
Epoch 3/4
50000/50000 [==============================] - 377s 8ms/sample - loss: 1.1981 - accuracy: 0.5787 - val_loss: 1.2625 - val_accuracy: 0.5674
Epoch 4/4
50000/50000 [==============================] - 345s 7ms/sample - loss: 1.0838 - accuracy: 0.6183 - val_loss: 1.1302 - val_accuracy: 0.6083
Total number of epochs run: 4
Gradient Array has the shape: (4, 10)
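The shape (4, 10) means 4 epochs by 10 gradient tensors, one for each entry of model.trainable_weights (a kernel and a bias for each of the three Conv2D and two Dense layers). Because those tensors have different shapes, it is often handier to reduce each one to a single number. A minimal sketch, assuming epoch_gradient is the list of lists collected by the callback; the helper name gradient_norms is hypothetical.

```python
import numpy as np

def gradient_norms(epoch_gradient):
    """Return an (epochs, tensors) array holding the L2 norm of each gradient tensor."""
    return np.array([[np.linalg.norm(np.ravel(g)) for g in grads]
                     for grads in epoch_gradient])
```

For the run above, gradient_norms(epoch_gradient) would have shape (4, 10), which makes it easy to plot how the gradient magnitude of each weight tensor changes from epoch to epoch.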
Hope this answers your question. Happy learning.