python - Are the weights/biases updated only once per mini-batch?
Problem description
I'm following a neural network tutorial, and I have a question about the function that updates the weights.
def update_mini_batch(self, mini_batch, eta):
    """Update the network's weights and biases by applying
    gradient descent using backpropagation to a single mini batch.
    The "mini_batch" is a list of tuples "(x, y)", and "eta"
    is the learning rate."""
    nabla_b = [np.zeros(b.shape) for b in self.biases]  # Initialize bias gradients with 0's
    nabla_w = [np.zeros(w.shape) for w in self.weights] # Initialize weight gradients with 0's
    for x, y in mini_batch:  # For each tuple in one mini_batch
        # Compute partial derivatives of the cost w.r.t. biases/weights via backpropagation
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]  # Accumulate bias gradients for every layer
        nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]  # Accumulate weight gradients for every layer
    # Update rule: zip the current parameters with the accumulated gradients,
    # then step each parameter by eta/len(mini_batch) times its gradient.
    self.weights = [w-(eta/len(mini_batch))*nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b-(eta/len(mini_batch))*nb
                   for b, nb in zip(self.biases, nabla_b)]
What I don't understand here is the use of a for loop to compute nabla_b and nabla_w (the partial derivatives with respect to the biases/weights). Backpropagation is run for every training example in the mini-batch, but the weights/biases are only updated once.
It seems to me that with a mini-batch of size 10, we compute nabla_b and nabla_w 10 times, and then update the weights and biases after the for loop finishes. But doesn't the for loop reset the nabla_b and nabla_w lists each time? Why don't we update self.weights and self.biases inside the for loop?
The neural network works fine, so I think I'm making a small mistake somewhere.
FYI: the relevant section of the tutorial I'm following can be found here
Solution
No. The weights and biases are updated only once, after the entire mini-batch has been processed. The loop does not reset nabla_b and nabla_w; it accumulates (sums) the per-example gradients into them. The canonical description of mini-batch gradient descent says to average the gradients over the batch and step by eta times that average; summing the gradients and scaling the step by eta/len(mini_batch), as this code does, is arithmetically equivalent.
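A quick numeric check of that equivalence (a toy example with made-up gradient values, not the tutorial's code): summing per-example gradients and scaling the step by eta/n gives exactly the same update as averaging the gradients first and scaling by eta.

```python
import numpy as np

# Made-up per-example gradients for one weight matrix (hypothetical values).
eta = 0.5
grads = [np.array([[1.0, 2.0]]), np.array([[3.0, 1.0]]), np.array([[2.0, 3.0]])]
w = np.array([[10.0, 10.0]])

# Variant 1: accumulate the sum, then apply eta/len(grads) once (as in the code).
w_sum = w - (eta / len(grads)) * sum(grads)

# Variant 2: average the gradients first, then apply eta (the textbook description).
w_avg = w - eta * (sum(grads) / len(grads))

print(np.allclose(w_sum, w_avg))  # the two updates are identical
```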
First, the bias and weight gradient accumulators are initialized to zero. This happens once per mini-batch, before the loop, so the accumulators are never reset between examples.
nabla_b = [np.zeros(b.shape) for b in self.biases]  # Initialize bias gradients with 0's
nabla_w = [np.zeros(w.shape) for w in self.weights] # Initialize weight gradients with 0's
For each example (x, y) in the mini-batch, run backpropagation and add the resulting gradients into the accumulators.
for x, y in mini_batch:  # For each tuple in one mini_batch
    # Compute partial derivatives of the cost w.r.t. biases/weights via backpropagation
    delta_nabla_b, delta_nabla_w = self.backprop(x, y)
    nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]  # Accumulate bias gradients for every layer
    nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]  # Accumulate weight gradients for every layer
Finally, each weight and bias is adjusted exactly once, by stepping against the accumulated gradients scaled by eta/len(mini_batch).
# Update rule: zip the current parameters with the accumulated gradients,
# then step each parameter by eta/len(mini_batch) times its gradient.
self.weights = [w-(eta/len(mini_batch))*nw
                for w, nw in zip(self.weights, nabla_w)]
self.biases = [b-(eta/len(mini_batch))*nb
               for b, nb in zip(self.biases, nabla_b)]
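The whole pattern can be run standalone as a minimal sketch, with backprop stubbed out by a fake function (fake_backprop is an invented stand-in returning all-ones "gradients"; the real gradients come from the tutorial's self.backprop):

```python
import numpy as np

def fake_backprop(x, y, weights, biases):
    """Stand-in for the tutorial's backprop: returns made-up all-ones
    gradients with the same shapes as the parameters (not real derivatives)."""
    return ([np.ones_like(b) for b in biases],
            [np.ones_like(w) for w in weights])

def update_mini_batch(weights, biases, mini_batch, eta):
    # Accumulators start at zero; note they are NOT reset inside the loop.
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = fake_backprop(x, y, weights, biases)
        # Each pass ADDS to the running totals ...
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # ... and the parameters themselves are assigned only once, after the loop.
    weights = [w - (eta / len(mini_batch)) * nw for w, nw in zip(weights, nabla_w)]
    biases = [b - (eta / len(mini_batch)) * nb for b, nb in zip(biases, nabla_b)]
    return weights, biases

# With all-ones fake gradients and eta=1.0, each parameter drops by exactly 1
# regardless of batch size: a sum of n ones, divided by n.
weights = [np.zeros((2, 2))]
biases = [np.zeros((2, 1))]
batch = [(None, None)] * 10
new_w, new_b = update_mini_batch(weights, biases, batch, eta=1.0)
print(new_w[0])  # every entry is -1.0
```

Because the parameters are frozen while the gradients are accumulated, every example in the batch is evaluated against the same weights, which is what makes the sum-then-scale step equal to an averaged gradient step.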