Deriving the Backpropagation Formulas for a Neural Network

chantmee 2021-10-01 18:49

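Schematic of the neural network (figure): 2 neurons in layer 1, 3 neurons in layer 2, 2 neurons in layer 3.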

Notation

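  • \(y_{ij}\) is the output of the \(j\)-th neuron in layer \(i\).
  • \(t_i\) is the target value for the \(i\)-th neuron of the output layer.
  • \(n_i\) is the number of neurons in layer \(i\).
  • The activation function is \(\sigma(x)=\mathrm{Sigmoid}(x)=\frac{1}{1+e^{-x}}\), so \(\frac{\partial \sigma(x)}{\partial x}=\sigma(x)[1-\sigma(x)]\).
  • \(E\) is the error: \(E=\sum_{i=1}^{2}(y_{3i}-t_i)^{2}\).
  • \(\omega_{ijk}\) is the weight connecting the \(k\)-th neuron of layer \(i-1\) to the \(j\)-th neuron of layer \(i\), and \(\nabla_{ijk}=\frac{\partial E}{\partial \omega_{ijk}}\) is the gradient of \(E\) with respect to that weight.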
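The derivative identity for the sigmoid is used in every later step, so it can be worth checking numerically. The following is a minimal sketch in plain Python; the function names and the sample points are illustrative assumptions, not part of the original article.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def numeric_derivative(f, x, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Arbitrary sample points chosen for illustration.
for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, sigmoid_prime(x), numeric_derivative(sigmoid, x))
```

The two printed columns should agree to several decimal places, which is what lets the later formulas write \(y(1-y)\) in place of \(\partial y/\partial net\).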

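Derivation

Taking the path \(11\rightarrow 21 \rightarrow 31\) (neuron 1 of layer 1, then neuron 1 of layer 2, then neuron 1 of layer 3) as an example, we work out each quantity and generalize it to a formula for arbitrary indices.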

1.\(net_{ij}\)

\(net_{21}=\sum_{i=1}^{2}(\omega_{21i}y_{1i})\).

\(net_{31}=\sum_{i=1}^{3}(\omega_{31i}y_{2i})\).

Hence the general formula is: \(net_{ij}=\sum_{k=1}^{n_{i-1}}(\omega_{ijk}y_{i-1,k})\), \(y_{ij}=\sigma(net_{ij})\).

2.\(y_{ij}\)

\(y_{21}=\sigma(net_{21})\).

\(y_{31}=\sigma(net_{31})\).

Hence the general formula is: \(y_{ij}=\sigma(net_{ij})\).
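The two general formulas for \(net_{ij}\) and \(y_{ij}\) together describe one forward pass through a layer. Here is a small sketch under the assumption that the weights of layer \(i\) are stored as a nested list where `weights[j][k]` plays the role of \(\omega_{ijk}\) (0-based in code). The numeric values are hypothetical, and, as in the derivation, there are no bias terms.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_layer(weights, prev_outputs):
    """Return (net, y) for one layer.

    weights[j][k] stands in for omega_{ijk}; prev_outputs[k] for y_{i-1,k}.
    """
    nets = [sum(w * y for w, y in zip(row, prev_outputs)) for row in weights]
    return nets, [sigmoid(n) for n in nets]

# Hypothetical values for a 2-3-2 network (not taken from the article).
y1 = [0.5, -1.0]                              # layer-1 outputs (the inputs)
W2 = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]]   # 3 neurons, 2 weights each
W3 = [[0.7, -0.8, 0.9], [-0.1, 0.2, -0.3]]    # 2 neurons, 3 weights each

net2, y2 = forward_layer(W2, y1)
net3, y3 = forward_layer(W3, y2)
print("net2 =", net2)
print("y2   =", y2)
print("y3   =", y3)
```

Calling `forward_layer` once per layer mirrors the index \(i\) in the general formula: each call consumes the previous layer's outputs.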

3. Error \(E\)

\(E=\sum_{i=1}^{2}(y_{3i}-t_{i})^2\).
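As a quick illustration, the error can be computed directly from the output vector and the target vector. This sketch uses made-up output and target values. Note that the definition has no factor of \(\frac{1}{2}\), which is why a factor of 2 appears in the derivative with respect to each output.

```python
def squared_error(outputs, targets):
    """E = sum_i (y_3i - t_i)^2, with no 1/2 factor, as in the text."""
    return sum((y - t) ** 2 for y, t in zip(outputs, targets))

def error_grad_wrt_outputs(outputs, targets):
    """dE/dy_3i = 2 (y_3i - t_i)."""
    return [2.0 * (y - t) for y, t in zip(outputs, targets)]

# Hypothetical network outputs and targets.
y3 = [0.8, 0.3]
t = [1.0, 0.0]
print(squared_error(y3, t))
print(error_grad_wrt_outputs(y3, t))
```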

4. \(\nabla_{3ij}\) (gradients with respect to the output-layer weights \(\omega\))

\(\frac{\partial E}{\partial net_{31}}=\frac{\partial E}{\partial y_{31}}\cdot \frac{\partial y_{31}}{\partial {net_{31}}}=2(y_{31}-t_1)y_{31}(1-y_{31})\).

\(\therefore \frac{\partial E}{\partial net_{3i}}=2(y_{3i}-t_i)y_{3i}(1-y_{3i})\).

\(\nabla_{311}=\frac{\partial E}{\partial net_{31}} \cdot \frac{\partial net_{31}}{\partial \omega_{311}}=\frac{\partial E}{\partial net_{31}}\cdot y_{21}\).

Hence, since \(\frac{\partial net_{3i}}{\partial \omega_{3ij}}=y_{2j}\), the general formula is: \(\nabla_{3ij}=\frac{\partial E}{\partial net_{3i}}\cdot \frac{\partial net_{3i}}{\partial \omega_{3ij}}=2(y_{3i}-t_{i})y_{3i}(1-y_{3i})y_{2j}\).
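One way to gain confidence in the output-layer formula is to compare it against a finite-difference estimate of \(\partial E/\partial \omega_{3ij}\). The sketch below does this with hypothetical hidden-layer outputs, weights, and targets. `W3[i][j]` stands in for \(\omega_{3ij}\) with 0-based indices, and the helper names are assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def output_error(W3, y2, t):
    """E as a function of the output-layer weights, with y2 held fixed."""
    y3 = [sigmoid(sum(w * y for w, y in zip(row, y2))) for row in W3]
    return sum((a - b) ** 2 for a, b in zip(y3, t))

def output_gradients(W3, y2, t):
    """grads[i][j] = 2 (y_3i - t_i) y_3i (1 - y_3i) y_2j."""
    grads = []
    for row, t_i in zip(W3, t):
        y3_i = sigmoid(sum(w * y for w, y in zip(row, y2)))
        delta = 2.0 * (y3_i - t_i) * y3_i * (1.0 - y3_i)  # dE/dnet_3i
        grads.append([delta * y2_j for y2_j in y2])
    return grads

def numeric_output_gradients(W3, y2, t, h=1e-6):
    """Central differences on each weight, as an independent check."""
    grads = []
    for i, row in enumerate(W3):
        grad_row = []
        for j in range(len(row)):
            plus = [r[:] for r in W3]
            minus = [r[:] for r in W3]
            plus[i][j] += h
            minus[i][j] -= h
            diff = output_error(plus, y2, t) - output_error(minus, y2, t)
            grad_row.append(diff / (2.0 * h))
        grads.append(grad_row)
    return grads

# Hypothetical values (not from the article).
y2 = [0.3, 0.6, 0.9]
W3 = [[0.7, -0.8, 0.9], [-0.1, 0.2, -0.3]]
t = [1.0, 0.0]

analytic = output_gradients(W3, y2, t)
numeric = numeric_output_gradients(W3, y2, t)
for a_row, n_row in zip(analytic, numeric):
    print(a_row, n_row)
```

Each gradient in row \(i\) is the same \(\partial E/\partial net_{3i}\) scaled by a different \(y_{2j}\), which is why the last factor must carry the index \(j\) of the weight's source neuron.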

5. \(\nabla_{2ij}\) (gradients with respect to the hidden-layer weights \(\omega\))

Because \(y_{21}\) feeds both output neurons, the chain rule sums over both paths:

\(\nabla_{211}=\frac{\partial E}{\partial \omega_{211}}=\frac{\partial E}{\partial net_{31}}\cdot \frac{\partial net_{31}}{\partial y_{21}}\cdot \frac{\partial y_{21}}{\partial net_{21}}\cdot \frac{\partial{net_{21}}}{\partial \omega_{211}}+\frac{\partial E}{\partial net_{32}}\cdot \frac{\partial net_{32}}{\partial y_{21}}\cdot \frac{\partial y_{21}}{\partial net_{21}}\cdot \frac{\partial{net_{21}}}{\partial \omega_{211}}\)

\(=\sum_{i=1}^{2}\left(\frac{\partial E}{\partial net_{3i}}\cdot \frac{\partial net_{3i}}{\partial y_{21}}\right)\cdot \frac{\partial y_{21}}{\partial net_{21}} \cdot \frac{\partial net_{21}}{\partial \omega_{211}}\)

\(=\sum_{i=1}^{2}\left(\frac{\partial E}{\partial net_{3i}}\cdot \omega_{3i1}\right)\cdot y_{21}(1-y_{21})\,y_{11}\).

Hence, since \(\frac{\partial net_{2i}}{\partial \omega_{2ij}}=y_{1j}\), the general formula is: \(\nabla_{2ij}=\frac{\partial E}{\partial \omega_{2ij}}=\sum_{k=1}^{2}\left(\frac{\partial E}{\partial net_{3k}}\cdot \omega_{3ki}\right)\cdot y_{2i}(1-y_{2i})\,y_{1j}\).
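The hidden-layer formula can be checked the same way, by comparing it with central finite differences of \(E\) with respect to each \(\omega_{2ij}\). This is a sketch on a hypothetical 2-3-2 network. All numeric values and helper names are assumptions, indices are 0-based in code, and no bias terms are used, matching the derivation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(W2, W3, y1):
    y2 = [sigmoid(sum(w * y for w, y in zip(row, y1))) for row in W2]
    y3 = [sigmoid(sum(w * y for w, y in zip(row, y2))) for row in W3]
    return y2, y3

def error(W2, W3, y1, t):
    _, y3 = forward(W2, W3, y1)
    return sum((a - b) ** 2 for a, b in zip(y3, t))

def hidden_gradients(W2, W3, y1, t):
    """grads[i][j] = sum_k(dE/dnet_3k * w_3ki) * y_2i (1 - y_2i) * y_1j."""
    y2, y3 = forward(W2, W3, y1)
    delta3 = [2.0 * (y - tk) * y * (1.0 - y) for y, tk in zip(y3, t)]
    grads = []
    for i, y2_i in enumerate(y2):
        # Sum the error signal over every output neuron that y_2i feeds.
        back = sum(d * W3[k][i] for k, d in enumerate(delta3))
        delta2 = back * y2_i * (1.0 - y2_i)  # dE/dnet_2i
        grads.append([delta2 * y1_j for y1_j in y1])
    return grads

def numeric_hidden_gradients(W2, W3, y1, t, h=1e-6):
    grads = []
    for i, row in enumerate(W2):
        grad_row = []
        for j in range(len(row)):
            plus = [r[:] for r in W2]
            minus = [r[:] for r in W2]
            plus[i][j] += h
            minus[i][j] -= h
            diff = error(plus, W3, y1, t) - error(minus, W3, y1, t)
            grad_row.append(diff / (2.0 * h))
        grads.append(grad_row)
    return grads

# Hypothetical values (not from the article).
y1 = [0.5, -1.0]
W2 = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]]
W3 = [[0.7, -0.8, 0.9], [-0.1, 0.2, -0.3]]
t = [1.0, 0.0]

analytic = hidden_gradients(W2, W3, y1, t)
numeric = numeric_hidden_gradients(W2, W3, y1, t)
for a_row, n_row in zip(analytic, numeric):
    print(a_row, n_row)
```

If the last factor were a weight rather than the input \(y_{1j}\), the analytic and numeric columns would disagree, so this check is a simple guard against index slips in the formula.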
