c# - Backpropagation on multi-layered neural networks
Problem Description
I am making a neural network system in C# without using any libraries such as Accord.NET, but I am stuck on how to backpropagate my error. Do I have to include all layers that I have already propagated through, or does only the previous layer enter the equation?
Edit for more information:
My network structure is mostly dynamic. It creates a neural network from user input specifying the number of layers and the node count per layer. The input and output layers are created based on the dataset used. It can use linear, sigmoid, tanh, or ReLU activation functions, and you can mix and match them per layer.
I do understand how backpropagation works and what it is for. But every example I see uses it on a three-layer structure with one input, one hidden, and one output layer. They calculate the output layer's error and update its weights, then calculate the hidden layer's error with the output layer included.
My problem starts here. These examples don't make it clear whether only the layer immediately after the hidden layer (going right to left during backpropagation) is included, or all layers up to the output layer are included in the error equation.
For visualization
input layer ---> hidden layer 1 ---> hidden layer 2 ---> output layer
In this example, when I calculate hidden layer 1's error and weight update, do I only include hidden layer 2, or hidden layer 2 plus the output layer?
Solution
I wonder what you mean by "include". Backpropagation is supposed to compute the gradient. The gradient is the derivative of the loss function with respect to each variable (you call this the error, but that term is not quite precise: it's not an error, it's a slope). After the gradient is computed, all parameters (the "weights") are updated at once.
Computing the gradient is essentially reverse-mode automatic differentiation, i.e. repeated application of the chain rule. If you have `a * b = c`, and you know `a`, `b`, `c`, and `gradient(c)`, then it is easy to compute the gradients of `a` and `b` as well (`gradient(a) = b * gradient(c)`).
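The chain-rule step for `c = a * b` can be sketched in C# like this (the variable names and the example value of `gradC` are illustrative, not taken from your code):

```csharp
// Forward pass: c = a * b.
double a = 2.0, b = 3.0;
double c = a * b;          // c = 6

// Suppose the gradient of the loss with respect to c is already known
// (it arrives from the layer "above" during backpropagation).
double gradC = 0.5;

// Chain rule: dL/da = (dc/da) * (dL/dc) = b * gradC, and symmetrically for b.
double gradA = b * gradC;  // 3.0 * 0.5 = 1.5
double gradB = a * gradC;  // 2.0 * 0.5 = 1.0
```

Note that computing `gradA` only needs the local value `b` and the incoming gradient `gradC`; nothing further up the graph is consulted directly.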
So you push the gradient backwards, layer by layer. For each layer you only need the gradient of the next layer: that gradient already accumulates the contribution of every layer beyond it. Frameworks such as TensorFlow do this automatically for you. The technique works for any computational graph, not just for neural networks of the structure you described. Understanding the general concept of differentiation along a computational graph first makes it easy to understand the special case of a neural network.
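For your four-layer example, the backward loop might look like the following sketch. The `Layer` fields and the matrix methods (`Transpose`, `Multiply`, `ElementwiseTimes`, `ActivationDerivative`) are hypothetical placeholders for whatever your own dynamic network classes provide; the point is the data flow, not the API:

```csharp
// deltas[l] holds dLoss/dPreActivation for layer l.
// Walk from the output layer back to the first hidden layer.
for (int l = layers.Count - 1; l >= 1; l--)
{
    if (l == layers.Count - 1)
    {
        // Output layer: delta comes from the loss derivative,
        // scaled by this layer's activation derivative.
        deltas[l] = lossGradient.ElementwiseTimes(layers[l].ActivationDerivative());
    }
    else
    {
        // Hidden layer: only the NEXT layer's weights and delta are needed.
        // deltas[l + 1] already contains the influence of every layer
        // beyond it, so hidden layer 1 never touches the output layer directly.
        deltas[l] = layers[l + 1].Weights.Transpose()
                                 .Multiply(deltas[l + 1])
                                 .ElementwiseTimes(layers[l].ActivationDerivative());
    }
}
```

So to answer the concrete question: when computing hidden layer 1's error, you use only hidden layer 2's weights and delta; the output layer's contribution is already folded into hidden layer 2's delta.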