python - Understanding "backward()": how do I write the PyTorch function ".backward()" from scratch?
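Problem description
I'm new to deep learning, and I've been trying to understand what PyTorch's ".backward()" does, since it does most of the work. To understand it in detail, I'd like to write out what the function does step by step. Can you recommend any resources (books, videos, GitHub repositories) to get started on writing this function? Thank you for your time; I'd appreciate any help.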
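Solution
backward() computes the gradients with respect to (w.r.t.) the leaves of the graph. The grad() function is more general: it can compute the gradients with respect to any inputs (leaves included).
I implemented a grad() function some time ago, which you can check out below. It uses the power of automatic differentiation (AD).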
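As a rough guide to reading the code that follows: each node stores pairs of (local derivative, result node) in `_children`, and `grad()` applies the multivariate chain rule over them. The notation here (c for a node computed directly from x) is introduced only for this sketch:

```latex
\frac{\partial f}{\partial x}
  = \sum_{c \,\in\, \mathrm{children}(x)} \frac{\partial c}{\partial x}\,\frac{\partial f}{\partial c},
\qquad
\frac{\partial f}{\partial f} = 1

% For the example f = (\sin x_2 + 1)/(x_2 + e^{x_1}) + x_1 x_2, the three paths from x_2 to f give
\frac{\partial f}{\partial x_2}
  = \frac{\cos x_2}{x_2 + e^{x_1}}
  - \frac{\sin x_2 + 1}{(x_2 + e^{x_1})^2}
  + x_1
```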
import math

class ADNumber:
    """Scalar that records every node computed from it, with the local derivative."""

    def __init__(self, val, name=""):
        self.name = name
        self._val = val
        self._children = []  # list of (d(child)/d(self), child) pairs

    def __truediv__(self, other):
        new = ADNumber(self._val / other._val, name=f"{self.name}/{other.name}")
        self._children.append((1.0 / other._val, new))
        other._children.append((-self._val / other._val**2, new))  # d(a/b)/db = -a/b^2
        return new

    def __mul__(self, other):
        new = ADNumber(self._val * other._val, name=f"{self.name}*{other.name}")
        self._children.append((other._val, new))
        other._children.append((self._val, new))
        return new

    def __add__(self, other):
        if isinstance(other, (int, float)):  # allow adding a plain number
            other = ADNumber(other, str(other))
        new = ADNumber(self._val + other._val, name=f"{self.name}+{other.name}")
        self._children.append((1.0, new))
        other._children.append((1.0, new))
        return new

    def __sub__(self, other):
        new = ADNumber(self._val - other._val, name=f"{self.name}-{other.name}")
        self._children.append((1.0, new))
        other._children.append((-1.0, new))
        return new

    @staticmethod
    def exp(x):
        new = ADNumber(math.exp(x._val), name=f"exp({x.name})")
        x._children.append((new._val, new))  # derivative of exp(x) is exp(x)
        return new

    @staticmethod
    def sin(x):
        new = ADNumber(math.sin(x._val), name=f"sin({x.name})")
        x._children.append((math.cos(x._val), new))  # derivative of sin(x) is cos(x)
        return new

    def grad(self, other):
        """Return d(self)/d(other) by summing the chain rule over every path."""
        if self is other:
            return 1.0
        result = 0.0
        for local_grad, child in other._children:
            result += local_grad * self.grad(child)
        return result

A = ADNumber  # shortcuts
sin = A.sin
exp = A.exp

def print_childs(f, wrt):  # wrt: with respect to
    # Recursively print every edge reachable from f with its local derivative.
    for e in f._children:
        print("child:", wrt, "->", e[1].name, "grad: ", e[0])
        print_childs(e[1], e[1].name)

x1 = A(1.5, name="x1")
x2 = A(0.5, name="x2")
f = (sin(x2) + 1) / (x2 + exp(x1)) + x1 * x2
print_childs(x2, "x2")
print("\ncalculated gradient for the function f with respect to x2:", f.grad(x2))
Output:
child: x2 -> sin(x2) grad: 0.8775825618903728
child: sin(x2) -> sin(x2)+1 grad: 1.0
child: sin(x2)+1 -> sin(x2)+1/x2+exp(x1) grad: 0.20073512936690338
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
child: x2 -> x2+exp(x1) grad: 1.0
child: x2+exp(x1) -> sin(x2)+1/x2+exp(x1) grad: -0.05961284871202578
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
child: x2 -> x1*x2 grad: 1.5
child: x1*x2 -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
calculated gradient for the function f with respect to x2: 1.6165488003791766
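Since the question asks about backward() specifically, here is a minimal sketch of how the same idea could be turned around into a reverse-mode pass that fills in a `.grad` attribute on every node, leaves included. This is not PyTorch's actual implementation; the `Value` class, its `_parents` field, and the helper names are hypothetical and exist only for this illustration.

```python
import math


class Value:
    """Hypothetical scalar node that remembers its parents and local derivatives."""

    def __init__(self, val, parents=()):
        self.val = val
        self.grad = 0.0          # d(output)/d(self), filled in by backward()
        self._parents = parents  # tuples of (parent_node, d(self)/d(parent))

    @staticmethod
    def _wrap(x):
        # Promote plain numbers to constant nodes.
        return x if isinstance(x, Value) else Value(float(x))

    def __add__(self, other):
        other = Value._wrap(other)
        return Value(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = Value._wrap(other)
        return Value(self.val * other.val, ((self, other.val), (other, self.val)))

    def __truediv__(self, other):
        other = Value._wrap(other)
        return Value(
            self.val / other.val,
            ((self, 1.0 / other.val), (other, -self.val / other.val ** 2)),
        )

    def backward(self):
        # Depth-first post-order: every node is listed after all of its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(output)/d(output)
        # Walk from the output back to the leaves, pushing gradients to parents.
        for node in reversed(order):
            for parent, local in node._parents:
                parent.grad += local * node.grad


def sin(x):
    return Value(math.sin(x.val), ((x, math.cos(x.val)),))


def exp(x):
    return Value(math.exp(x.val), ((x, math.exp(x.val)),))


x1 = Value(1.5)
x2 = Value(0.5)
f = (sin(x2) + 1) / (x2 + exp(x1)) + x1 * x2
f.backward()
print("df/dx1:", x1.grad)
print("df/dx2:", x2.grad)
```

The value printed for df/dx2 should agree, up to floating-point rounding, with the gradient that grad() reports above. The design difference is that this sketch visits each node once in reverse topological order and accumulates into `.grad`, whereas the recursive grad() re-walks shared subgraphs once per path.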