python - Why do these two lines of code give me different outputs in PyTorch, and does this explain the strange parameters?
Problem description
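I am using PyTorch to run a linear regression with two inputs and two outputs. I generated a dataset and then added noise so that I could practice estimating the weights and biases. All of this is part of an online course I am taking.
The weights the model learns are completely different from the weights I used to build the dataset, but the biases are the same. This is confusing because the loss is essentially zero (about 10^-7).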
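I have put question marks around the code in question. The output it produces is shown below. As you can see, the trained model finds the parameters
'Linear.weight': tensor([[1.3696, 0.6309], [1.3159, 0.6847]]), 'Linear.bias': tensor([ 1.0000, -1.0000])
which are very different from the ones I used to create the dataset:
'Linear.weight': tensor([[1.0, -1.0], [1.0, 3.0]]), 'Linear.bias': tensor([ 1.0, -1.0])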
If I run the line below, I get the y values I am trying to predict:
print(data_set.y)
If I run the line below, I get the y values predicted by the model, which match the output above:
print(model(data_set.x))
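I expected the line below to give the same result as the line above, but judging by the output they are very different. I am really confused.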
print(torch.mm(data_set.x, model.state_dict()['Linear.weight']) + model.state_dict()['Linear.bias'])
import torch
import numpy as np
import matplotlib.pyplot as plt
from torch import nn, optim
from torch.utils.data import Dataset, DataLoader

torch.manual_seed(1)

# Generating the dataset
class Data(Dataset):
    def __init__(self):
        self.x = torch.zeros(20, 2)
        self.x[:, 0] = torch.arange(-1, 1, 0.1)
        self.x[:, 1] = torch.arange(-1, 1, 0.1)
        self.w = torch.tensor([[1.0, -1.0], [1.0, 3.0]])
        self.b = torch.tensor([[1.0, -1.0]])
        self.f = torch.mm(self.x, self.w) + self.b
        self.y = self.f + 0.001 * torch.randn((self.x.shape[0], 1))
        self.len = self.x.shape[0]

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return self.len

# creating a dataset object
data_set = Data()

# creating a linear_regression class
class linear_regression(nn.Module):
    def __init__(self, input_size, output_size):
        super(linear_regression, self).__init__()
        self.Linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        yhat = self.Linear(x)
        return yhat

# creating a model object, an optimizer object, loss function, and dataloader
model = linear_regression(2, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
train_loader = DataLoader(dataset=data_set, batch_size=5)

# List for storing loss values and creating epochs variable
LOSS = []
epochs = 100

# for loop that trains the model
for epoch in range(epochs):
    for x, y in train_loader:
        # make a prediction
        yhat = model(x)
        # calculate the loss
        loss = criterion(yhat, y)
        # store loss/cost
        LOSS.append(loss.item())
        # clear gradient
        optimizer.zero_grad()
        # Backward pass: compute gradient of the loss with respect to all the learnable parameters
        loss.backward()
        # the step function on an Optimizer makes an update to its parameters
        optimizer.step()

# ?????????????????????????????????????
print("model parameters: ", model.state_dict())
print(data_set.y)
print(model(data_set.x))
print(torch.mm(data_set.x, model.state_dict()['Linear.weight']) + model.state_dict()['Linear.bias'])
# ?????????????????????????????????????

plt.plot(LOSS)
plt.xlabel("iterations ")
plt.ylabel("Cost/total loss ")
plt.show()
Output:
model parameters: OrderedDict([('Linear.weight', tensor([[1.3696, 0.6309],
[1.3159, 0.6847]])), ('Linear.bias', tensor([ 1.0000, -1.0000]))])
tensor([[-1.0015e+00, -3.0015e+00],
[-8.0075e-01, -2.8008e+00],
[-6.0065e-01, -2.6007e+00],
[-4.0161e-01, -2.4016e+00],
[-1.9913e-01, -2.1991e+00],
[ 2.4440e-04, -1.9998e+00],
[ 1.9934e-01, -1.8007e+00],
[ 4.0081e-01, -1.5992e+00],
[ 6.0044e-01, -1.3996e+00],
[ 8.0117e-01, -1.1988e+00],
[ 1.0018e+00, -9.9823e-01],
[ 1.1999e+00, -8.0010e-01],
[ 1.4001e+00, -5.9994e-01],
[ 1.5994e+00, -4.0062e-01],
[ 1.7992e+00, -2.0080e-01],
[ 1.9999e+00, -1.3162e-04],
[ 2.1992e+00, 1.9920e-01],
[ 2.4003e+00, 4.0034e-01],
[ 2.6003e+00, 6.0028e-01],
[ 2.8017e+00, 8.0172e-01]])
tensor([[-1.0005e+00, -3.0005e+00],
[-8.0048e-01, -2.8005e+00],
[-6.0043e-01, -2.6004e+00],
[-4.0037e-01, -2.4004e+00],
[-2.0031e-01, -2.2003e+00],
[-2.5582e-04, -2.0003e+00],
[ 1.9980e-01, -1.8002e+00],
[ 3.9986e-01, -1.6001e+00],
[ 5.9991e-01, -1.4001e+00],
[ 7.9997e-01, -1.2000e+00],
[ 1.0000e+00, -9.9997e-01],
[ 1.2001e+00, -7.9991e-01],
[ 1.4001e+00, -5.9986e-01],
[ 1.6002e+00, -3.9980e-01],
[ 1.8003e+00, -1.9974e-01],
[ 2.0003e+00, 3.1275e-04],
[ 2.2004e+00, 2.0037e-01],
[ 2.4004e+00, 4.0043e-01],
[ 2.6005e+00, 6.0048e-01],
[ 2.8005e+00, 8.0054e-01]], grad_fn=<AddmmBackward>)
tensor([[-1.6855, -2.3156],
[-1.4169, -2.1840],
[-1.1484, -2.0525],
[-0.8798, -1.9209],
[-0.6113, -1.7893],
[-0.3427, -1.6578],
[-0.0742, -1.5262],
[ 0.1944, -1.3947],
[ 0.4629, -1.2631],
[ 0.7315, -1.1315],
[ 1.0000, -1.0000],
[ 1.2686, -0.8684],
[ 1.5371, -0.7368],
[ 1.8057, -0.6053],
[ 2.0742, -0.4737],
[ 2.3428, -0.3422],
[ 2.6113, -0.2106],
[ 2.8799, -0.0790],
[ 3.1484, 0.0525],
[ 3.4170, 0.1841]])
Solution
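No answer text survives here, so what follows is only a plausible reading of the outputs in the question, not a recovered answer. Two things appear to be going on. First, `nn.Linear` stores its weight with shape `(out_features, in_features)` and computes `x @ weight.T + bias`, so the manual check in the question would need a transpose to match `model(x)`. Second, both columns of `x` are identical, so the data only constrains the sum of each weight row, and many different weight matrices fit it equally well. The sketch below checks both points using the learned values copied from the question's output; the names `layer`, `w_data` and `W_true` are introduced here purely for illustration.

```python
import torch
from torch import nn

# Inputs built as in the question: both columns of x are identical.
t = torch.arange(-1, 1, 0.1)
x = torch.stack([t, t], dim=1)

# A fresh layer loaded with the learned values reported in the question.
layer = nn.Linear(2, 2)
with torch.no_grad():
    layer.weight.copy_(torch.tensor([[1.3696, 0.6309], [1.3159, 0.6847]]))
    layer.bias.copy_(torch.tensor([1.0, -1.0]))

W = layer.weight.detach()  # shape (out_features, in_features)
b = layer.bias.detach()

out_model = layer(x).detach()
out_transposed = x @ W.T + b           # what nn.Linear computes
out_untransposed = torch.mm(x, W) + b  # the manual check from the question

matches_with_transpose = torch.allclose(out_model, out_transposed, atol=1e-6)
matches_without_transpose = torch.allclose(out_model, out_untransposed, atol=1e-3)
print("matches with transpose:   ", matches_with_transpose)
print("matches without transpose:", matches_without_transpose)

# The dataset applies its weights as x @ w_data, so the equivalent matrix
# in nn.Linear's (out, in) layout is the transpose.
w_data = torch.tensor([[1.0, -1.0], [1.0, 3.0]])
W_true = w_data.T

# With x[:, 0] == x[:, 1], each output depends only on the sum of its weight row.
row_sums_learned = W.sum(dim=1)
row_sums_true = W_true.sum(dim=1)
print("learned row sums:", row_sums_learned)
print("true row sums:   ", row_sums_true)

# Noise-free targets from the dataset weights, compared with the layer's output.
y_clean = x @ w_data + b
fits_data = torch.allclose(out_model, y_clean, atol=1e-2)
print("learned weights reproduce the targets:", fits_data)
```

If this reading is right, the near-zero loss and the unfamiliar weights are consistent with each other: the model recovered the row sums (about 2 for each output) and the biases, which is all this dataset can pin down. Making the two input columns differ, for example by drawing one of them at random, should let training recover the individual weights, which would then appear in `state_dict()` as the transpose of the matrix used in `Data`.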