python - model.parameters() not updating in PyTorch linear regression
Problem description
I'm new to deep learning with PyTorch. I'm using Kaggle's house price dataset here, and I tried sampling the first 50 rows. But when I train, model.parameters() does not update. Can anyone help?
import torch
import numpy as np
from torch.utils.data import TensorDataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F

inputs = np.array(label_X_train[:50])
targets = np.array(train_y[:50])

# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets)
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
model = nn.Linear(10, 1)

# Define loss function
loss_fn = F.mse_loss

# Optimizer
opt = torch.optim.SGD(model.parameters(), lr=1e-5)

num_epochs = 100
model.train()
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:
        # 1. Generate predictions
        pred = model(xb.float())
        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
        # 3. Compute gradients
        loss.backward()
        # 4. Update parameters using gradients
        opt.step()
        # 5. Reset the gradients to zero
        opt.zero_grad()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
Solution
The weights do update, but you aren't capturing the change correctly. model.weight.data is a torch tensor, and a variable name is just a reference to it, so setting w = model.weight.data does not create a copy but another reference to the same object. Changing model.weight.data therefore also changes w.

So by setting w = model.weight.data and w_new = model.weight.data in different parts of the loop, you assign two references to the same object, which makes their values always equal.

To see how the model weights actually change, either print(model.weight.data) before and after the loop (you have a linear layer with only 10 parameters, so this is still feasible), or simply set w = model.weight.data.clone(). In that case your output will be:
tensor([[False, False, False, False, False, False, False, False, False, False]])
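The reference-versus-copy behavior can be seen in isolation with a minimal sketch (the variable names here are illustrative, not from the original post): an in-place update, like the one opt.step() performs, propagates to every reference but not to a clone.

```python
import torch

t = torch.zeros(3)
alias = t          # same underlying tensor, just another name
copy = t.clone()   # an independent copy of the data
t += 1.0           # in-place update, analogous to opt.step()
print(alias)       # tensor([1., 1., 1.]) - changed along with t
print(copy)        # tensor([0., 0., 0.]) - unaffected
```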
Here is an example showing that your weights are changing:
import torch
import numpy as np
from torch.utils.data import TensorDataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F

inputs = np.random.rand(50, 10)
targets = np.random.randint(0, 2, 50)

# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets.squeeze())
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
model = nn.Linear(10, 1)

# Define loss function
loss_fn = F.mse_loss

# Optimizer
opt = torch.optim.SGD(model.parameters(), lr=1e-1)

num_epochs = 100
model.train()
w = model.weight.data.clone()  # real copy, not another reference
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:
        # 1. Generate predictions
        pred = model(xb.float())
        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
        # 3. Compute gradients
        loss.backward()
        # 4. Update parameters using gradients
        opt.step()
        # 5. Reset the gradients to zero
        opt.zero_grad()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
print(w == model.weight.data)
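Beyond an element-wise comparison, one way to verify that training moved the parameters is to snapshot every parameter with clone() before the loop and measure how far each one moved afterwards. This sketch is an assumption about how you might structure such a check, not part of the original answer; the data and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(10, 1)
# Snapshot every parameter before training; clone() makes real copies.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

opt = torch.optim.SGD(model.parameters(), lr=1e-1)
x = torch.rand(50, 10)
y = torch.rand(50, 1)
for _ in range(10):
    loss = F.mse_loss(model(x), y)
    loss.backward()
    opt.step()
    opt.zero_grad()

# Report the L2 distance each parameter moved from its snapshot.
for name, p in model.named_parameters():
    delta = (p.detach() - before[name]).norm().item()
    print(f'{name}: moved by {delta:.6f}')
```

If any reported distance is exactly zero, that parameter genuinely did not update (e.g. its gradient was zero), which is a different problem from the aliasing issue above.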