Using torch.nn in PyTorch

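This article covers:

The torch.nn package
Defining a simple neural network architecture
Defining the optimizer and the loss function
Backpropagating gradients

The LeNet-5 architecture is used as the running example.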

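1. The torch.nn package

The torch.nn package is used to build networks.

The torch.nn.Module class serves as the base class for custom network classes.

torch.nn provides the standard neural network layers, such as convolutional and linear layers; each layer is itself an nn.Module.

The torch.nn.functional package provides the operations applied during the forward pass, such as convolution, dropout, and activation functions.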
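
To make the difference between layer objects and functional calls concrete, here is a minimal sketch. The tensor sizes and the variable names (x, conv, y_module, y_functional) are illustrative assumptions and do not come from the network defined later.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Dummy batch with layout (batch, channels, height, width); sizes are arbitrary.
x = torch.randn(1, 1, 8, 8)

# Module style: the layer object owns its weight and bias.
conv = nn.Conv2d(1, 4, kernel_size=3)
y_module = conv(x)

# Functional style: the same convolution, with the parameters passed explicitly.
y_functional = F.conv2d(x, conv.weight, conv.bias)

print(y_module.shape)
print(torch.allclose(y_module, y_functional))

# Stateless operations such as activations are usually called functionally.
activated = F.relu(torch.tensor([-1.0, 0.0, 2.0]))
print(activated)
```

Layers that hold learnable parameters are typically created once as modules, while parameter-free operations are applied through torch.nn.functional inside the forward pass.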

2. Defining the network

We will use the classes and packages above to define the network.

import torch
import torch.nn as nn
import torch.nn.functional as F

Next, we use torch.nn.Module to build the network.

Remember that we are reconstructing the LeNet-5 architecture; the full code follows the walkthrough.

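Code walkthrough:

In __init__, we first call the parent initializer through super(). This runs the setup in nn.Module, so that the layers assigned as attributes afterwards are registered as part of the model.

The network itself is built starting from the definition of self.conv1.

Input: 1 channel, 32*32

First comes a Conv2d(1, 6, (5, 5)) convolutional layer. The Conv2d arguments are (input channels, output channels, (kernel height, kernel width)). Output: 6 channels, 28*28, since 32 - 5 + 1 = 28.

After 2*2 max pooling: 6 channels, 14*14

Conv2d(6, 16, (5, 5)): output 16 channels, 10*10

After 2*2 max pooling: 16 channels, 5*5

Flatten: 16*5*5 = 400

Finally, three linear layers reduce this to a 10-dimensional output.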
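
The shape arithmetic above can be checked with a few lines of plain Python. This is a sketch under the assumptions used in the walkthrough (stride 1 and no padding for the convolutions, non-overlapping 2*2 pooling); the helper names conv_out and pool_out are hypothetical and are not PyTorch functions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output side length of non-overlapping pooling (stride equals kernel)."""
    return size // kernel


side = 32                       # input is 1 channel, 32*32
after_conv1 = conv_out(side, 5)         # 32 - 5 + 1 = 28
after_pool1 = pool_out(after_conv1, 2)  # 28 / 2 = 14
after_conv2 = conv_out(after_pool1, 5)  # 14 - 5 + 1 = 10
after_pool2 = pool_out(after_conv2, 2)  # 10 / 2 = 5

# conv2 has 16 output channels, so the flattened vector has 16 * 5 * 5 entries.
flat_features = 16 * after_pool2 * after_pool2
print(after_conv1, after_pool1, after_conv2, after_pool2, flat_features)
```

The final value is the in_features argument that the first linear layer must receive.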

Note that we only define the layers in __init__(); all the operations are carried out in forward().

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Conv2d arguments: (input channels, output channels, kernel size)
        self.conv1 = nn.Conv2d(1, 6, (5, 5))
        self.conv2 = nn.Conv2d(6, 16, (5, 5))
        # Linear input size: conv2 output channels x height x width
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # (in_features, out_features)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        x = x.view(x.size(0), -1)  # flatten the feature maps after the convolutions
        x = F.relu(self.fc1(x))  # apply the activation function
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


model = Net()  # instantiate the network
print(model)

Output:

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
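
As a cross-check on the printed layer sizes, the number of learnable parameters per layer can be counted. The sketch below rebuilds the five layers on their own in a dictionary; the names layers, counts, and total are illustrative assumptions, and the same counting works on the full model through model.parameters().

```python
import torch.nn as nn

# The five layers with the same sizes as in the printed model.
layers = {
    "conv1": nn.Conv2d(1, 6, (5, 5)),
    "conv2": nn.Conv2d(6, 16, (5, 5)),
    "fc1": nn.Linear(16 * 5 * 5, 120),
    "fc2": nn.Linear(120, 84),
    "fc3": nn.Linear(84, 10),
}

# numel() gives the number of elements in each weight or bias tensor.
counts = {
    name: sum(p.numel() for p in layer.parameters())
    for name, layer in layers.items()
}
total = sum(counts.values())

for name, count in counts.items():
    print(name, count)
print("total", total)
```

For example, conv1 has 6 kernels of size 1*5*5 plus 6 biases, which gives 156 parameters.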

3. Defining the optimizer and the loss function

To define an optimizer, we need to import torch.optim.

Many optimizers are available, such as RMSprop, Adam, SGD, and Adadelta.

# loss function and optimizer
import torch.optim as optim
loss_function = nn.MSELoss()
optimizer = optim.RMSprop(model.parameters(), lr=0.001)

Here the optimizer receives two arguments: the first is the parameters of the model we defined, and the second is the learning rate.
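
The other optimizers are constructed the same way. The following is a minimal sketch that uses SGD and a single update step; toy_model is a hypothetical stand-in for the LeNet-5 model, and the learning rate, momentum, and data sizes are arbitrary illustrative values.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical small model, used only to keep the example short.
toy_model = nn.Linear(4, 2)

# Same pattern as RMSprop: model parameters first, then hyperparameters.
optimizer = optim.SGD(toy_model.parameters(), lr=0.01, momentum=0.9)

weight_before = toy_model.weight.detach().clone()

# One update step on random data with a simple squared-output loss.
loss = toy_model(torch.randn(3, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

changed = not torch.equal(weight_before, toy_model.weight.detach())
print(changed)
```

Because the optimizer holds references to the tensors returned by parameters(), calling step() updates the model's weights in place.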

4. Input data and targets

Here we generate random data and run it through the network.

input = torch.rand(1, 1, 32, 32)
out = model(input)
print(out, out.shape)

Output:

tensor([[-0.0265, -0.0181,  0.1301,  0.0465,  0.0697,  0.0765, -0.0022,  0.0215,
          0.0908, -0.1489]], grad_fn=AddmmBackward) torch.Size([1, 10])

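Likewise, we need to define a ground-truth target (label), because it is required to compute the loss.

Once the target is defined, we reshape it to [1, 10].

This is because the target must have the same shape as the output so that nn.MSELoss can compare the two element by element.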
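
To show what the shape requirement means in practice, the sketch below computes the mean squared error both with nn.MSELoss and by hand. The tensors are random placeholders, and the names loss_fn and manual are assumptions made for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

out = torch.rand(1, 10)                # stands in for the network output
labels = torch.rand(10).view(1, -1)    # reshape the target to [1, 10]

loss_fn = nn.MSELoss()
loss = loss_fn(out, labels)

# With the default reduction, MSELoss is the mean of the squared differences.
manual = ((out - labels) ** 2).mean()

print(out.shape == labels.shape)
print(torch.allclose(loss, manual))
```

With matching shapes, each of the 10 outputs is compared with exactly one target value.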

# dummy targets
labels = torch.rand(10)
labels = labels.view(1, -1)  # reshape to the same shape as the output
print(labels, labels.shape)

Output:

tensor([[0.7737, 0.7730, 0.1154, 0.4876, 0.5071, 0.3506, 0.3078, 0.4576, 0.0926,
         0.1268]]) torch.Size([1, 10])

5. Backpropagation

loss = loss_function(out, labels)
loss.backward()
print(loss)

Output:

tensor(0.2090, grad_fn=MseLossBackward)
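
After backward() runs, each parameter's gradient is stored in its .grad attribute, and a second backward() call adds to that value instead of replacing it. The sketch below shows this on a two-element tensor; the tensor w and the loss sum(w^2) are illustrative assumptions chosen so the gradients are easy to verify by hand.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w = [2., 4.]
loss = (w ** 2).sum()
loss.backward()
first = w.grad.clone()

# Second backward pass without zeroing: the gradients add up to [4., 8.]
loss = (w ** 2).sum()
loss.backward()
second = w.grad.clone()

# Resetting the gradient in place returns it to [0., 0.]
w.grad.zero_()

print(first, second, w.grad)
```

This accumulation is the reason gradients have to be cleared between training iterations.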

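Note that we have used only a single input here.

In practice, the code above is used inside a training loop. Because PyTorch accumulates gradients across backward() calls, the gradients must be reset to zero on every iteration.

Zeroing the gradients is done with optimizer.zero_grad().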
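
Putting the pieces together, a training loop could look like the sketch below. It reuses the Net class, MSELoss, and RMSprop from the sections above and adds optimizer.step() to apply the update. The batch size of 4, the 5 iterations, and the random inputs and targets are assumptions made only for illustration; a real loop would iterate over a dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, (5, 5))
        self.conv2 = nn.Conv2d(6, 16, (5, 5))
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)


torch.manual_seed(0)
model = Net()
loss_function = nn.MSELoss()
optimizer = optim.RMSprop(model.parameters(), lr=0.001)

# Random stand-in data: 4 single-channel 32*32 images and 4 target vectors.
inputs = torch.rand(4, 1, 32, 32)
targets = torch.rand(4, 10)

losses = []
for step in range(5):
    optimizer.zero_grad()                # clear gradients from the previous step
    out = model(inputs)                  # forward pass
    loss = loss_function(out, targets)   # compare output with target
    loss.backward()                      # backpropagate
    optimizer.step()                     # update the weights
    losses.append(loss.item())

print(losses)
```

The order inside the loop matters: zero the gradients, run the forward pass, compute the loss, backpropagate, then step the optimizer.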
