PyTorch: Image Classification with nn Networks

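In this article:

Getting started with the MNIST handwritten digit dataset;
Building a simple neural network and tracking the training loss;
Classifying Fashion MNIST with the LeNet architecture;
Computing training and test accuracy and loss on Fashion MNIST;
Visualizing the results with plots.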

I. Getting started with the MNIST handwritten digit dataset

1) Loading the required packages

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import time
from torchvision import datasets

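torchvision is an important companion package to PyTorch; it can be used to download datasets.

torchvision.transforms helps us transform image pixel values (scaling and normalization).

2) Defining the transforms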

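transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))]
)

Code notes:

transforms.ToTensor() converts each image to a tensor and scales the pixel values to [0, 1].

transforms.Normalize((0.5,), (0.5,)) normalizes the data. MNIST has only one channel, hence ((0.5,), (0.5,)).

The 0.5 values set the per-channel mean and standard deviation: the first tuple holds the means, the second the standard deviations. Each pixel becomes (x - 0.5) / 0.5, so values in [0, 1] end up in [-1, 1].

For a 3-channel (RGB) image this becomes Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)).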
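
The arithmetic behind this normalization can be checked by hand. The sketch below uses plain Python rather than torchvision; normalize() is a hypothetical helper written only for illustration, and the three raw pixel values are arbitrary.

```python
# Sketch of the per-pixel arithmetic behind Normalize((0.5,), (0.5,)).
# normalize() is a hypothetical helper, not a torchvision function.
def normalize(pixel, mean=0.5, std=0.5):
    """Apply (pixel - mean) / std to one pixel value in [0, 1]."""
    return (pixel - mean) / std

# ToTensor() first scales 8-bit pixels from [0, 255] to [0, 1].
raw_pixels = [0, 128, 255]
scaled = [p / 255 for p in raw_pixels]

# Normalize then shifts and rescales each value.
normalized = [normalize(p) for p in scaled]

for raw, norm in zip(raw_pixels, normalized):
    print(f"raw={raw:3d} -> normalized={norm:+.3f}")
```

Black pixels land at -1, white pixels at +1, and mid-gray lands close to 0.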

3) Downloading the data

Download the MNIST dataset from torchvision.datasets:

# get the data
trainset = datasets.MNIST(
    root = './data',
    train = True,
    download = True,
    transform = transform
)
trainloader = torch.utils.data.DataLoader(
    trainset,
    batch_size = 4,
    shuffle = True
)
testset = datasets.MNIST(
    root = './data',
    train = False,
    download = True,
    transform = transform
)
testloader = torch.utils.data.DataLoader(
    testset,
    batch_size = 4,
    shuffle = False
)

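Code notes:

datasets.MNIST() takes a transform argument (transform=transform here), so the conversion to tensors and the normalization defined above are applied to each image as it is read from the dataset.

In trainloader and testloader, batch_size=4, so four samples are processed at a time.

MNIST has 60,000 training samples; with this batch size, trainloader yields 15,000 batches.

torch.utils.data.DataLoader groups the data into batches, which the steps below rely on.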
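
The batch count quoted above follows from simple arithmetic. The sketch below is plain Python, not the DataLoader implementation; iter_batches() and num_batches() are hypothetical helpers, and it assumes the last partial batch is kept, which is the DataLoader default (drop_last=False).

```python
import math

def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def num_batches(num_samples, batch_size):
    """Batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

train_batches = num_batches(60000, 4)  # MNIST training set
test_batches = num_batches(10000, 4)   # MNIST test set
print(train_batches, test_batches)

# Tiny demonstration on ten stand-in "samples".
toy_batches = list(iter_batches(list(range(10)), 4))
print(toy_batches)
```

The toy run shows that a dataset whose size is not a multiple of the batch size ends with a shorter final batch.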

4) Visualizing the images

Take the first batch, which contains 4 images, and visualize it:

# grab the first batch from the loader
batch = next(iter(trainloader))
print(batch[0].shape)  # batch[0] contains the image pixels -> tensors
print(batch[1])  # batch[1] contains the labels -> tensors
plt.figure(figsize=(12, 8))
for i in range(batch[0].shape[0]):
    plt.subplot(1, 4, i+1)
    plt.axis('off')
    plt.imshow(batch[0][i].reshape(28, 28), cmap='gray')
    plt.title(int(batch[1][i]))
# save once, after all subplots have been drawn
plt.savefig('digit_mnist.png')
plt.show()

II. Building a simple neural network and tracking the training loss

1) Defining the network

Net()

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=20,
                               kernel_size=5, stride=1)
        self.conv2 = nn.Conv2d(in_channels=20, out_channels=50,
                               kernel_size=5, stride=1)
        # 50 channels of 4x4 feature maps remain after two conv/pool stages
        self.fc1 = nn.Linear(in_features=800, out_features=500)
        self.fc2 = nn.Linear(in_features=500, out_features=10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        # flatten everything except the batch dimension
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

The network is small, but it is sufficient for handwritten digit recognition.
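
The in_features=800 of the first fully connected layer follows from the size of the feature maps after the two convolution and pooling stages. The sketch below traces that size in plain Python; conv_out() and pool_out() are hypothetical helpers that apply the usual output-size formula for layers without dilation.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel, stride):
    """Spatial output size of a max-pooling layer (no padding)."""
    return (size - kernel) // stride + 1

size = 28                                   # MNIST images are 28x28
size = conv_out(size, kernel=5)             # conv1, kernel 5, stride 1
size = pool_out(size, kernel=2, stride=2)   # first max_pool2d(x, 2, 2)
size = conv_out(size, kernel=5)             # conv2, kernel 5, stride 1
size = pool_out(size, kernel=2, stride=2)   # second max_pool2d(x, 2, 2)

out_channels = 50                           # out_channels of conv2
flattened = out_channels * size * size
print(size, flattened)
```

If the input resolution or a kernel size changes, the same trace gives the new in_features value for fc1.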

We usually want to train the network on a GPU.

2) Connecting to the GPU

Select the device:

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

Move the network to that device:

net = Net().to(device)
print(net)

Output:

Net(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=800, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=10, bias=True)
)

3) Defining the optimizer and loss function

Loss for classification: cross-entropy

Optimizer: SGD()

# loss function
criterion = nn.CrossEntropyLoss()
# optimizer
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
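
To make the loss value easier to interpret, here is a minimal sketch of what cross-entropy computes for a single sample: the negative log of the softmax probability of the true class. nn.CrossEntropyLoss takes raw scores (logits) and by default averages this quantity over the batch. The cross_entropy() function below is a hypothetical pure-Python stand-in, and the example logits are made up.

```python
import math

def cross_entropy(logits, label):
    """Negative log of the softmax probability assigned to `label`."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Ten equal scores: the model is guessing among 10 classes.
uniform = cross_entropy([0.0] * 10, label=3)

# A strongly peaked prediction that is right, then one that is wrong.
confident_right = cross_entropy([0.0, 0.0, 8.0], label=2)
confident_wrong = cross_entropy([0.0, 0.0, 8.0], label=0)

print(f"uniform guess:   {uniform:.3f}")
print(f"confident right: {confident_right:.3f}")
print(f"confident wrong: {confident_wrong:.3f}")
```

An untrained 10-class model sits near ln(10), roughly 2.3, so a training loss well below that indicates the network has learned something.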

4) Training the network on the data

Epochs: 10

Batch size: 4

def train(net):
    start = time.time()
    for epoch in range(10):
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            # move the inputs and labels to the selected device
            inputs, labels = data[0].to(device, non_blocking=True), data[1].to(device, non_blocking=True)
            optimizer.zero_grad()
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            if i % 5000 == 4999:  # every 5000 mini batches
                print('[Epoch %d, %5d Mini Batches] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss/5000))
                running_loss = 0.0
    end = time.time()
    print('Done Training')
    print('%0.2f minutes' % ((end - start) / 60))

train(net)

Code notes:

We print the average loss over every 5000 mini-batches; this helps us check whether the network is actually learning.

In addition, the .to(device, ...) calls at the top of the inner loop move both the data and the labels to the GPU (when one is available).
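
The logging pattern in the loop (accumulate, report the mean every N mini-batches, reset) can be isolated from the training code. The sketch below is a simplified illustration; windowed_means() is a hypothetical helper and the loss values are made up.

```python
def windowed_means(losses, window):
    """Mean of each full consecutive window of `window` loss values."""
    means = []
    running = 0.0
    for i, loss in enumerate(losses):
        running += loss
        # Same test as `i % 5000 == 4999` in the training loop.
        if i % window == window - 1:
            means.append(running / window)
            running = 0.0  # reset so windows do not overlap
    return means

# Made-up per-batch losses that shrink over time.
fake_losses = [1.0, 0.8, 0.6, 0.4, 0.3, 0.1]
means = windowed_means(fake_losses, window=2)
print(means)
```

Averaging over a window smooths out the noise of individual mini-batches, which matters with a batch size as small as 4.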

Output:

[Epoch 1,  5000 Mini Batches] loss: 0.303
[Epoch 1, 10000 Mini Batches] loss: 0.082
[Epoch 1, 15000 Mini Batches] loss: 0.066
[Epoch 2,  5000 Mini Batches] loss: 0.048
[Epoch 2, 10000 Mini Batches] loss: 0.042
[Epoch 2, 15000 Mini Batches] loss: 0.042
[Epoch 3,  5000 Mini Batches] loss: 0.030
[Epoch 3, 10000 Mini Batches] loss: 0.031
[Epoch 3, 15000 Mini Batches] loss: 0.027
[Epoch 4,  5000 Mini Batches] loss: 0.020
[Epoch 4, 10000 Mini Batches] loss: 0.022
[Epoch 4, 15000 Mini Batches] loss: 0.022
[Epoch 5,  5000 Mini Batches] loss: 0.015
[Epoch 5, 10000 Mini Batches] loss: 0.015
[Epoch 5, 15000 Mini Batches] loss: 0.018
[Epoch 6,  5000 Mini Batches] loss: 0.011
[Epoch 6, 10000 Mini Batches] loss: 0.012
[Epoch 6, 15000 Mini Batches] loss: 0.011
[Epoch 7,  5000 Mini Batches] loss: 0.008
[Epoch 7, 10000 Mini Batches] loss: 0.008
[Epoch 7, 15000 Mini Batches] loss: 0.010
[Epoch 8,  5000 Mini Batches] loss: 0.006
[Epoch 8, 10000 Mini Batches] loss: 0.006
[Epoch 8, 15000 Mini Batches] loss: 0.007
[Epoch 9,  5000 Mini Batches] loss: 0.004
[Epoch 9, 10000 Mini Batches] loss: 0.004
[Epoch 9, 15000 Mini Batches] loss: 0.005
[Epoch 10,  5000 Mini Batches] loss: 0.004
[Epoch 10, 10000 Mini Batches] loss: 0.002
[Epoch 10, 15000 Mini Batches] loss: 0.004
Done Training

The loss keeps decreasing, which shows that the network is indeed learning.

5) Evaluating the network on the test set

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        inputs, labels = data[0].to(device, non_blocking=True), data[1].to(device, non_blocking=True)
        outputs = net(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on test images: %0.3f %%' % (
    100 * correct / total))

Output:

Accuracy of the network on test images: 99.280 %

The evaluation runs on the test set.

No gradients need to be computed here, so the loop runs inside a torch.no_grad() block.
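
The accuracy computation itself is independent of PyTorch: take the index of the largest score in each row as the prediction and count how often it equals the label. The sketch below mirrors that logic in plain Python; argmax() and accuracy() are hypothetical helpers and the scores are made up.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)

def accuracy(score_rows, labels):
    """Percentage of rows whose highest-scoring class equals the label."""
    correct = sum(1 for row, label in zip(score_rows, labels)
                  if argmax(row) == label)
    return 100 * correct / len(labels)

# Made-up network outputs for a batch of 4 samples and 3 classes.
scores = [
    [0.1, 2.0, -1.0],   # predicts class 1
    [1.5, 0.2, 0.3],    # predicts class 0
    [0.0, 0.1, 3.0],    # predicts class 2
    [0.9, 0.8, 0.1],    # predicts class 0
]
labels = [1, 0, 2, 1]

acc = accuracy(scores, labels)
print(f"accuracy: {acc:.1f} %")
```

In the tutorial code, torch.max(outputs.data, 1) plays the role of argmax() for the whole batch at once.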

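III. Classifying Fashion MNIST with the LeNet architecture

Above we only tracked the loss; here we go further and compute the loss, the training accuracy, and the test accuracy.

We will also plot these metrics.

1) Importing packages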

import torch
import matplotlib.pyplot as plt
import numpy as np
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import time

2) Defining hyperparameters

# define constants
NUM_EPOCHS = 10
BATCH_SIZE = 4
LEARNING_RATE = 0.001

3) Defining the data transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))])

4) Loading the data

trainset = torchvision.datasets.FashionMNIST(root='./data', train=True,
                                             download=True,
                                             transform=transform)
testset = torchvision.datasets.FashionMNIST(root='./data', train=False,
                                            download=True,
                                            transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE,
                                          shuffle=True)
# evaluation does not depend on sample order, so no shuffling is needed
testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE,
                                          shuffle=False)

5) Visualizing the images:

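# Fashion MNIST class names, in label order 0-9
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot')
# grab the first batch from the loader
batch = next(iter(trainloader))
print(batch[0].shape)  # batch[0] contains the image pixels -> tensors
print(batch[1].shape)  # batch[1] contains the labels -> tensors
plt.figure(figsize=(12, 8))
for i in range(batch[0].shape[0]):
    # one row with one column per image in the batch
    plt.subplot(1, BATCH_SIZE, i+1)
    plt.axis('off')
    plt.imshow(batch[0][i].reshape(28, 28), cmap='gray')
    plt.title(classes[int(batch[1][i])])
# save once, after all subplots have been drawn
plt.savefig('fashion_mnist.png')
plt.show()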

6) Building the LeNet CNN

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6,
                               kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16,
                               kernel_size=5)
        # 16 channels of 4x4 feature maps remain after two conv/pool stages
        self.fc1 = nn.Linear(in_features=256, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, kernel_size=2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, kernel_size=2)
        # flatten everything except the batch dimension
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = LeNet()
print(net)

Output:

LeNet(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=256, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
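
To get a feel for how small this network is, the number of trainable parameters can be worked out from the printed layer shapes. The sketch below does the arithmetic in plain Python; conv_params() and linear_params() are hypothetical helpers, and every layer is assumed to have a bias term, as the printout indicates for the linear layers and as is the default for Conv2d.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return out_channels * in_channels * kernel * kernel + out_channels

def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features

# Layer shapes copied from the printed LeNet summary.
layers = {
    'conv1': conv_params(1, 6, 5),
    'conv2': conv_params(6, 16, 5),
    'fc1': linear_params(256, 120),
    'fc2': linear_params(120, 84),
    'fc3': linear_params(84, 10),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:5s} {count:6d}")
print(f"total {total:6d}")
```

Most of the parameters sit in fc1, the layer that connects the flattened feature maps to the first hidden layer.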

7) Loss function and optimizer

# loss function and optimizer
loss_function = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=LEARNING_RATE, momentum=0.9)
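
The momentum=0.9 argument changes how each gradient is applied. The sketch below illustrates the update rule on a single scalar parameter, following the form PyTorch documents for SGD without dampening, weight decay, or Nesterov momentum: the velocity accumulates gradients, and the parameter moves along the velocity. sgd_momentum_step() is a hypothetical helper, the objective f(w) = w^2 is a toy example, and lr=0.1 is chosen only to make the steps visible.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update on a scalar parameter."""
    velocity = momentum * velocity + grad   # accumulate past gradients
    param = param - lr * velocity           # step along the velocity
    return param, velocity

# Toy objective f(w) = w**2, whose gradient is 2*w.
w, v = 1.0, 0.0
history = [w]
for _ in range(3):
    grad = 2 * w
    w, v = sgd_momentum_step(w, grad, v, lr=0.1)
    history.append(w)

print([round(x, 3) for x in history])
```

Because the velocity carries over from earlier steps, the parameter moves further per step than plain SGD would with the same learning rate.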

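8) Training

# if GPU is available, then use GPU, else use CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
net.to(device)

Next, we define a simple but very important function.

It computes the accuracy on whichever loader it is given, so we can use it for both the training data and the validation (test) data.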

# function to calculate accuracy
def calc_acc(loader):
    correct = 0
    total = 0
    # evaluation only, so gradient tracking is switched off
    with torch.no_grad():
        for data in loader:
            inputs, labels = data[0].to(device), data[1].to(device)
            outputs = net(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    return (100 * correct) / total

Define the training procedure:

def train():
    epoch_loss = []
    train_acc = []
    test_acc = []
    for epoch in range(NUM_EPOCHS):
        running_loss = 0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            # set parameter gradients to zero
            optimizer.zero_grad()
            # forward pass
            outputs = net(inputs)
            loss = loss_function(outputs, labels)
            # backward pass and parameter update
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        # average loss per mini-batch (15000 batches when BATCH_SIZE is 4)
        avg_loss = running_loss / len(trainloader)
        epoch_loss.append(avg_loss)
        train_acc.append(calc_acc(trainloader))
        test_acc.append(calc_acc(testloader))
        print('Epoch: %d of %d, Train Acc: %0.3f, Test Acc: %0.3f, Loss: %0.3f'
              % (epoch+1, NUM_EPOCHS, train_acc[epoch], test_acc[epoch], avg_loss))

    return epoch_loss, train_acc, test_acc

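IV. Computing training and test accuracy and loss on Fashion MNIST

To record the training loss, training accuracy, and test accuracy for each epoch, train() keeps three lists: epoch_loss, train_acc, and test_acc.

It prints the values for each epoch and returns the three lists at the end.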

start = time.time()
epoch_loss, train_acc, test_acc = train()
end = time.time()
print('%0.2f minutes' % ((end - start) / 60))

Output:

Epoch: 1 of 10, Train Acc: 85.010, Test Acc: 84.390, Loss: 0.638
Epoch: 2 of 10, Train Acc: 87.400, Test Acc: 86.180, Loss: 0.370
...
Epoch: 9 of 10, Train Acc: 91.867, Test Acc: 89.220, Loss: 0.226
Epoch: 10 of 10, Train Acc: 91.943, Test Acc: 88.920, Loss: 0.217

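In the end, the test accuracy is about 89%.

Note that the test set here plays the role usually given to a validation set: it is evaluated after every epoch to monitor training.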

V. Visualizing the results with plots

Plotting the loss and accuracy makes the training process easier to analyze.

plt.figure()
plt.plot(epoch_loss)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.savefig('fashion_loss.png')
plt.show()

Loss per epoch

plt.figure()
plt.plot(train_acc)
plt.xlabel('Epoch')
plt.ylabel('Training Accuracy')
plt.savefig('fashion_train_acc.png')
plt.show()

Training accuracy

plt.figure()
plt.plot(test_acc)
plt.xlabel('Epoch')
plt.ylabel('Test Accuracy')
plt.savefig('fashion_test_acc.png')
plt.show()

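After 10 epochs, the loss settles at roughly 0.2.

This is of course not the best possible result, but it is acceptable for such a simple network.

The training accuracy is about 92%.

The test accuracy is about 89%.

A larger network would likely achieve better results.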
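
The difference between training and test accuracy can be read off the printed log. The sketch below uses only the four epochs shown in the output above (epochs 3 to 8 are elided there and are left out here as well); the dictionary layout is just an illustrative choice.

```python
# (train accuracy, test accuracy) in percent, copied from the printed log.
reported = {
    1: (85.010, 84.390),
    2: (87.400, 86.180),
    9: (91.867, 89.220),
    10: (91.943, 88.920),
}

# Gap between training and test accuracy, in percentage points.
gaps = {epoch: round(train - test, 3)
        for epoch, (train, test) in reported.items()}

for epoch, gap in gaps.items():
    print(f"epoch {epoch:2d}: train - test = {gap:.3f}")
```

In these four epochs the gap grows from under one point to about three points, so the later epochs improve the fit on the training data more than on the test data.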

Summary and conclusion

Hopefully, after reading this article you have a basic understanding of PyTorch. You can now try experimenting on new datasets. ^_^
