PyTorch CNN: get the label for a single image

Problem description

I am stuck on a function that is supposed to predict the label of a single image. I need to do this for a single image because I want to build a web app where a user can upload an image and get its prediction.

My CNN is based on the following model:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch 
        out = self(images)                  # Generate predictions
        loss = F.cross_entropy(out, labels) # Calculate loss
        return loss

    def validation_step(self, batch):
        images, labels = batch 
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}
    
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], train_loss: {:.4f}, val_loss: {:.4f}, val_acc: {:.4f}".format(
            epoch, result['train_loss'], result['val_loss'], result['val_acc']))

And the model itself:

class BrainTumorClassification(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size = 3, padding = 1),
            nn.ReLU(),
            nn.Conv2d(32,64, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.MaxPool2d(2,2),
        
            nn.Conv2d(64, 128, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(128 ,128, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.MaxPool2d(2,2),
            
            nn.Conv2d(128, 256, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(256,256, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.MaxPool2d(2,2),
            
            nn.Flatten(),
            nn.Linear(82944,1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512,6))
    
    def forward(self, xb):
        return self.network(xb)
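
For reference, the 82944 input features of the first linear layer correspond to the 256×18×18 feature map that a 150×150 input produces after three 2×2 max-pools. A minimal shape check, assuming the model class above is defined:

import torch

# Quick shape check: a 150x150 RGB input goes through three 2x2 max-pools
# (150 -> 75 -> 37 -> 18), giving a 256 x 18 x 18 feature map,
# i.e. the 82944 features the first nn.Linear expects.
model = BrainTumorClassification()
dummy = torch.randn(1, 3, 150, 150)   # batch of one random "image"
print(model(dummy).shape)             # torch.Size([1, 6]) -- one logit per output of the last layer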

The function I tried to implement for testing a single image is the following:

from PIL import Image
from torchvision import transforms
from torch.autograd import Variable

transformer = transforms.Compose([
     transforms.Resize((150,150)), transforms.ToTensor()])
def classify(image_path,image_transforms, classes):
    image = Image.open(image_path)
    image_tensor = image_transforms(image).float()
    image_tensor = image_tensor.unsqueeze_(0)
    input = Variable(image_tensor)
    output = model(input)
    index = output.data.numpy().argmax()
    pred = classes[index]
    return pred

I get an error:

`pred=classes[index]` index out of range

I should mention that classes has 4 elements: ['glioma_tumor', 'meningioma_tumor', 'no_tumor', 'pituitary_tumor']

Tags: pytorch, conv-neural-network

Solution


A few points to note:

  • Don't forget that model has to be initialized before you call it, i.e. instantiate it and load your trained weights (a minimal sketch follows this list).
  • Variable is deprecated and you shouldn't use it. Gradients are tracked on tensors that have the requires_grad flag set. Since you are only running inference here, you can use the torch.no_grad context to avoid retaining parameter activations. This will also speed up inference.
  • With torch.Tensor.unsqueeze_ you don't have to reassign the result, because the input itself is modified by the function. As a general note, all torch.Tensor functions with a _ suffix are in-place operators.
  • Most importantly, you mentioned having only 4 classes, yet your last fully connected layer outputs 6 logits. You need to change it to 4, i.e. nn.Linear(512, 4).
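
For the first point, a minimal sketch of preparing the model before inference (the checkpoint filename is hypothetical):

import torch

# Minimal sketch: instantiate the model (with the last layer changed to
# nn.Linear(512, 4) as noted above), load the trained weights, and switch
# to evaluation mode. The checkpoint path is a placeholder.
model = BrainTumorClassification()
model.load_state_dict(torch.load('brain_tumor_classifier.pth', map_location='cpu'))
model.eval()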

Here is a possible modification:

from PIL import Image
import torch
from torchvision import transforms

transformer = transforms.Compose([transforms.Resize((150,150)),
                                  transforms.ToTensor()])

@torch.no_grad()
def classify(image_path,image_transforms, classes):
    image = Image.open(image_path)
    image_tensor = image_transforms(image)
    image_tensor.unsqueeze_(0)
    output = model(image_tensor)
    index = output.data.numpy().argmax()
    pred = classes[index]
    return pred
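
And a hypothetical usage example; the image path is a placeholder for whatever file the user uploads in the web app:

classes = ['glioma_tumor', 'meningioma_tumor', 'no_tumor', 'pituitary_tumor']

# Hypothetical call; in the web app the path would come from the uploaded image.
label = classify('uploads/scan_001.jpg', transformer, classes)
print(label)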
