I have two folders, each containing one image. I need to compare the two images and find the differences

Problem description

I have two folders, with one image in each. I need to compare the two images and find the differences, but the code I have picks images from the folders at random.

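import random

import numpy as np
import PIL.ImageOps
import torch
from PIL import Image
from torch.utils.data import Dataset

class InferenceSiameseNetworkDataset(Dataset):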
    
    def __init__(self,imageFolderDataset,transform=None,should_invert=True):
        self.imageFolderDataset = imageFolderDataset    
        self.transform = transform
        self.should_invert = should_invert
        
    def __getitem__(self,index):
        img0_tuple = random.choice(self.imageFolderDataset.imgs)
        img1_tuple = random.choice(self.imageFolderDataset.imgs)
        #we need to make sure approx 50% of images are in the same class
        should_get_same_class = random.randint(0,1) 
        if should_get_same_class:
            while True:
                #keep looping till the same class image is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs) 
                if img0_tuple[1]==img1_tuple[1]:
                    break
        else:
            while True:
                #keep looping till a different class image is found
                
                img1_tuple = random.choice(self.imageFolderDataset.imgs) 
                if img0_tuple[1] !=img1_tuple[1]:
                    break

        img0 = Image.open(img0_tuple[0])
        img1 = Image.open(img1_tuple[0])
        img0 = img0.convert("L")
        img1 = img1.convert("L")
        
        if self.should_invert:
            img0 = PIL.ImageOps.invert(img0)
            img1 = PIL.ImageOps.invert(img1)

        if self.transform is not None:
            img0 = self.transform(img0)
            img1 = self.transform(img1)
        
        return img0, img1 , torch.from_numpy(np.array([int(img1_tuple[1]!=img0_tuple[1])],dtype=np.float32))
    
    def __len__(self):
        return len(self.imageFolderDataset.imgs)

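I took this code from GitHub. When I try to compare two images, it selects the images at random. The input is two folders: one image should come from one folder and the other image from the other folder. When I test it, it sometimes compares an image with itself; that is, it does not check the image in the other folder.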

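import torch
import torch.nn.functional as F
import torchvision
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader

# `net` (the trained Siamese model) and `imshow` (a plotting helper) are assumed to be defined elsewhere.
testing_dir1 = '/content/drive/My Drive/Signature Dissimilarity/Forged_Signature_Verification/processed_dataset/training1/'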
folder_dataset_test = dset.ImageFolder(root=testing_dir1)
siamese_dataset = InferenceSiameseNetworkDataset(imageFolderDataset=folder_dataset_test,
                                        transform=transforms.Compose([transforms.Resize((100,100)),
                                                                      transforms.ToTensor()
                                                                      ])
                                       ,should_invert=False)

test_dataloader = DataLoader(siamese_dataset,num_workers=6,batch_size=1,shuffle=False)
dataiter = iter(test_dataloader)
x0,_,_ = next(dataiter)

for i in range(2):
  _,x1,label2 = next(dataiter)
  concatenated = torch.cat((x0,x1),0)
  
  output1,output2 = net(Variable(x0).cuda(),Variable(x1).cuda())
  euclidean_distance = F.pairwise_distance(output1, output2)
  imshow(torchvision.utils.make_grid(concatenated),'Dissimilarity: {:.2f}'.format(euclidean_distance.item()))
  dis = 'Dissimilarity: {:.2f}'.format(euclidean_distance.item())
  dis1 = dis
  dis1 = dis1.replace("Dissimilarity:", "").replace(" ", "")
  print(dis)
  if float(dis1) < 0.5:
    print("It's Same Signature")
  else:
    print("It's Forged Signature")

Tags: machine-learning, pytorch, artificial-intelligence, conv-neural-network, siamese-network

Solution


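Simply set should_get_same_class = 0 in the __getitem__ method of the custom dataset class InferenceSiameseNetworkDataset. That way the second image is always drawn from a different class than the first, so the two images are guaranteed to come from different classes/folders.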
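
As a rough illustration of that change, the pair selection can be pulled out into a small helper. This is only a sketch: the function name pick_cross_folder_pair and the file names are hypothetical, and instead of the original "loop until a different class is found" it filters the candidates directly, which avoids an endless loop when only one folder is present. The imgs list has the same (path, class_index) shape as ImageFolder.imgs.

```python
import random


def pick_cross_folder_pair(imgs, rng=random):
    """Return (img0_tuple, img1_tuple) whose class indices differ.

    `imgs` is a list of (path, class_index) tuples, the same shape as
    torchvision's ImageFolder.imgs.
    """
    img0_tuple = rng.choice(imgs)
    # Same effect as should_get_same_class = 0: only images from another
    # class (folder) are eligible as the second image.
    candidates = [t for t in imgs if t[1] != img0_tuple[1]]
    if not candidates:
        raise ValueError("need images from at least two folders")
    img1_tuple = rng.choice(candidates)
    return img0_tuple, img1_tuple


# Hypothetical file names: one image per folder, as in the question.
imgs = [("genuine/a.png", 0), ("forged/b.png", 1)]
img0, img1 = pick_cross_folder_pair(imgs)
```

Inside __getitem__, the two returned tuples would take the place of img0_tuple and img1_tuple; the rest of the method (opening, converting, transforming) can stay as it is.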

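Second, you should not concatenate samples taken from two separate batches, because such a pair may not satisfy your condition. Instead, use x0, x1, label2 = next(dataiter) inside the loop so that both images come from the same batch, and only then concatenate them.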
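
The loop structure this implies might look like the sketch below. It is deliberately framework-agnostic so it can run on its own: compare_pairs, embed_pair and distance are hypothetical names, and the toy numbers stand in for image tensors. With the objects from the question, pairs would presumably be test_dataloader, embed_pair would wrap net(x0.cuda(), x1.cuda()), and distance would wrap F.pairwise_distance(...).item(). The 0.5 threshold is the one used in the question.

```python
from itertools import islice


def compare_pairs(pairs, embed_pair, distance, threshold=0.5, max_pairs=2):
    """Score up to `max_pairs` (x0, x1, label) triples.

    Both images of a pair are unpacked from the same triple, so x0 and x1
    always belong together.
    """
    results = []
    for x0, x1, label in islice(pairs, max_pairs):
        out0, out1 = embed_pair(x0, x1)
        d = distance(out0, out1)
        verdict = "same" if d < threshold else "forged"
        results.append((d, verdict))
    return results


# Toy stand-ins: numbers instead of image tensors, identity "network",
# absolute difference instead of Euclidean distance.
toy_pairs = [(0.1, 0.2, 0), (0.0, 0.9, 1), (0.3, 0.3, 0)]
results = compare_pairs(
    toy_pairs,
    embed_pair=lambda a, b: (a, b),
    distance=lambda a, b: abs(a - b),
)
for d, verdict in results:
    print("Dissimilarity: {:.2f} ({})".format(d, verdict))
```

The important difference from the code in the question is that x0 is no longer fetched once before the loop and reused against x1 values from later batches.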

