Defect Detection 3. CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

my-love-is-python 2021-09-14 18:17

 

 

Abstract

We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes at a random location of a large image. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects. We bring the improvement upon previous arts by 3.1 AUCs when learning representations from scratch. By transfer learning on pretrained representations on ImageNet, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches to allow localizing defective areas without annotations during training.

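Abstract (paraphrase)

The goal is a high-performance defect detection model that can find unknown anomalous patterns in an image without ever seeing anomalous data. To this end, the authors propose a two-stage framework that builds the anomaly detector from normal training data only. The first stage learns self-supervised deep representations; the second builds a generative one-class classifier on top of them. The representations are learned by classifying normal images against CutPaste images, where CutPaste is a simple data augmentation that cuts a patch from an image and pastes it back at a random location of the larger image. Experiments on the MVTec anomaly detection dataset show that the method detects many kinds of real-world defects. When training from scratch, it improves on previous methods by 3.1 AUC. With transfer learning from ImageNet-pretrained representations, it reaches a new state-of-the-art 96.6 AUC. Finally, the framework is extended to learn and extract representations from patches, which makes it possible to localize defective regions without any annotations during training.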

2. A Framework for Anomaly Detection

In this section, we present our anomaly detection framework for high-resolution image with defects in local regions. Following [54], we adopt a two-stage framework for building an anomaly detector, where in the first stage we learn deep representations from normal data and then construct an one-class classifier using learned representations. Subsequently, in Section 2.1, we present a novel method for learning self-supervised representations by predicting CutPaste augmentation, and extend to learning and extracting representations from local patches in Section 2.4.

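A framework for anomaly detection (paraphrase)

This section presents the anomaly detection framework for high-resolution images whose defects lie in local regions. Following [54], the detector is built in two stages: first, deep representations are learned from normal data; then a one-class classifier is constructed on the learned representations. Section 2.1 introduces a new method that learns self-supervised representations by predicting the CutPaste augmentation, and Section 2.4 extends it to learning and extracting representations from local patches.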

 

 

This paper has two core ideas:

  The first is the CutPaste data augmentation.

  The second is using density estimation (a simple parametric Gaussian density estimator (GDE)) to compute anomaly scores.

 

1.0 CutPaste data augmentation

2.1 Self-Supervised Learning with CutPaste

We conjecture that geometric transformations [20, 24, 4], such as rotations and translations, are effective in learning representation of semantic concepts (e.g., objectness), but less of regularity (e.g., continuity, repetition). As shown in Figure 2(b), anomalous patterns of defect detection typically include irregularities such as cracks (bottle, wood) or twists (toothbrush, grid). Our aim is to design an augmentation strategy creating local irregular patterns. Then we train the model to identify these local irregularity with the hope that it can generalize to unseen real defects at test time.

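The authors conjecture that geometric transformations such as rotation and translation are effective for learning representations of semantic concepts (e.g., objectness) but less so for regularity (e.g., continuity, repetition). As Figure 2(b) shows, the anomalous patterns in defect detection typically involve irregularities such as cracks (bottle, wood) or twists (toothbrush, grid). The goal is therefore an augmentation strategy that creates local irregular patterns. The model is then trained to identify these local irregularities, in the hope that it generalizes to unseen real defects at test time.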

To further prevent learning naive decision rules for discriminating augmented images and encouraging the model to learn to detect irregularity, we propose the CutPaste augmentation as follows:

  1. Cut a small rectangular area of variable sizes and aspect ratios from a normal training image.

  2. Optionally, we rotate or jitter pixel values in the patch.

  3. Paste a patch back to an image at a random location.

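To keep the model from learning naive decision rules for telling augmented images apart, and to encourage it to learn to detect irregularity, the CutPaste augmentation is defined as follows:

      1. Cut a small rectangular area of variable size and aspect ratio from a normal training image.

      2. Optionally, rotate the patch or jitter its pixel values.

      3. Paste the patch back into the image at a random location.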

 

Code walkthrough for CutPasteNormal (a method body: img is a PIL image, self holds the augmentation settings):

import math
import random

import torch

# PIL's Image.size is (width, height)
w, h = img.size

# 1. Cut a patch
# patch area: a random fraction of the image area, drawn uniformly
# from [area_ratio[0], area_ratio[1]], here [0.02, 0.15]
ratio_area = random.uniform(self.area_ratio[0], self.area_ratio[1]) * w * h

# sample the aspect ratio in log space, between log(0.3) and log(1/0.3)
log_ratio = torch.log(torch.tensor((self.aspect_ratio, 1 / self.aspect_ratio)))
aspect = torch.exp(
    torch.empty(1).uniform_(log_ratio[0], log_ratio[1])
).item()

cut_w = int(round(math.sqrt(ratio_area * aspect)))
cut_h = int(round(math.sqrt(ratio_area / aspect)))

# one might also want to sample from other images. currently we only sample from the image itself
from_location_h = int(random.uniform(0, h - cut_h))  # top-left corner of the source box
from_location_w = int(random.uniform(0, w - cut_w))

box = [from_location_w, from_location_h, from_location_w + cut_w, from_location_h + cut_h]

patch = img.crop(box)  # crop the patch

patch.show()  # display the patch (for inspection only)

# 2. Random color jitter
if self.colorJitter:
    patch = self.colorJitter(patch)

# 3. Paste at a random location
to_location_h = int(random.uniform(0, h - cut_h))
to_location_w = int(random.uniform(0, w - cut_w))

insert_box = [to_location_w, to_location_h, to_location_w + cut_w, to_location_h + cut_h]
augmented = img.copy()
augmented.paste(patch, insert_box)

augmented.show()  # display the augmented image (for inspection only)
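The sampling logic above can be checked in isolation, without PIL or torch. The following is a minimal sketch rather than the original code: `sample_cutpaste_boxes` is a hypothetical helper name, the defaults `area_ratio=(0.02, 0.15)` and `aspect_ratio=0.3` are taken from the comments in the snippet above, and clamping the patch size to the image size is an added assumption that keeps both boxes inside the image.

```python
import math
import random


def sample_cutpaste_boxes(w, h, area_ratio=(0.02, 0.15), aspect_ratio=0.3, rng=random):
    """Return (source_box, target_box) as (left, upper, right, lower) tuples.

    Hypothetical helper: it mirrors the geometry of the CutPaste snippet
    but works on plain integers instead of a PIL image.
    """
    # Patch area as a random fraction of the image area.
    patch_area = rng.uniform(area_ratio[0], area_ratio[1]) * w * h

    # Log-uniform aspect ratio between aspect_ratio and 1 / aspect_ratio.
    log_lo, log_hi = math.log(aspect_ratio), math.log(1 / aspect_ratio)
    aspect = math.exp(rng.uniform(log_lo, log_hi))

    # Assumption of this sketch: clamp so the patch always fits the image.
    cut_w = min(w, max(1, int(round(math.sqrt(patch_area * aspect)))))
    cut_h = min(h, max(1, int(round(math.sqrt(patch_area / aspect)))))

    def random_box():
        left = int(rng.uniform(0, w - cut_w))
        upper = int(rng.uniform(0, h - cut_h))
        return (left, upper, left + cut_w, upper + cut_h)

    # Source and target boxes share the same size but not the same position.
    return random_box(), random_box()


if __name__ == "__main__":
    src, dst = sample_cutpaste_boxes(256, 256, rng=random.Random(0))
    print("cut from", src, "paste to", dst)
```

Both boxes are in the (left, upper, right, lower) order that PIL's `Image.crop` and `Image.paste` expect, so they could be passed to those calls directly.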

 

 

 

We show the CutPaste augmentation process in the orange dotted box of Figure 1 and more examples in Figure 2(e). Following the idea of rotation prediction [19], we define the training objective of the proposed self-supervised representation learning as follows:

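The CutPaste augmentation process is shown in the orange dotted box of Figure 1, with more examples in Figure 2(e). Following the idea of rotation prediction, the training objective of the proposed self-supervised representation learning is defined as follows.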
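Written with the symbols defined in the following paragraph, the objective from the paper (its Eq. 1) treats the task as binary classification: an unmodified image gets label 0 and its CutPaste version gets label 1.

```latex
\mathcal{L}_{\mathrm{CP}}
  = \mathbb{E}_{x \in X}
    \Big\{ \mathrm{CE}\big(g(x),\, 0\big)
         + \mathrm{CE}\big(g(\mathrm{CP}(x)),\, 1\big) \Big\}
```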

where X is the set of normal data, CP(·) is a CutPaste augmentation and g is a binary classifier parameterized by deep networks. CE(·, ·) refers to a cross-entropy loss. In practice, data augmentations, such as translation or color jitter, are applied before feeding x into g or CP.

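Here X is the set of normal data, CP is the CutPaste augmentation, and g is a binary classifier parameterized by a deep network. CE(·, ·) is the cross-entropy loss. In practice, standard data augmentations such as translation or color jitter are applied before x is fed into g or CP.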

 

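# xs appears to hold one batch per view (normal, CutPaste, ...);
# the class label of a sample is the index of the view it came from
y = torch.arange(len(xs), device=device)
# repeat each label once per image in a batch: [0, ..., 0, 1, ..., 1, ...]
y = y.repeat_interleave(xs[0].size(0))
loss = loss_fn(logits, y)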
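To make the label layout concrete, here is a minimal plain-Python sketch. It assumes (this is not stated in the snippet) that the views are concatenated in order before the forward pass, so that the rows of `logits` line up with the repeated labels. `make_labels` and `cross_entropy` are hypothetical helpers standing in for `torch.arange(...).repeat_interleave(...)` and for a mean-reduced cross-entropy loss function.

```python
import math


def make_labels(num_views, batch_size):
    """Label every sample with the index of the view it came from.

    Plain-Python stand-in for
    torch.arange(num_views).repeat_interleave(batch_size).
    """
    return [view for view in range(num_views) for _ in range(batch_size)]


def cross_entropy(logits, labels):
    """Mean cross-entropy over rows of raw (unnormalized) logits."""
    total = 0.0
    for row, label in zip(logits, labels):
        # log-sum-exp with the row maximum subtracted for numerical stability
        m = max(row)
        log_norm = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_norm - row[label]
    return total / len(labels)


if __name__ == "__main__":
    # Two views (normal, CutPaste) with a batch of 3 images per view.
    labels = make_labels(num_views=2, batch_size=3)
    print(labels)  # [0, 0, 0, 1, 1, 1]

    # Uninformative logits: the loss equals log(2) for two classes.
    logits = [[0.0, 0.0] for _ in labels]
    print(cross_entropy(logits, labels))
```

With three views (normal, CutPaste, CutPaste-Scar) the same layout gives the 3-way classification labels without any other change.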

 

2.2. CutPaste Variants

Multi-Class Classification. While CutPaste (large patch) and CutPaste-Scar share a similarity, the shapes of an image patch of two augmentations are very different. Empirically, they have their own advantages on different types of defects. To leverage the strength of both scales in the training, we formulate a finer-grained 3-way classification task among normal, CutPaste and CutPaste-Scar by treating CutPaste variants as two separate classes. Detailed study will be presented in Section 5.2.

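Multi-class classification. CutPaste (large patch) and CutPaste-Scar are similar in spirit, but the shapes of the patches they produce are very different. Empirically, each has its own advantages on different types of defects. To exploit the strengths of both scales during training, the authors formulate a finer-grained 3-way classification task among normal, CutPaste and CutPaste-Scar, treating the two CutPaste variants as separate classes. Details are given in Section 5.2.

The CutPaste code is given above; the CutPaste-Scar code follows. Because the patch is small and the paste location is fairly random, the patch may well end up pasted outside the target image.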

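import random

# PIL's Image.size is (width, height)
w, h = img.size

# 1. Cut a thin patch; width and height are sampled from the given ranges
cut_w = random.uniform(*self.width)
cut_h = random.uniform(*self.height)

from_location_h = int(random.uniform(0, h - cut_h))
from_location_w = int(random.uniform(0, w - cut_w))

box = [from_location_w, from_location_h, from_location_w + cut_w, from_location_h + cut_h]
patch = img.crop(box)

# 2. Color jitter and random rotation
if self.colorJitter:
    patch = self.colorJitter(patch)

# rotate; RGBA keeps an alpha channel that marks the rotated patch,
# and expand=True grows the canvas so the corners are not clipped
rot_deg = random.uniform(*self.rotation)
patch = patch.convert("RGBA").rotate(rot_deg, expand=True)

# 3. Paste; patch.size is (width, height) of the rotated patch
to_location_h = int(random.uniform(0, h - patch.size[1]))
to_location_w = int(random.uniform(0, w - patch.size[0]))

# use the alpha channel as the paste mask so only the rotated patch is copied
mask = patch.split()[-1]
patch = patch.convert("RGB")

augmented = img.copy()
augmented.paste(patch, (to_location_w, to_location_h), mask=mask)

augmented.show()  # display the augmented image (for inspection only)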

2.0 Computing anomaly scores with density estimation (a simple parametric Gaussian density estimator (GDE)) and assessing anomaly locations

There exist various ways to compute anomaly scores via one-class classifiers. In this work, we build generative classifiers like kernel density estimator [52] or Gaussian density estimator [43], on representations f. Below, we explain how to compute anomaly scores and the trade-offs. Although nonparametric KDE is free from distribution assumptions, it requires many examples for accurate estimation [58] and could be computationally expensive. With limited normal training examples for defect detection, we consider a simple parametric Gaussian density estimator (GDE) whose log-density is computed as follows:

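There are several ways to compute anomaly scores with a one-class classifier. This work builds a generative classifier, such as a kernel density estimator or a Gaussian density estimator, on the representations f. The passage explains how the anomaly scores are computed and what the trade-offs are: nonparametric KDE makes no distributional assumptions, but it needs many examples for accurate estimation and can be computationally expensive. Since defect detection offers only a limited number of normal training examples, the paper considers a simple parametric Gaussian density estimator (GDE), whose log-density is computed as follows.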
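For a Gaussian with mean μ and covariance Σ estimated from the representations f(x) of normal training images, the log-density is, up to an additive constant, minus one half of the squared Mahalanobis distance:

```latex
\log p_{\mathrm{gde}}(x) \;\propto\;
  -\tfrac{1}{2}\,\big(f(x) - \mu\big)^{\top}\,\Sigma^{-1}\,\big(f(x) - \mu\big)
```

A low log-density, that is, a large Mahalanobis distance from the normal data, marks an input as anomalous.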

 

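import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

# KDE variant: choose the bandwidth by cross-validation on the normal
# training embeddings
params = {'bandwidth': np.logspace(-10, 10, 50)}
grid = GridSearchCV(KernelDensity(kernel='gaussian'), params)
grid.fit(train_embed)

print("best bandwidth: {0}".format(grid.best_estimator_.bandwidth))

# use the best estimator (refit on train_embed by GridSearchCV) to score
# the test embeddings; a fixed-bandwidth alternative is
# KernelDensity(kernel='gaussian', bandwidth=1).fit(train_embed)
kde = grid.best_estimator_
scores = kde.score_samples(embeds)  # log-density: lower means more anomalous
print(scores)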
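The snippet above is the nonparametric KDE alternative. The GDE that the text settles on needs only a mean and a covariance. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: `fit_gde` and `anomaly_score` are hypothetical names, the `eps` added to the covariance diagonal is an assumption to keep it positive definite, and in practice one would use numpy or torch instead of nested lists.

```python
import math


def fit_gde(features, eps=1e-6):
    """Fit the mean and the Cholesky factor of the covariance of row vectors.

    eps is added to the diagonal (an assumption of this sketch) so the
    covariance stays positive definite when samples are scarce.
    """
    n, d = len(features), len(features[0])
    mean = [sum(row[j] for row in features) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in features) / (n - 1)
            + (eps if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    # Cholesky decomposition cov = L L^T with lower-triangular L.
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return mean, chol


def anomaly_score(x, mean, chol):
    """Half the squared Mahalanobis distance, i.e. the negative log-density
    up to an additive constant; larger means more anomalous."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves L z = diff, so diff^T cov^-1 diff = z . z
    z = []
    for i, di in enumerate(diff):
        z.append((di - sum(chol[i][k] * z[k] for k in range(i))) / chol[i][i])
    return 0.5 * sum(v * v for v in z)


if __name__ == "__main__":
    # Toy 2-D "embeddings" of normal images (made-up numbers).
    train = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 3.0]]
    mean, chol = fit_gde(train)
    for point in ([1.5, 1.0], [10.0, -5.0]):
        print(point, anomaly_score(point, mean, chol))
```

Solving with the Cholesky factor avoids forming the inverse covariance explicitly, which is the usual way to evaluate a Mahalanobis distance stably.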

 

 

 

 
