Loss (binary_crossentropy) of this autoencoder architecture stalls around 0.601

Problem description

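I am working on an unsupervised image classification problem with a dataset of about 4,700 photos of carnivores. My plan is to build an autoencoder, extract image embeddings from it, and then compare the embeddings with cosine similarity. So far I am not seeing much improvement. Here is my autoencoder architecture: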

Model: "functional_75"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_19 (InputLayer)        [(None, 128, 128, 3)]     0         
_________________________________________________________________
conv2d_126 (Conv2D)          (None, 128, 128, 64)      1792      
_________________________________________________________________
max_pooling2d_54 (MaxPooling (None, 64, 64, 64)        0         
_________________________________________________________________
conv2d_127 (Conv2D)          (None, 64, 64, 32)        18464     
_________________________________________________________________
max_pooling2d_55 (MaxPooling (None, 32, 32, 32)        0         
_________________________________________________________________
conv2d_128 (Conv2D)          (None, 32, 32, 16)        4624      
_________________________________________________________________
max_pooling2d_56 (MaxPooling (None, 16, 16, 16)        0         
_________________________________________________________________
conv2d_129 (Conv2D)          (None, 16, 16, 16)        2320      
_________________________________________________________________
up_sampling2d_54 (UpSampling (None, 32, 32, 16)        0         
_________________________________________________________________
conv2d_130 (Conv2D)          (None, 32, 32, 32)        4640      
_________________________________________________________________
up_sampling2d_55 (UpSampling (None, 64, 64, 32)        0         
_________________________________________________________________
conv2d_131 (Conv2D)          (None, 64, 64, 64)        18496     
_________________________________________________________________
conv2d_132 (Conv2D)          (None, 64, 64, 3)         1731      
_________________________________________________________________
up_sampling2d_56 (UpSampling (None, 128, 128, 3)       0         
=================================================================
Total params: 52,067
Trainable params: 52,067
Non-trainable params: 0
_________________________________________________________________
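
As a sanity check on the summary above, here is a minimal pure-Python sketch that recomputes the output shapes and parameter counts. It assumes 3x3 kernels, "same" padding, and 2x2 pooling/upsampling; none of these are stated in the post, but they are consistent with the reported numbers. Treating the output of conv2d_129 as the embedding is also an assumption.

```python
# Hypothetical reconstruction of the layer stack from the summary.
# Assumptions (not stated in the post): 3x3 kernels, "same" padding,
# 2x2 max pooling and 2x2 upsampling.

def conv2d_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels

# (layer kind, output channels) in the order shown in the summary
LAYERS = [
    ("conv", 64), ("pool", None),
    ("conv", 32), ("pool", None),
    ("conv", 16), ("pool", None),
    ("conv", 16), ("up", None),
    ("conv", 32), ("up", None),
    ("conv", 64),
    ("conv", 3), ("up", None),
]

def trace(height=128, width=128, channels=3):
    """Return one (kind, (h, w, c), params) tuple per layer."""
    rows = []
    for kind, out_channels in LAYERS:
        params = 0
        if kind == "conv":
            params = conv2d_params(channels, out_channels)
            channels = out_channels
        elif kind == "pool":
            height, width = height // 2, width // 2
        else:  # "up"
            height, width = height * 2, width * 2
        rows.append((kind, (height, width, channels), params))
    return rows

rows = trace()
total_params = sum(params for _, _, params in rows)

# Assumption: the output of conv2d_129 (index 6) is used as the embedding.
bottleneck_shape = rows[6][1]
embedding_size = bottleneck_shape[0] * bottleneck_shape[1] * bottleneck_shape[2]

for kind, shape, params in rows:
    print(f"{kind:5s} {str(shape):18s} {params}")
print("total params:", total_params)
print("flattened bottleneck size:", embedding_size)
```

Under these assumptions the per-layer counts and the total agree with the summary, and the flattened bottleneck has 16 * 16 * 16 values per image compared with 128 * 128 * 3 values in the input.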

Please suggest some improvements.
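
For reference, the embedding comparison step described above can be sketched roughly as follows. This is only an illustration in plain Python: the helper names `cosine_similarity` and `most_similar` are hypothetical, and the toy vectors are made-up stand-ins for flattened encoder outputs, not real data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero embeddings
    return dot / (norm_a * norm_b)

def most_similar(query, embeddings, top_k=3):
    """Return (index, score) pairs, highest cosine similarity first."""
    scored = [(i, cosine_similarity(query, e)) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 4-dimensional stand-ins for flattened encoder outputs (made-up values).
toy_embeddings = [
    [1.0, 0.0, 2.0, 0.0],
    [2.0, 0.0, 4.0, 0.0],  # same direction as the first vector
    [0.0, 3.0, 0.0, 1.0],  # orthogonal to the first vector
]

ranking = most_similar(toy_embeddings[0], toy_embeddings, top_k=3)
print(ranking)
```

Cosine similarity ignores vector magnitude, so the first two toy vectors score as near-identical even though one is twice the other; only the direction of the embedding matters for the ranking.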

Tags: deep-learning, computer-vision, conv-neural-network, autoencoder, unsupervised-learning

Solution

