Keras multiple outputs: expected shape and got shape

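Problem description

I'm trying to train a model that predicts the 128-d vector used to recognize faces. The model's input is an image, and the output is the 128-d vector obtained from the "face_recognition" library (regression).

When I train with all 128 outputs, I get this error: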

ValueError: Error when checking target: expected dense_24 to have shape (1,) but got array with shape (128,)

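But when I try with only one output, the fit function works. The odd part is that the prediction has shape (1, 128), yet I can't pass 128-d targets for training.

Here is my model: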

from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense
from keras import models
import keras
def build_facereg_disc():
  # load model
  model = VGG16(include_top=False, input_shape=(64, 64, 3))
  # add new classifier layers
  flat1 = Flatten()(model.outputs)
  class1 = Dense(2048, activation='relu')(flat1)
  output = Dense(128, activation='relu')(class1)
  # define new model
  model = models.Model(inputs=model.inputs, outputs=output)
  # summarize
  return model

facereg_disc = build_facereg_disc()
facereg_disc.compile(optimizer=keras.optimizers.Adam(),  # Optimizer
              # Loss function to minimize
              loss=keras.losses.SparseCategoricalCrossentropy(),
              # List of metrics to monitor
              metrics=['binary_crossentropy'])

And the summary:

Model: "model_27"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_20 (InputLayer)        (None, 64, 64, 3)         0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 64, 64, 64)        1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 64, 64, 64)        36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 32, 32, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 32, 32, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 32, 32, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 16, 16, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 16, 16, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 16, 16, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 16, 16, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 8, 8, 256)         0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 8, 8, 512)         1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 8, 8, 512)         2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 8, 8, 512)         2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 4, 4, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 4, 4, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 4, 4, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 2, 2, 512)         0         
_________________________________________________________________
flatten_10 (Flatten)         (None, 2048)              0         
_________________________________________________________________
dense_23 (Dense)             (None, 2048)              4196352   
_________________________________________________________________
dense_24 (Dense)             (None, 128)               262272    
=================================================================
Total params: 19,173,312
Trainable params: 19,173,312
Non-trainable params: 0

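Here is the preprocessing:

import os
import numpy as np
from keras.preprocessing.image import load_img, img_to_array

dir_data      = "data_faces/img_align_celeba/"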
Ntrain        = 2000 
Ntest         = 100
nm_imgs       = np.sort(os.listdir(dir_data))
## name of the jpg files for training set
nm_imgs_train = nm_imgs[:Ntrain]
## name of the jpg files for the testing data
nm_imgs_test  = nm_imgs[Ntrain:Ntrain + Ntest]
img_shape     = (64, 64, 3)

def get_npdata(nm_imgs_train):
    X_train = []
    for i, myid in enumerate(nm_imgs_train):
        image = load_img(dir_data + "/" + myid,
                         target_size=img_shape[:2])
        image = img_to_array(image)/255.0
        X_train.append(image)
    X_train = np.array(X_train)
    return(X_train)

X_train = get_npdata(nm_imgs_train)
# X_train.shape == (2000, 64, 64, 3)
# y_train.shape == (2000, 128)

I sample batches like this:

idx = np.random.randint(0, X_train.shape[0], half_batch)
imgs = X_train[idx]
labels = y_train[idx]
reg_d_loss_real = facereg_disc.train_on_batch(imgs, labels)

Tags: keras, deep-learning

Solution


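Your problem comes from your loss function. As the documentation explains, SparseCategoricalCrossentropy expects y_true for each sample to be a single integer-encoded class, which is why Keras expects a target of shape (1,). CategoricalCrossentropy expects a one-hot encoded vector per sample, which matches the shape of your targets, (128,).

So switch to CategoricalCrossentropy and you should be fine.

However, to reproduce the error, I had to change: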
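
To make the difference between the two target formats concrete, here is a minimal plain-Python sketch. It is not the Keras implementation; the helper names, the uniform prediction, and the class id 5 are made up for illustration. It only shows that the sparse variant takes one integer per sample while the dense variant takes a 128-entry vector per sample.

```python
import math

NUM_CLASSES = 128  # matches the width of the final Dense layer


def one_hot(label, num_classes=NUM_CLASSES):
    """Turn an integer class id into a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def sparse_categorical_crossentropy(label, probs):
    """Target is a single integer class id: per-sample shape (1,)."""
    return -math.log(probs[label])


def categorical_crossentropy(target, probs):
    """Target has one entry per class: per-sample shape (128,)."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Hypothetical prediction: uniform probabilities over the 128 classes.
probs = [1.0 / NUM_CLASSES] * NUM_CLASSES
label = 5  # hypothetical integer-encoded class

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot(label), probs)
print(len(one_hot(label)), sparse_loss, dense_loss)
```

Both functions give the same loss for the same underlying class. What differs is the target shape, which is exactly what the ValueError in the question complains about.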

flat1 = Flatten()(model.outputs)

to:

flat1 = Flatten()(model.outputs[0])
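
Putting both changes together, the model from the question might look like the sketch below. The use of `weights=None` (to avoid downloading the ImageNet weights) and the variable name `base` are assumptions made for this example, not part of the original code.

```python
import keras
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model


def build_facereg_disc():
    # weights=None skips the ImageNet download; this is a choice made for
    # the sketch, the question uses the default pretrained weights.
    base = VGG16(include_top=False, weights=None, input_shape=(64, 64, 3))
    # base.outputs is a list of tensors, so take its single element.
    flat1 = Flatten()(base.outputs[0])
    class1 = Dense(2048, activation='relu')(flat1)
    output = Dense(128, activation='relu')(class1)
    return Model(inputs=base.inputs, outputs=output)


facereg_disc = build_facereg_disc()
facereg_disc.compile(
    optimizer=keras.optimizers.Adam(),
    # Accepts targets of shape (batch, 128), as suggested in the answer.
    loss=keras.losses.CategoricalCrossentropy(),
)
print(facereg_disc.output_shape)
```

Since the question describes the targets as real-valued 128-d encodings (regression) rather than one-hot labels, a regression loss such as `keras.losses.MeanSquaredError()` may be worth trying as well. That goes beyond what the answer proposes.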
