Hand gesture recognition with Keras, unable to verify validation accuracy

Problem description

I am trying to detect simple hand gestures (sign-language digits 0-4) from a live feed (OpenCV), and have trained my model on a dataset of 100x100 images. With the following configuration it reaches roughly 80-85% classification accuracy during validation:

# Imports needed by this snippet
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same', activation='relu', input_shape=(100, 100, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Dropout(.25))
model.add(Flatten())
model.add(Dense(512))

model.add(Dropout(.25))
model.add(Dense(512))

model.add(Dense(5, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer=Adam(learning_rate=0.001), metrics=["categorical_accuracy"])

model.summary()

history = model.fit(X_train, y_train, batch_size=32, epochs=20, validation_data=(X_test, y_test))
Epoch 1/20
26/26 [==============================] - 6s 226ms/step - loss: 612.0366 - categorical_accuracy: 0.3471 - val_loss: 4.6006 - val_categorical_accuracy: 0.4369
Epoch 2/20
26/26 [==============================] - 5s 206ms/step - loss: 1.9365 - categorical_accuracy: 0.5473 - val_loss: 1.0971 - val_categorical_accuracy: 0.5922
Epoch 3/20
26/26 [==============================] - 5s 188ms/step - loss: 0.6555 - categorical_accuracy: 0.7767 - val_loss: 0.7563 - val_categorical_accuracy: 0.7573
Epoch 4/20
26/26 [==============================] - 5s 190ms/step - loss: 0.3508 - categorical_accuracy: 0.8932 - val_loss: 0.7111 - val_categorical_accuracy: 0.8010
Epoch 5/20
26/26 [==============================] - 5s 185ms/step - loss: 0.1850 - categorical_accuracy: 0.9454 - val_loss: 0.7101 - val_categorical_accuracy: 0.8107
Epoch 6/20
26/26 [==============================] - 5s 183ms/step - loss: 0.1128 - categorical_accuracy: 0.9660 - val_loss: 0.7007 - val_categorical_accuracy: 0.8204
Epoch 7/20
26/26 [==============================] - 5s 184ms/step - loss: 0.0688 - categorical_accuracy: 0.9806 - val_loss: 0.7460 - val_categorical_accuracy: 0.8155
Epoch 8/20
26/26 [==============================] - 5s 192ms/step - loss: 0.0450 - categorical_accuracy: 0.9867 - val_loss: 0.7724 - val_categorical_accuracy: 0.8155
Epoch 9/20
26/26 [==============================] - 5s 197ms/step - loss: 0.0321 - categorical_accuracy: 0.9951 - val_loss: 0.7884 - val_categorical_accuracy: 0.8301
Epoch 10/20
26/26 [==============================] - 5s 200ms/step - loss: 0.0275 - categorical_accuracy: 0.9939 - val_loss: 0.8498 - val_categorical_accuracy: 0.8301
Epoch 11/20
26/26 [==============================] - 5s 197ms/step - loss: 0.0258 - categorical_accuracy: 0.9964 - val_loss: 0.7994 - val_categorical_accuracy: 0.8447
Epoch 12/20
26/26 [==============================] - 5s 186ms/step - loss: 0.0191 - categorical_accuracy: 0.9964 - val_loss: 0.8782 - val_categorical_accuracy: 0.8204
Epoch 13/20
26/26 [==============================] - 5s 181ms/step - loss: 0.0240 - categorical_accuracy: 0.9939 - val_loss: 0.8985 - val_categorical_accuracy: 0.8301
Epoch 14/20
26/26 [==============================] - 5s 199ms/step - loss: 0.0268 - categorical_accuracy: 0.9915 - val_loss: 0.9292 - val_categorical_accuracy: 0.8155
Epoch 15/20
26/26 [==============================] - 5s 197ms/step - loss: 0.0291 - categorical_accuracy: 0.9927 - val_loss: 0.8743 - val_categorical_accuracy: 0.8301
Epoch 16/20
26/26 [==============================] - 5s 200ms/step - loss: 0.0227 - categorical_accuracy: 0.9927 - val_loss: 0.8598 - val_categorical_accuracy: 0.8447
Epoch 17/20
26/26 [==============================] - 5s 200ms/step - loss: 0.0079 - categorical_accuracy: 1.0000 - val_loss: 0.8670 - val_categorical_accuracy: 0.8398
Epoch 18/20
26/26 [==============================] - 5s 187ms/step - loss: 0.0130 - categorical_accuracy: 0.9964 - val_loss: 0.8618 - val_categorical_accuracy: 0.8447
Epoch 19/20
26/26 [==============================] - 5s 183ms/step - loss: 0.0085 - categorical_accuracy: 0.9976 - val_loss: 0.8948 - val_categorical_accuracy: 0.8398
Epoch 20/20
26/26 [==============================] - 5s 193ms/step - loss: 0.0096 - categorical_accuracy: 0.9976 - val_loss: 0.8915 - val_categorical_accuracy: 0.8350

To avoid overfitting I added two dropout layers and also tried a lower epoch count, but either way I cannot successfully predict live input, or even images taken manually from the same validation set, with anything like the 80+% accuracy mentioned above.
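An early-stopping callback would be one way to cap the epoch count automatically rather than picking it by hand; a minimal sketch, assuming Keras's standard EarlyStopping (the patience value here is an illustrative choice, not from the original post):

# Sketch: stop training once val_loss stops improving and keep the best weights.
# patience=3 is an assumed value for demonstration.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
history = model.fit(X_train, y_train, batch_size=32, epochs=25,
                    validation_data=(X_test, y_test), callbacks=[early_stop])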

The data is preprocessed by applying Otsu's binarization to the grayscale-converted images and split with sklearn's train_test_split:

import numpy as np
from sklearn.model_selection import train_test_split

x_data = np.array(x_data, dtype='float32')
y_data = np.array(y_data)

# Convert integer labels to one-hot encoding
n_values = np.max(y_data) + 1
onehot = np.eye(n_values)[y_data]

X_train, X_test, y_train, y_test = train_test_split(x_data, onehot,
                                                    test_size=0.2,
                                                    random_state=1)
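The Otsu step itself is not shown above; a minimal sketch of what that per-image preprocessing might look like with OpenCV's standard threshold API (the file path is a placeholder assumption):

import cv2

# Hypothetical preprocessing for one image, per the description above:
# grayscale conversion followed by Otsu's binarization, resized to 100x100.
img = cv2.imread('hand_sample.jpg')    # placeholder path, not from the post
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
binary = cv2.resize(binary, (100, 100))    # match the 100x100 training size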

However, when I manually predict images from the validation set as a sanity check, the predicted probabilities are usually extremely low, and the highest one rarely matches the actual label (I use a dictionary for reverse lookup).

[Image: thresholded hand image]

import cv2
import numpy as np

img = cv2.imread('../sample.jpg', 0)    # read as single-channel grayscale

img_array = np.array(img)
img_array = img_array.reshape(-1, 100, 100, 1)    # note: the model above was trained on (100, 100, 3)
prediction = loaded_model(img_array)
prediction = np.argmax(prediction[0])
print(reverselookup[prediction])
tf.Tensor([[4.3481304e-08 4.1961679e-01 5.8038312e-01 1.4601602e-34 8.9533229e-15]], shape=(1, 5), dtype=float32)
three

This leads me to believe there is some problem with the preprocessing, since some of the validation data also differs in lighting and may not be thresholded optimally. Another issue when capturing images from the camera seems to be the difference in resolution (downscaled from 640x480). Since I am fairly new to ML, I would appreciate any pointers.
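For reference, one sketch of an inference pipeline that would at least match the training input shape exactly, assuming the training images really are 3-channel and that reverselookup maps indices 0-4 to digit names (both are assumptions; neither mapping is shown in the post):

import cv2
import numpy as np

# Hypothetical label map; the actual reverselookup is not shown in the post.
reverselookup = {0: 'zero', 1: 'one', 2: 'two', 3: 'three', 4: 'four'}

frame = cv2.imread('../sample.jpg')    # path taken from the snippet above
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
binary = cv2.resize(binary, (100, 100))    # downscale, e.g. from 640x480
rgb = np.stack([binary] * 3, axis=-1).astype('float32')    # replicate to 3 channels
batch = rgb.reshape(-1, 100, 100, 3)    # match the (100, 100, 3) input shape

prediction = loaded_model(batch)
print(reverselookup[int(np.argmax(prediction[0]))])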

Example of the (problematic?) background lighting:

[Image: background lighting]

Tags: python, tensorflow, opencv, machine-learning, keras
