How to import EMNIST Letters from a file into Keras

Problem description

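I am trying to import the EMNIST Letters dataset into an AI program I wrote in Python, but I cannot get it to work correctly. How should I import it into the following program?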

...
# Import Statements
...


emnist = spio.loadmat("EMNIST/emnist-letters.mat")
...

# The problems appear to originate below--I am trying to set these variables to the corresponding parts of the EMNIST dataset and cannot succeed

x_train = emnist["dataset"][0][0][0][0][0][0]
x_train = x_train.astype(np.float32)

y_train = emnist["dataset"][0][0][0][0][0][1]

x_test = emnist["dataset"][0][0][1][0][0][0]
x_test = x_test.astype(np.float32)

y_test = emnist["dataset"][0][0][1][0][0][1]

train_labels = y_train
test_labels = y_test

x_train /= 255
x_test /= 255

x_train = x_train.reshape(x_train.shape[0], 1, 28, 28, order="A")
x_test = x_test.reshape(x_test.shape[0], 1, 28, 28, order="A")

y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Does not work:
plt.imshow(x_train[54000][0], cmap='gray')
plt.show()

# Compilation and Fitting
...

I did not expect any error message at all, but I received:

Traceback (most recent call last):
  File "OCIR_EMNIST.py", line 61, in <module>
    y_train = keras.utils.to_categorical(y_train, 10)
  File "/home/user/.local/lib/python3.7/site-packages/keras/utils/np_utils.py", line 34, in to_categorical
    categorical[np.arange(n), y] = 1
IndexError: index 23 is out of bounds for axis 1 with size 10

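Edit: The MNIST dataset is not suitable for this project because it does not contain handwritten letters; it contains only handwritten digits.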

Tags: python, tensorflow, machine-learning, keras, mnist

Solution


You may want to take a look at: https://github.com/christianversloot/extra_keras_datasets

It is not a popular library (at the time of writing) and I have not tried it myself, but it looks easy to use and is well documented.

To load the EMNIST dataset with it, you call it the same way you would a built-in Keras dataset:

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='balanced')
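
Separately from how the file is loaded, the traceback in the question points at the label encoding: EMNIST Letters has 26 classes, labelled 1 to 26, so to_categorical(y_train, 10) fails as soon as a label is larger than 9 (here, 23). Below is a minimal sketch of one way around that, using plain NumPy so it runs without Keras. The helper name one_hot_letters and the tiny demo array are illustrative assumptions, not part of Keras or of the question's code.

```python
import numpy as np


def one_hot_letters(labels, num_classes=26):
    """One-hot encode EMNIST Letters labels, assumed to run from 1 to num_classes."""
    # loadmat typically yields labels with shape (n, 1); flatten to (n,).
    labels = np.asarray(labels).ravel().astype(np.int64)
    if labels.min() < 1 or labels.max() > num_classes:
        raise ValueError("labels are expected to lie in 1..num_classes")
    # Shift to 0-based indices, then select rows of the identity matrix.
    return np.eye(num_classes, dtype=np.float32)[labels - 1]


# Hypothetical stand-in for emnist["dataset"][0][0][0][0][0][1]; not real data.
y_demo = np.array([[1], [23], [26]], dtype=np.uint8)
y_onehot = one_hot_letters(y_demo)
print(y_onehot.shape)  # (3, 26)
```

In the question's program, the same idea would presumably be keras.utils.to_categorical(y_train - 1, 26) and likewise for y_test, with the network's output layer sized to 26 units instead of 10.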
