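python - Keras ImageDataGenerator flow_from_dataframe returns KeyError
Problem description
I am trying to build an image classifier with keras, and the size of my dataset requires me to use the ImageDataGenerator class and its flow_from_dataframe method. This is the code I am using.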
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
train_generator = train_datagen.flow_from_dataframe(
    directory='stage_1_train_images/',
    dataframe=box.drop(labels=['patientId'], axis=1).replace(to_replace=float('nan'),value=0),
    target_size=(1024, 1024))
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='linear',input_shape=(28,28,1),padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D((2, 2),padding='same'))
model.add(Conv2D(64, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(128, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Flatten())
model.add(Dense(128, activation='linear'))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(lr=1000,decay=.99),
              metrics=['accuracy'])
model.fit_generator(trainGen, steps_per_epoch=1024/16, epochs=317)
However, when I run this code, I get the following error:
KeyError Traceback (most recent call last)
<ipython-input-7-5a88afda8de5> in <module>
7 directory='stage_1_train_images/',
8 dataframe=box.drop(labels=['patientId'], axis=1).replace(to_replace=float('nan'),value=0),
----> 9 target_size=(1024, 1024))
10 model = Sequential()
11 model.add(Conv2D(32, kernel_size=(3, 3),activation='linear',input_shape=(28,28,1),padding='same'))
/opt/conda/lib/python3.6/site-packages/keras_preprocessing/image.py in flow_from_dataframe(self, dataframe, directory, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format, subset, interpolation)
1105 save_format=save_format,
1106 subset=subset,
-> 1107 interpolation=interpolation)
1108
1109 def standardize(self, x):
/opt/conda/lib/python3.6/site-packages/keras_preprocessing/image.py in __init__(self, dataframe, directory, image_data_generator, x_col, y_col, has_ext, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, data_format, save_to_dir, save_prefix, save_format, follow_links, subset, interpolation, dtype)
2056 raise ValueError("has_ext must be either True if filenames in"
2057 " x_col has extensions,else False.")
-> 2058 self.df = dataframe.drop_duplicates(x_col)
2059 self.df[x_col] = self.df[x_col].astype(str)
2060 self.directory = directory
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in drop_duplicates(self, subset, keep, inplace)
4329 """
4330 inplace = validate_bool_kwarg(inplace, 'inplace')
-> 4331 duplicated = self.duplicated(subset, keep=keep)
4332
4333 if inplace:
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in duplicated(self, subset, keep)
4379 diff = Index(subset).difference(self.columns)
4380 if not diff.empty:
-> 4381 raise KeyError(diff)
4382
4383 vals = (col.values for name, col in self.iteritems()
KeyError: Index(['filename'], dtype='object')
What is going wrong? I have tried several ways to fix this, but I cannot figure out why it happens.
Solution
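According to the documentation, you need to pass x_col and y_col as arguments to the flow_from_dataframe method. The default values of x_col and y_col are "filename" and "class", respectively. From the error, I would guess that your DataFrame has no column named "filename", and that is what raises the KeyError. To fix this, specify the following two arguments in the call to flow_from_dataframe.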
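x_col: string, the column in the dataframe that contains the filenames of the target images.
y_col: string or list of strings, the column(s) in the dataframe that will be the target data.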
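As a rough illustration of the fix, here is a minimal sketch that checks the column names before calling flow_from_dataframe. The helper names (missing_columns, make_generator) and the column names in box_columns are assumptions made for this example; they are not part of Keras or of the original question, so substitute whatever columns your DataFrame actually has.

```python
def missing_columns(columns, x_col="filename", y_col="class"):
    """Return the requested column names that are absent from `columns`.

    The defaults mirror the flow_from_dataframe defaults quoted in the answer.
    """
    wanted = [x_col] + ([y_col] if isinstance(y_col, str) else list(y_col))
    present = set(columns)
    return [name for name in wanted if name not in present]


def make_generator(datagen, df, image_dir, x_col, y_col):
    """Build the generator only after confirming the columns exist.

    `datagen` is assumed to be a keras ImageDataGenerator and `df` a pandas
    DataFrame; neither is imported here, so the check stays dependency-free.
    """
    missing = missing_columns(df.columns, x_col, y_col)
    if missing:
        raise KeyError(f"DataFrame has no column(s) {missing}")
    return datagen.flow_from_dataframe(
        dataframe=df,
        directory=image_dir,
        x_col=x_col,
        y_col=y_col,
        target_size=(1024, 1024),
    )


# Hypothetical column names for the asker's `box` frame (an assumption).
box_columns = ["patientId", "x", "y", "width", "height", "Target"]

# With the defaults, neither column is found, which matches the KeyError.
default_missing = missing_columns(box_columns)
# Naming columns that exist leaves nothing missing.
explicit_missing = missing_columns(box_columns, x_col="patientId", y_col="Target")
print(default_missing)
print(explicit_missing)
```

One thing to check in the question's code: it drops the patientId column before passing the frame to the generator. If that column is the one holding the image names, it has to stay in the DataFrame so it can be used as x_col. The traceback also shows a has_ext parameter in this keras_preprocessing version, whose error message says it must be True when the values in x_col include file extensions and False otherwise.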