Python KeyError for the x_col value in flow_from_dataframe

Problem description

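I have a dataset in which the image files are provided separately and their labels are given in a separate CSV file: the first column holds the image file name and the second column holds its label. My code is below.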

import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str)
train.head()

number;label
0   101.jpg;3
1   102.jpg;1
2   103.jpg;3
3   104.jpg;3
4   105.jpg;2

test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',dtype=str)
test.head()

number;label
0   201.jpg;3
1   202.jpg;3
2   203.jpg;1
3   204.jpg;3
4   205.jpg;3

train_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_train'
test_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_test'

import os
import numpy as np
import glob
import shutil
import matplotlib.pyplot as plt
import tensorflow as tf

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Conv2D, Flatten, Dropout, MaxPooling2D, BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import regularizers, optimizers

train_gen = ImageDataGenerator(rescale=1./255, 
                               rotation_range=45, 
                               width_shift_range=.15, 
                               height_shift_range=.15, 
                               horizontal_flip=True, 
                               zoom_range=0.5)
test_gen = ImageDataGenerator(rescale=1./255)

train_data = train_gen.flow_from_dataframe(dataframe = train, 
                                           directory = train_folder, 
                                           x_col = 'number', 
                                           y_col = 'label', 
                                           seed = 42, 
                                           batch_size = 10, 
                                           shuffle = True, 
                                           class_mode='categorical',
                                           target_size = (100, 100))

test_data = test_gen.flow_from_dataframe(dataframe = test, 
                                         directory = test_folder, 
                                         x_col = 'number', 
                                         y_col = None, 
                                         seed = 42, 
                                         batch_size = 10, 
                                         shuffle = False, 
                                         class_mode='categorical', 
                                         target_size = (100, 100))

This is the error message:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897             try:
-> 2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'number'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:
-> 2900                 raise KeyError(key) from err
   2901 
   2902         if tolerance is not None:

KeyError: 'number'

I have no idea why this error occurs. Does anyone know what is going on here?

Tags: python, dataframe, tensorflow, keyerror, image-classification

Solution


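You need to pass sep=';' (the CSV delimiter) to the pd.read_csv call. The default value of sep is ',', so pandas reads number;label as a single column rather than two separate columns. The DataFrame therefore has no column named number, which is why flow_from_dataframe raises KeyError: 'number'.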

import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str, sep=';')
train.head()

test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',dtype=str, sep=';')
test.head()
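
To see the effect of the delimiter without touching the real files, here is a minimal sketch. It assumes a small in-memory string standing in for label_train.csv; the sample rows and the check_columns helper are hypothetical and not part of the original code.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for label_train.csv (semicolon-delimited).
csv_text = "number;label\n101.jpg;3\n102.jpg;1\n103.jpg;3\n"

# With the default sep=',' no comma is found, so each line is one field.
wrong = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(list(wrong.columns))  # ['number;label']

# With sep=';' the header splits into the two expected columns.
right = pd.read_csv(io.StringIO(csv_text), dtype=str, sep=';')
print(list(right.columns))  # ['number', 'label']


def check_columns(df, required):
    """Raise a readable error if any required column is missing (hypothetical helper)."""
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns {missing}; found {list(df.columns)}")


# Passes silently for the correctly parsed frame.
check_columns(right, ['number', 'label'])
```

Running a check like this before calling flow_from_dataframe would likely surface the parsing problem with a clearer message than the KeyError raised inside pandas.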
