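python - Python KeyError for the x_col value in flow_from_dataframe
Problem description
I have a dataset where the image files are provided separately and their labels come in a separate CSV file: the first column is the image file name and the second column is its label. My code is below.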
import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str)
train.head()
number;label
0 101.jpg;3
1 102.jpg;1
2 103.jpg;3
3 104.jpg;3
4 105.jpg;2
test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',dtype=str)
test.head()
number;label
0 201.jpg;3
1 202.jpg;3
2 203.jpg;1
3 204.jpg;3
4 205.jpg;3
train_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_train'
test_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_test'
import os
import numpy as np
import glob
import shutil
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Conv2D, Flatten, Dropout, MaxPooling2D, BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras import regularizers, optimizers
train_gen = ImageDataGenerator(rescale=1./255,
rotation_range=45,
width_shift_range=.15,
height_shift_range=.15,
horizontal_flip=True,
zoom_range=0.5)
test_gen = ImageDataGenerator(rescale=1./255)
train_data = train_gen.flow_from_dataframe(dataframe = train,
directory = train_folder,
x_col = 'number',
y_col = 'label',
seed = 42,
batch_size = 10,
shuffle = True,
class_mode='categorical',
target_size = (100, 100))
test_data = test_gen.flow_from_dataframe(dataframe = test,
directory = test_folder,
x_col = 'number',
y_col = None,
seed = 42,
batch_size = 10,
shuffle = False,
class_mode='categorical',
target_size = (100, 100))
This is the error message:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'number'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
-> 2900 raise KeyError(key) from err
2901
2902 if tolerance is not None:
KeyError: 'number'
I have no idea why this error occurs. Does anyone know what is going on here?
Solution
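You need to pass sep=';' (the CSV delimiter) to pd.read_csv. Its default sep value is ',', so it reads number;label as a single column instead of two separate columns. The DataFrame therefore has no column named 'number', which is what raises the KeyError in flow_from_dataframe.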
import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str, sep=';')
train.head()
test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',dtype=str, sep=';')
test.head()
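To see the difference without touching the real files, here is a minimal sketch that parses a small in-memory CSV both ways. The csv_text string is a hypothetical stand-in for label_train.csv, built from the rows shown in the question; the final check is only a suggested guard, not part of the original answer.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for label_train.csv (semicolon-delimited).
csv_text = "number;label\n101.jpg;3\n102.jpg;1\n103.jpg;3\n"

# With the default sep=',' no commas are found, so each line is one field.
wrong = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(list(wrong.columns))  # ['number;label']

# With sep=';' each line is split into the two expected columns.
right = pd.read_csv(io.StringIO(csv_text), dtype=str, sep=';')
print(list(right.columns))  # ['number', 'label']

# Cheap guard to run before calling flow_from_dataframe(x_col='number', y_col='label').
missing = {'number', 'label'} - set(right.columns)
assert not missing, f"missing columns: {missing}"
```

Printing the column list right after read_csv is usually the quickest way to confirm that x_col and y_col will resolve before building the generators.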