Problem splitting data: KeyError: "None of [Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], dtype='int64')] are in the [columns]"

Problem description

I am trying to perform a train-test split on some data, wine.data, but the code fails when I initialize X and y:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

from sklearn.model_selection import cross_val_score

wine =  pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data")

print(wine.shape)
wine.head()
X = wine[np.arange(1,14)]
y = wine[0]

The rest of the code below this segment does not run, because I get this error message:

KeyError: "None of [Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], dtype='int64')] are in the [columns]"

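I tried to fix this by changing the range of values for X and by changing the np.arange call, but neither helped.

Any help or suggestions would be much appreciated, thanks!

Tags: python, pandas, numpy, train-test-split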

Solution


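You forgot to pass header=None to pd.read_csv. The CSV you are downloading has no header row, so if you do not specify header=None, the first row of data is used as the header. The column labels then become strings taken from that row rather than the integers 0 to 13, which is why none of the labels 1 to 13 are found in the columns.

Try: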

wine = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
    header=None
)
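
To see the difference without downloading anything, here is a minimal sketch. It assumes a tiny made-up headerless CSV (the SAMPLE string below is illustrative only and is not taken from wine.data) with the class label in the first column, as in the original file. It reads the same text with and without header=None and attempts the same integer column lookup in both cases.

```python
import io

import numpy as np
import pandas as pd

# Three made-up rows in a headerless layout (class label first).
# These values are illustrative assumptions, not real wine.data rows.
SAMPLE = "1,14.2,1.7,2.4\n1,13.2,1.8,2.1\n2,12.4,0.9,1.4\n"

# Default header handling: the first data row is consumed as column names,
# so the labels are strings and one row of data is lost.
without_header = pd.read_csv(io.StringIO(SAMPLE))
print(list(without_header.columns))
print(without_header.shape)

# Looking up integer labels now fails, as in the question.
try:
    without_header[np.arange(1, 4)]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print("integer lookup failed:", lookup_failed)

# header=None: pandas labels the columns with integers starting at 0,
# so the integer lookup from the question works.
with_header_none = pd.read_csv(io.StringIO(SAMPLE), header=None)
X = with_header_none[np.arange(1, 4)]  # feature columns 1..3
y = with_header_none[0]                # class label column
print(X.shape, y.tolist())
```

With the real file the same pattern applies, using np.arange(1, 14) for the 13 feature columns as in the question.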
