I am trying to run cross-validation below, but I get the following error

Problem description

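This is what I am trying to do: use ReLU, softmax, and cross-validation to help predict the trip type (y) from x (weekday, UPC, scan count, department description, and fine line number).

The data comes from Kaggle (https://www.kaggle.com/c/walmart-recruiting-trip-type-classification).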

import pandas as pd
from sklearn.model_selection import train_test_split

df = 'Documents/train.csv'
DataLabels = ["Trip_Type", "Visit_Number", "Weekday", "UPC", "Scan_Count",
              "Department_Description", "Fine_Line_number"]
data = pd.read_csv(df, header=None, names=DataLabels)

# Encode weekday names as integers 0-6
Weekday_mapping = {
    'Monday': 0,
    'Tuesday': 1,
    'Wednesday': 2,
    'Thursday': 3,
    'Friday': 4,
    'Saturday': 5,
    'Sunday': 6,
}

data['Weekday'] = data['Weekday'].map(Weekday_mapping)
data

# Note: x is a tuple of five Series here, not a single table
x = (data.Weekday, data.UPC, data.Scan_Count, data.Department_Description,
     data.Fine_Line_number)
y = data.Trip_Type
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.3,
                                                    random_state=0)


ValueError: Found input variables with inconsistent numbers of samples: [5, 647054]

Tags: python, python-3.x, jupyter-notebook, jupyter

Solution
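The error message reports sample counts of [5, 647054], which suggests that x is being read as 5 samples: separating the columns with commas builds a tuple of five Series, so scikit-learn sees its length as 5, while y has 647054 rows. One likely fix is to select the feature columns as a single two-dimensional DataFrame. The sketch below illustrates this on a tiny made-up frame standing in for train.csv; the values and the name feature_columns are assumptions for illustration, not part of the original question.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny made-up frame standing in for train.csv (values are illustrative only).
data = pd.DataFrame({
    "Trip_Type": [3, 5, 3, 7, 5, 3, 7, 5, 3, 7],
    "Weekday": [0, 1, 2, 3, 4, 5, 6, 0, 1, 2],
    "UPC": [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    "Scan_Count": [1, 2, 1, -1, 1, 3, 1, 1, 2, 1],
    "Department_Description": ["DAIRY", "PRODUCE", "DAIRY", "BAKERY", "PRODUCE",
                               "DAIRY", "BAKERY", "PRODUCE", "DAIRY", "BAKERY"],
    "Fine_Line_number": [10, 20, 10, 30, 20, 10, 30, 20, 10, 30],
})

# Hypothetical helper name: the feature columns used in the question.
feature_columns = ["Weekday", "UPC", "Scan_Count",
                   "Department_Description", "Fine_Line_number"]

# What the question does: a tuple of 5 Series, which has length 5.
x_tuple = (data.Weekday, data.UPC, data.Scan_Count,
           data.Department_Description, data.Fine_Line_number)
try:
    train_test_split(x_tuple, data.Trip_Type, test_size=0.3, random_state=0)
    tuple_split_failed = False
except ValueError as err:
    tuple_split_failed = True
    print("tuple input fails:", err)

# Suggested change: one DataFrame with one row per sample.
x = data[feature_columns]
y = data["Trip_Type"]

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)
```

With the real data, the same change would be x = data[["Weekday", "UPC", "Scan_Count", "Department_Description", "Fine_Line_number"]], leaving the train_test_split call as it is.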

