Error "Shape of passed values is (8708, 27), indices imply (8708, 4)" when applying OneHotEncoder to a DataFrame

Problem description

I am practicing OneHotEncoder on the DataFrame sampled below:

datetime        season  holiday  workingday  weather             temp   atemp   humidity  windspeed  Total_booking  Hour  weekday    Month      date
5/2/2012 19:00  Summer  0        1           Clear + Few clouds  22.14  25.76   77        16.9979    504            19    Wednesday  May        5/2/2012
9/5/2012 4:00   Fall    0        1           Clear + Few clouds  28.7   33.335  79        19.0012    5              4     Wednesday  September  9/5/2012

Code:

'df' is the DataFrame sampled above. "categoryVariableList" is the list of columns in the DataFrame (df) that should be passed to OneHotEncoder.

categoryVariableList = ["weekday","Month","season","weather"]

ohe = OneHotEncoder(categories='auto')
feature_arr = ohe.fit_transform(df[categoryVariableList]).toarray()
feature_labels = ohe.categories_

feature_labels = np.array(feature_labels).ravel()

features = pd.DataFrame(feature_arr, columns=feature_labels)
features

The output I get is:

ValueError: Wrong number of items passed 27, placement implies 4
.....
Shape of passed values is (8708, 27), indices imply (8708, 4)

What is going wrong here? Any advice would be appreciated.

Tags: python, pandas, dataframe

Solution


Perhaps you could use a label encoder instead:

le = LabelEncoder()  # assumes: from sklearn.preprocessing import LabelEncoder
df_x[categorical_cols_x] = df_x[categorical_cols_x].apply(lambda col: le.fit_transform(col))
df_x[categorical_cols_x]
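Note that label encoding produces one integer code per column rather than one-hot columns. If one-hot output is still wanted, the shape mismatch in the question most likely comes from the label step: ohe.categories_ is a list with one array per input column (4 here), so np.array(...).ravel() does not flatten it into the 27 per-category labels that the 27 encoded columns need. A minimal sketch of one possible fix follows; it assumes a small toy DataFrame (illustrative values, not the asker's real data) and swaps in np.concatenate to join the per-column category arrays.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the question's df (illustrative values only).
df = pd.DataFrame({
    "weekday": ["Wednesday", "Wednesday", "Friday"],
    "Month": ["May", "September", "May"],
    "season": ["Summer", "Fall", "Summer"],
    "weather": ["Clear + Few clouds", "Clear + Few clouds", "Mist"],
})
categoryVariableList = ["weekday", "Month", "season", "weather"]

ohe = OneHotEncoder(categories="auto")
feature_arr = ohe.fit_transform(df[categoryVariableList]).toarray()

# ohe.categories_ holds one array of categories per input column.
# Concatenating them gives exactly one label per encoded output column.
feature_labels = np.concatenate(ohe.categories_)

# Reuse df's index so the encoded frame lines up with the original rows.
features = pd.DataFrame(feature_arr, columns=feature_labels, index=df.index)
print(features.shape)
```

With this approach, labels can collide if two columns share a category value. In scikit-learn 1.0 and later, ohe.get_feature_names_out(categoryVariableList) returns prefixed names such as weekday_Friday, which avoids that ambiguity.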

