sklearn OneHotEncoder shape error

Problem description

I have an array:

y_train: array([ 0,  0,  0, -1, 1, 0, -1, 0, ..., -1, 0, 1], dtype=int64)

I did this:

enc = OneHotEncoder()
y_train = enc.fit_transform(y_train.reshape(1,-1))

The result is:

(0, 0)  1.0
(0, 1)  1.0
(0, 2)  1.0
(0, 3)  1.0
(0, 4)  1.0
(0, 5)  1.0

But what I actually want is a one-hot encoding, like this:

[1,0,0]
[1,0,0]
[0,1,0]
[0,0,1]
.....

How can I fix this?

Tags: python, scikit-learn, one-hot-encoding

Solution


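The shape is the problem: reshape(1, -1) produces a single row, so the encoder treats the array as one sample with many features, each of which has only one category. Reshape y_train to a single column with reshape(-1, 1) so that each value is its own sample, then call toarray() on the result of fit_transform to convert the sparse matrix to a dense array: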

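from sklearn import preprocessing
import numpy as np

# Reshape to a single column: one sample per row, one feature.
y_train = np.array([0, 0, 0, -1, 1, 0, -1, 0, -1, 0, 1]).reshape(-1, 1)
enc = preprocessing.OneHotEncoder()
# fit_transform returns a sparse matrix; toarray() makes it dense.
y_train = enc.fit_transform(y_train).toarray()
print(y_train)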

You will get this output:

[[0. 1. 0.]
 [0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
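
To see why the reshape direction matters, the following minimal sketch compares the two shapes on the same data. The variable names labels, wide and tall are chosen here for illustration and are not from the original post. It also prints enc.categories_, which shows the sorted category values that determine the column order of the encoded output.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Same values as in the question (illustrative variable name).
labels = np.array([0, 0, 0, -1, 1, 0, -1, 0, -1, 0, 1])

# One row, 11 columns: the encoder sees 1 sample with 11 features,
# and each feature has exactly one category.
wide = OneHotEncoder().fit_transform(labels.reshape(1, -1))

# 11 rows, one column: the encoder sees 11 samples of a single feature.
enc = OneHotEncoder()
tall = enc.fit_transform(labels.reshape(-1, 1))

print(wide.shape)          # shape of the mis-shaped encoding
print(tall.shape)          # shape of the intended encoding
print(enc.categories_[0])  # category values, in output column order
```

Because the categories are sorted, the value -1 maps to the first column, 0 to the second and 1 to the third, which is why 0 appears as [0. 1. 0.] in the output above.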
