sklearn train_test_split error: Found input variables with inconsistent numbers of samples

Question

x = dummiesfinal.iloc[9]
y = dummiesfinal.iloc[0:8,10:47]
x = np.array(x).T.reshape(-1,1)
y = np.array(y)
np.shape(x)
(48, 1)
np.shape(y)
(8, 37)
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.30)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-138-90e4256c22db> in <module>
      1 from sklearn.model_selection import train_test_split
      2 from matplotlib import pyplot as plt
----> 3 x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.30)

~\anaconda3\lib\site-packages\sklearn\model_selection\_split.py in train_test_split(*arrays, **options)
   2125         raise TypeError("Invalid parameters passed: %s" % str(options))
   2126 
-> 2127     arrays = indexable(*arrays)
   2128 
   2129     n_samples = _num_samples(arrays[0])

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in indexable(*iterables)
    291     """
    292     result = [_make_indexable(X) for X in iterables]
--> 293     check_consistent_length(*result)
    294     return result
    295 

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
    254     uniques = np.unique(lengths)
    255     if len(uniques) > 1:
--> 256         raise ValueError("Found input variables with inconsistent numbers of"
    257                          " samples: %r" % [int(l) for l in lengths])
    258 

ValueError: Found input variables with inconsistent numbers of samples: [48, 8]

I am fitting a logistic regression model, but I am stuck at this step. Any suggestions on how to fix this error would be appreciated.

If there is a better way to do this, please let me know. I have been stuck on this same code for a long time.

Thank you.

Tags: python, arrays, reshape, logistic-regression, shapes

Solution
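
The error message says that x has 48 samples while y has 8. train_test_split requires every array it receives to have the same length along the first axis. In the question, dummiesfinal.iloc[9] selects row 9 (one value per column, 48 in total), and reshape(-1, 1) then turns those 48 values into 48 one-feature "samples", whereas y covers only rows 0 to 7. The sketch below reproduces the shapes and shows one possible fix. It assumes the intent was to use column 9 as the single feature for the same eight rows as y; that intent is a guess, and the DataFrame here is a random stand-in because the real data is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the asker's `dummiesfinal` (20 rows x 48 columns);
# the real data is not shown in the question.
rng = np.random.default_rng(0)
dummiesfinal = pd.DataFrame(rng.integers(0, 2, size=(20, 48)))

# As in the question: iloc[9] picks ROW 9, so reshaping gives 48 "samples".
x_bad = np.array(dummiesfinal.iloc[9]).reshape(-1, 1)
y = np.array(dummiesfinal.iloc[0:8, 10:47])
print(x_bad.shape, y.shape)  # (48, 1) (8, 37)

# Assumed intent: column 9 as the feature, restricted to the same rows 0-7 as y.
x = np.array(dummiesfinal.iloc[0:8, 9]).reshape(-1, 1)
print(x.shape)  # (8, 1)

# Both arrays now have 8 rows, so the split no longer raises ValueError.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.30, random_state=0
)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
```

With matching row counts the split succeeds. Note also that scikit-learn's LogisticRegression.fit expects y to be one-dimensional, with shape (n_samples,), so a 37-column y would most likely need to be reduced to a single label column before fitting.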

