ValueError: Found input variables with inconsistent numbers of samples: [1319, 245]

Problem description

I am running into a problem with train_test_split:

final = []
final.append(dataset)
final.append(dataset1)
X = dataset[:,0:2]
y = dataset1[:,2]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

Error:

Traceback (most recent call last):

  File "C:\Users\Lenovo\anaconda3\thesis code\TC_code.py", line 73, in <module>
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

  File "C:\Users\Lenovo\anaconda3\lib\site-packages\sklearn\model_selection\_split.py", line 2172, in train_test_split
    arrays = indexable(*arrays)

  File "C:\Users\Lenovo\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 299, in indexable
    check_consistent_length(*result)

  File "C:\Users\Lenovo\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 262, in check_consistent_length
    raise ValueError("Found input variables with inconsistent numbers of"

ValueError: Found input variables with inconsistent numbers of samples: [1319, 245]

Tags: python, scikit-learn, train-test-split

Solution

Check the shapes of X and y. They must have the same number of rows.

print(X.shape)
print(y.shape)

if X.shape[0] != y.shape[0]:
  print("X and y rows are mismatched, check dataset again")
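
The two numbers in the error message are the row counts of X and y. As a minimal reproduction, the sketch below uses synthetic arrays; the shapes (1319 and 245 rows, 3 columns each) are assumptions chosen to match the error message, not the asker's real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the asker's data: only the row counts
# (1319 and 245) are taken from the error message.
dataset = np.zeros((1319, 3))
dataset1 = np.ones((245, 3))

X = dataset[:, 0:2]   # features from the first array: 1319 rows
y = dataset1[:, 2]    # target from the second array: 245 rows

try:
    train_test_split(X, y, test_size=0.15, random_state=42)
    error_message = None
except ValueError as exc:
    # scikit-learn rejects inputs whose first dimensions differ
    error_message = str(exc)

print(X.shape, y.shape)
print(error_message)
```

The first number in the list is the length of X and the second is the length of y, in the order they were passed to train_test_split.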

Note:

  • X and y must have the same number of rows

You took X from dataset and y from dataset1, which is most likely the main error in your code.

See here:

X = dataset[:,0:2]
y = dataset1[:,2]

dataset and dataset1 are two different data frames, and they may hold two different sets of data, so their row counts do not have to match.
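
If the features and the target are meant to come from the same table, one possible fix is to slice both from the same array. This is only a sketch: it assumes that columns 0 and 1 of dataset are the features and column 2 is the target, and the random data here is a hypothetical placeholder.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 1319 rows, columns 0-1 are features, column 2 is the target.
rng = np.random.default_rng(0)
dataset = rng.random((1319, 3))

# Take X and y from the SAME array so their row counts always match.
X = dataset[:, 0:2]
y = dataset[:, 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42
)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

If dataset and dataset1 are instead meant to be combined into one set of samples, they would need to be stacked row-wise first (for NumPy arrays, for example with np.vstack) and then sliced into X and y from the combined array.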