Multiple regression: reshaping the input with multiple independent variables

Problem description

I am running a multiple regression on my data, but plotting the data raises an error: ValueError: x and y must be the same size.

x.shape is (10000, 2), since I have two independent variables: x = dataset[['green', 'blue']]

y.shape is (10000,)

How do I reshape the array, given that I have two independent variables in x?

Code:

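import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

dataset = pd.read_csv('colors.csv')

x = dataset[['green', 'blue']]  # independent variables
y = dataset['value']            # dependent variable

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=100)

mlr = LinearRegression()
mlr.fit(x_train, y_train)

print("Intercept: ", mlr.intercept_)
print("Coefficients:")
print(list(zip(x.columns, mlr.coef_)))  # one coefficient per column

# Prediction of test set
y_pred_mlr = mlr.predict(x_test)
# Predicted values
print("input test set", x_test)
print("Prediction for test set: {}".format(y_pred_mlr))

mlr_diff = pd.DataFrame({'Actual value': y_test, 'Predicted value': y_pred_mlr})

# The next line raises ValueError: x and y must be the same size,
# because x_train has two columns and y_train has one
plt.scatter(x_train, y_train, color='g')
plt.plot(x_train, mlr.predict(x_train), color='k')

plt.show()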

Thanks

Tags: python, numpy, matplotlib, scikit-learn, regression

Solution


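For this problem, you need to scatter each dimension (each independent variable) in its own plot. plt.scatter expects x and y to contain the same number of values, and x_train has two columns, so it holds twice as many values as y_train.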
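The size mismatch behind the error can be checked without any plotting. The following is a minimal sketch that uses random NumPy arrays as stand-ins for the question's columns; the values are made up and only the shapes match the ones reported in the question.

```python
import numpy as np

# Synthetic stand-ins for the question's data (assumption: random values,
# only the shapes are taken from the question).
rng = np.random.default_rng(0)
x = rng.random((10000, 2))   # two independent variables, like dataset[['green', 'blue']]
y = rng.random(10000)        # one target, like dataset['value']

# scatter needs one y value per x value, but x holds twice as many numbers.
print(x.size, y.size)        # 20000 10000

# Selecting a single column gives a 1-D array that matches y.
green = x[:, 0]
print(green.shape == y.shape)  # True
```

So no reshape of x is needed for the regression itself; only the plotting call has to receive one column at a time.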

Maybe this code can help you:

color = ['g', 'b']
n_features = x_train.shape[1]
fig = plt.figure(figsize=(15, 5))
# One subplot per independent variable (column of x_train)
for i in range(n_features):
    ax = fig.add_subplot(1, n_features, i + 1)
    ax.scatter(x_train.iloc[:, i], y_train, color=color[i])
    ax.set_xlabel(x_train.columns[i], fontsize=18)
    ax.set_ylabel("Y", rotation=0, fontsize=18)
plt.show()

The output looks like this: [image]
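The second plotting call in the question, plt.plot(x_train, mlr.predict(x_train)), has a related difficulty: with two independent variables the fitted model is a plane, not a line over a single x axis. One common alternative, offered here as a suggestion and not as part of the original answer, is to plot predicted values against actual values. The sketch below is self-contained and uses synthetic data in place of colors.csv; the coefficients 3.0 and -2.0 and the noise level are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for colors.csv (assumption: made-up values).
rng = np.random.default_rng(0)
x = rng.random((1000, 2))  # columns play the role of 'green' and 'blue'
y = 3.0 * x[:, 0] - 2.0 * x[:, 1] + rng.normal(scale=0.1, size=1000)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=100
)

mlr = LinearRegression()
mlr.fit(x_train, y_train)
y_pred = mlr.predict(x_test)

# Both arrays are 1-D with one entry per test row, so scatter accepts them.
fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(y_test, y_pred, color='g', s=10)

# Reference line y = x: points lying on it are predicted exactly.
lo, hi = y_test.min(), y_test.max()
ax.plot([lo, hi], [lo, hi], color='k')
ax.set_xlabel("Actual value")
ax.set_ylabel("Predicted value")
plt.show()
```

The closer the points lie to the black line, the better the fit, and the plot works for any number of independent variables.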

