Polynomial linear regression: where did I go wrong?

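Problem description

I am working on a course assignment, and this is one of the questions:

Write a function that fits a polynomial LinearRegression model on the training data X_train for degrees 0 through 9. For each model, compute the R² (coefficient of determination) regression score on the training data as well as on the test data, and return both of these arrays in a tuple.

This function should return a tuple of numpy arrays (r2_train, r2_test). Both arrays should have shape (10,).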

My code:

from sklearn.linear_model import LinearRegression

from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics.regression import r2_score


np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10


X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)
def answer_two():
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.metrics.regression import r2_score

    # Your code here

    def r2_traintest(deg):
        poly=PolynomialFeatures(deg)
        model=LinearRegression()
        X_f=poly.fit_transform(X_train.reshape(-1,1))

        a=model.fit(X_f,y_train)

        dee=a.predict(poly.fit_transform(X_train.reshape(-1,1)))

        deez=r2_score(dee,y_train)

        gin=a.predict(poly.transform(X_test.reshape(-1,1)))

        mint=r2_score(gin,y_test)

        return deez,mint

    lst=[]
    lsts=[]

    for x in range(0,10,1):
        lst.append(r2_traintest(x)[0])
        lsts.append(r2_traintest(x)[1])

    return (np.array(lst),np.array(lsts))

Unfortunately, this gives me a wrong answer. What am I missing? Please help.

Tags: python, machine-learning, scikit-learn, regression, supervised-learning

Solution

It looks like you are reversing the arguments to the r2_score function. The call must be r2_score(y_true, y_pred).

Here is my code:
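To see why the argument order matters, here is a minimal sketch with made-up toy values (the arrays below are hypothetical and are not taken from the assignment data). R² is computed as 1 - SS_res / SS_tot, and SS_tot is taken from whichever array is passed first, so swapping the arguments changes the score.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical toy values, chosen only to make the asymmetry visible.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([2.0, 2.0, 3.0, 3.0])

# Correct order: the total sum of squares comes from y_true.
# SS_res = 2, SS_tot = 5, so the score is 1 - 2/5 (about 0.6).
correct = r2_score(y_true, y_pred)

# Swapped order: the total sum of squares now comes from y_pred.
# SS_res = 2, SS_tot = 1, so the score is 1 - 2/1 (about -1.0).
swapped = r2_score(y_pred, y_true)

print(correct, swapped)
```

The residuals are identical in both calls; only the denominator changes, which is enough to turn a reasonable score into a negative one.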

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

def fit_poly(deg):
    poly = PolynomialFeatures(deg)
    model = LinearRegression()
    X_poly = poly.fit_transform(X_train.reshape(-1, 1))
    model.fit(X_poly, y_train)

    # Reuse the training features computed above instead of refitting the transformer
    y_pred_train = model.predict(X_poly)
    r2_train = r2_score(y_train, y_pred_train)

    y_pred_test = model.predict(poly.transform(X_test.reshape(-1, 1)))
    r2_test = r2_score(y_test, y_pred_test)

    return r2_train, r2_test


np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

lst=[]
lsts=[]

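# Use a separate loop variable so the data array x is not overwritten,
# and fit each degree only once
for deg in range(0,10,1):
    r2_train, r2_test = fit_poly(deg)
    lst.append(r2_train)
    lsts.append(r2_test)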

print(lst, lsts)

The result is:

[0.0, 0.4292457781234663, 0.45109980444082465, 0.5871995368779847, 0.9194194471769304, 0.97578641430682, 0.9901823324795082, 0.9935250927840416, 0.996375453877599, 0.9980370625664945]

[-0.4780864173714179, -0.45237104233936676, -0.0685698414991589, 0.005331052945740433, 0.7300494281871148, 0.8770830091614791, 0.9214093981415002, 0.9202150411139083, 0.6324795282222648, -0.645253216177847]

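With your code, the values sometimes come out wrong, because r2_score is not symmetric in its arguments.

By the way, the new version of your code is clearer; as you can see, I copied and pasted a lot of it :)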
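The assignment asks for a tuple of numpy arrays of shape (10,) rather than printed lists, so one possible way to package the same logic is sketched below. This is only a sketch under assumptions: the function name answer_two comes from the question, while the local names (X_train_poly, X_test_poly, deg) are illustrative, and the import path sklearn.metrics is used for r2_score.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the question.
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)


def answer_two():
    """Return (r2_train, r2_test) for polynomial degrees 0 through 9."""
    r2_train = np.zeros(10)
    r2_test = np.zeros(10)
    for deg in range(10):
        poly = PolynomialFeatures(deg)
        # Fit the feature expansion on the training data only,
        # then apply the same transformation to the test data.
        X_train_poly = poly.fit_transform(X_train.reshape(-1, 1))
        X_test_poly = poly.transform(X_test.reshape(-1, 1))
        model = LinearRegression().fit(X_train_poly, y_train)
        # r2_score expects (y_true, y_pred), in that order.
        r2_train[deg] = r2_score(y_train, model.predict(X_train_poly))
        r2_test[deg] = r2_score(y_test, model.predict(X_test_poly))
    return r2_train, r2_test


r2_train, r2_test = answer_two()
print(r2_train.shape, r2_test.shape)
```

Each degree is fitted once per loop iteration, and the two scores are written straight into preallocated arrays, so the return value already has the required shape.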

