How to fit data in sklearn

Problem description

I want to write code that reads a CSV file and then uses linear regression to make predictions.
The CSV file looks like this:

math physics
17 15
16 12
18 19

I tried this code:

import pandas as pd
from sklearn.linear_model import LinearRegression

score_file = pd.read_csv('scores.csv')
math_score = score_file['math']
physic_score = score_file['physics']
cls = LinearRegression().fit(math_score,physic_score)

But it gives me this error:

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Tags: python, pandas, scikit-learn

Solution


import pandas as pd
from sklearn.linear_model import LinearRegression

score_file = pd.read_csv('scores.csv')
# score_file = pd.DataFrame.from_dict(
#     {'math': [17, 16, 18], 
#      'physics': [15, 12, 19]})

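# scikit-learn expects a 2-D feature matrix, so select the column with
# double brackets to keep it a DataFrame instead of a 1-D Series
math_score = score_file[['math']]
physic_score = score_file['physics']

print(math_score.shape)    # (3, 1)
print(physic_score.shape)  # (3,)

# fit on the math column only; do not pass the target in as a feature
cls = LinearRegression().fit(math_score, physic_score)

# ideally predict on a held-out test subset rather than the training data
predictions = cls.predict(math_score)

print(predictions)  # approximately [15.33 11.83 18.83]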
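
The error message suggests a second route: keep the single column as a 1-D array and reshape it to one feature per row with reshape(-1, 1). The snippet below is only a sketch of that idea. It builds the three rows from the question inline instead of reading scores.csv, and the names X, y, model, new_math and new_physics are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Inline stand-in for scores.csv (the three rows shown in the question)
score_file = pd.DataFrame({"math": [17, 16, 18], "physics": [15, 12, 19]})

# A single column is a 1-D Series of shape (3,); reshape(-1, 1) turns it
# into a 2-D array of shape (3, 1): three samples, one feature
X = score_file["math"].to_numpy().reshape(-1, 1)
y = score_file["physics"].to_numpy()

model = LinearRegression().fit(X, y)

# New inputs must be 2-D as well, even for a single math score
new_math = np.array([19]).reshape(-1, 1)
new_physics = model.predict(new_math)

print(X.shape)            # (3, 1)
print(model.coef_)        # slope, about 3.5 for these three rows
print(new_physics)        # about 22.33
```

With only three rows the fitted line is not meaningful; the point is just the shape of the arrays passed to fit and predict.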
