TypeError: unhashable type: 'numpy.ndarray' when I try to plot the results of a linear regression

Problem description

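I'm trying to do some simple linear regression. I have a dataset of books that includes the title, publication date, author name, and so on. I'm trying to predict a book's average Goodreads rating from its publication year. I used the following code: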

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  
import seaborn as seabornInstance 
from sklearn.model_selection import train_test_split 
from sklearn.linear_model import LinearRegression
from sklearn import metrics

df = pd.read_csv('books.csv', error_bad_lines= False, warn_bad_lines = False, encoding= 'utf-8')
df.columns = ['BookID', 'Title','Authors', 'Average_Rating', 'ISBN', 'ISBN13', 'Language', 'Pages', 'Ratings_Count', 'Text_Reviews_Count', 'Publication_Date', 'Publisher']

df.drop(columns = ["ISBN", "ISBN13"], inplace = True)
df.loc[df.Language == "en-US", "Language"] = "eng"
df.loc[df.Language == "en-GB", "Language"] = "eng"
df.loc[df.Language == "en-CA", "Language"] = "eng"
df.loc[df.Language != "eng", "Language"] = "other"

temp = df["Publication_Date"].str.split("/", n = 2, expand = True)
df["Publication_Month"] = temp[1]
df["Publication_Year"] = temp[2]

X = df["Publication_Year"].values.reshape(-1,1)
y = df["Average_Rating"].values.reshape(-1,1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

regressor = LinearRegression()  
regressor.fit(X_train, y_train)
print(regressor.intercept_)
print(regressor.coef_)

y_pred = regressor.predict(X_test)

I had no problems until I tried to plot my predictions against the actual values on the training set. I keep getting TypeError: unhashable type: 'numpy.ndarray'.

plt.plot(X_test, y_pred, label = "Predictions", color='red', linewidth=2)
plt.plot(X_test, y_test, label = "Actual Values", color='blue', linewidth=2)
plt.xlabel('Year of publication')
plt.ylabel('Avg. Rating')  
plt.show()

How can I fix this?

Tags: python, pandas, matplotlib, linear-regression

Solution
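
A likely cause, offered as a sketch rather than a confirmed diagnosis: str.split returns strings, so Publication_Year holds text such as "2006" and X_test ends up as a 2-D array of strings. Matplotlib treats string data as categories and tries to hash each row of that 2-D array, which would raise exactly this TypeError. The example below converts the year to a number before building X. The small DataFrame is made-up stand-in data for books.csv, and plot_predictions is a hypothetical helper that is not part of the original code.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for books.csv: dates stored as month/day/year strings.
df = pd.DataFrame({
    "Publication_Date": ["9/16/2006", "9/1/2004", "11/1/2003", "5/1/2004", "8/3/2000"],
    "Average_Rating": [4.57, 4.49, 4.42, 4.56, 3.74],
})

# str.split yields strings, so the year piece is still text at this point.
year_text = df["Publication_Date"].str.split("/", n=2, expand=True)[2]

# Convert to numbers; unparseable values become NaN and are dropped.
df["Publication_Year"] = pd.to_numeric(year_text, errors="coerce")
df = df.dropna(subset=["Publication_Year", "Average_Rating"])

# Numeric feature matrix of shape (n_samples, 1) and numeric target.
X = df["Publication_Year"].to_numpy(dtype=float).reshape(-1, 1)
y = df["Average_Rating"].to_numpy(dtype=float)


def plot_predictions(X_test, y_test, y_pred):
    """Hypothetical helper: plot actual values and the fitted line."""
    # Imported here so the data cleaning above runs without a plotting backend.
    import matplotlib.pyplot as plt

    # Flatten everything to 1-D numeric arrays before plotting.
    x = np.asarray(X_test, dtype=float).ravel()
    actual = np.asarray(y_test, dtype=float).ravel()
    predicted = np.asarray(y_pred, dtype=float).ravel()

    # Sort by year so the prediction line is drawn left to right.
    order = np.argsort(x)
    plt.scatter(x, actual, label="Actual Values", color="blue", s=10)
    plt.plot(x[order], predicted[order], label="Predictions", color="red", linewidth=2)
    plt.xlabel("Year of publication")
    plt.ylabel("Avg. Rating")
    plt.legend()
    plt.show()
```

With a numeric X, the train_test_split and LinearRegression code from the question can stay as it is, and plot_predictions(X_test, y_test, y_pred) can replace the two plt.plot calls. The actual values are drawn as a scatter plot here because joining unsorted points with a line produces a zigzag; that is a presentation choice, not a requirement of the fix.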
