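python - fit_transform error in scikit-learn for a movie recommendation system
Problem description
This is the complete code I am trying to run. Most of it runs fine, but the second-to-last line raises an error: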
count_matrix = count.fit_transform(df['bag_of_words'])
I also don't know where this bag_of_words is supposed to come from. Please suggest how to edit the code.
import pandas as pd
from rake_nltk import Rake
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import CountVectorizer

df = pd.read_csv('https://query.data.world/s/uikepcpffyo2nhig52xxeevdialfl7')
df = df[['Title','Genre','Director','Actors','Plot']]
df.head()

# initializing the new column
df['Key_words'] = ""
for index, row in df.iterrows():
    plot = row['Plot']
    r = Rake()
    r.extract_keywords_from_text(plot)
    key_words_dict_scores = r.get_word_degrees()
    row['Key_words'] = list(key_words_dict_scores.keys())
df.drop(columns = ['Plot'], inplace = True)

count = CountVectorizer()
count_matrix = count.fit_transform(df['bag_of_words'])
cosine_sim = cosine_similarity(count_matrix, count_matrix)
The error is as follows:
Traceback (most recent call last):
  File "C:\Python38\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'bag_of_words'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "movie2.py", line 36, in <module>
    count_matrix = count.fit_transform(df['bag_of_words'])
  File "C:\Python38\lib\site-packages\pandas\core\frame.py", line 2995, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Python38\lib\site-packages\pandas\core\indexes\base.py", line 2899, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'bag_of_words'
What should I do to fix this?
Solution
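The KeyError means the DataFrame has no column named bag_of_words: the posted code selects Title, Genre, Director, Actors and Plot, adds Key_words, and never creates bag_of_words before passing it to fit_transform. A minimal sketch of one possible fix follows, assuming the intent was to merge each movie's features into a single string per row. The small in-memory DataFrame, the hand-written Key_words lists (standing in for what Rake would extract from each plot) and the helper build_bag_of_words are illustrative assumptions, not part of the original code.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Assumption: a tiny stand-in for the real CSV, with the same columns the
# question has after 'Plot' is dropped.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["Crime, Drama", "Crime, Drama", "Animation, Comedy"],
    "Director": ["Jane Roe", "Jane Roe", "John Doe"],
    "Actors": ["Ann Lee, Bob Ray", "Ann Lee, Cy Fox", "Dee Kim, Eli Orr"],
    # Assumption: hand-written lists standing in for the Rake keywords.
    "Key_words": [["prison", "hope"], ["mafia", "family"], ["toys", "friendship"]],
})


def build_bag_of_words(row: pd.Series) -> str:
    """Join all features of one movie into a single lowercase string."""
    parts = [row["Genre"], row["Director"], row["Actors"], " ".join(row["Key_words"])]
    return " ".join(parts).replace(",", " ").lower()


# Create the column that the failing line expects to find.
df["bag_of_words"] = df.apply(build_bag_of_words, axis=1)

count = CountVectorizer()
count_matrix = count.fit_transform(df["bag_of_words"])
cosine_sim = cosine_similarity(count_matrix, count_matrix)
```

A second thing worth checking in the original loop: the rows yielded by df.iterrows() may be copies, so row['Key_words'] = ... is not guaranteed to write back to df, and Key_words can stay empty. Assigning with df.at[index, 'Key_words'] = ..., or building the column with df['Plot'].apply(...), avoids that. Printing df.columns just before the failing line is a quick way to confirm which columns actually exist.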