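python-3.x - Multi-label classifier code error
Problem description
I am trying to train a classifier to tag movies by genre. A movie's plot can belong to more than one genre. When I try to work out the accuracy score for each genre on the test set, I keep getting this error message: TypeError: 'FramePlotMethods' object is not iterable
Can someone explain what I am doing wrong?
I got the movie data from https://github.com/davidsbatista/text-classification/blob/master/movies_genres_en.csv.bz2
Here is the code from the start: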
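# Imports the snippets below rely on (not shown in the original post)
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

df = pd.read_csv("movies_genres_en.csv", delimiter='\t')
df.drop('plot_lang', axis=1, inplace=True)
df.info()
# Use a for loop to count the movies in each genre
df_genres = df.drop(['plot', 'title'], axis=1)
counts = []
categories = list(df_genres.columns.values)
for i in categories:
    counts.append((i, df_genres[i].sum()))
df_stats = pd.DataFrame(counts, columns=['genre', '#movies'])
df_stats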
# Create a function to clean the text
def clean_text(text):
    text = text.lower()
    text = re.sub(r"what's", "what is ", text)
    # 'scuse has to be handled before the generic 's rule, or it never matches
    text = re.sub(r"\'scuse", " excuse ", text)
    text = re.sub(r"\'s", " ", text)
    text = re.sub(r"\'ve", " have ", text)
    text = re.sub(r"can't", "can not ", text)
    text = re.sub(r"n't", " not ", text)
    text = re.sub(r"i'm", "i am ", text)
    text = re.sub(r"\'re", " are ", text)
    text = re.sub(r"\'d", " would ", text)
    text = re.sub(r"\'ll", " will ", text)
    # Replace non-word characters with spaces, then collapse repeated whitespace
    text = re.sub(r'\W', ' ', text)
    text = re.sub(r'\s+', ' ', text)
    text = text.strip(' ')
    return text
# Clean up the text in the plot column
df['plot'] = df['plot'].map(clean_text)
Split the data into training and test sets
train, test = train_test_split(df, random_state=42, test_size=0.33,
                               shuffle=True)
x_train = train.plot
x_test = test.plot
# Define a pipeline combining a text feature extractor with a multi-label classifier
NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words='english')),
    ('clf', OneVsRestClassifier(MultinomialNB(
        fit_prior=True, class_prior=None))),
])
NB_pipeline.fit(x_train, train[genre])
prediction = NB_pipeline.predict(x_test)
accuracy_score(test[genre], prediction)
When I run the last code block, I get this error: TypeError: 'FramePlotMethods' object is not iterable
Am I doing something wrong when creating the df?
Solution
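The error most likely comes from the lines `x_train = train.plot` and `x_test = test.plot`. On a DataFrame, the attribute `.plot` is the built-in plotting accessor (named `FramePlotMethods` in the pandas release that produced this traceback), so attribute access never reaches the column called `plot`. `TfidfVectorizer` then tries to iterate over that accessor and raises the TypeError. Selecting the column with brackets, `train['plot']`, avoids the name clash. The sketch below shows the idea on a small made-up DataFrame: the rows, the two genre columns, and the helper name `accuracy_per_genre` are illustrative assumptions, not taken from the CSV. It also assumes that `genre` in the posted code was meant to loop over the genre columns, since the post never defines it.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for movies_genres_en.csv.
toy = pd.DataFrame({
    "title": [f"movie {i}" for i in range(12)],
    "plot": ["soldiers fight a battle with guns and explosions"] * 3
            + ["a soldier fights a battle while his family grieves"] * 3
            + ["a family grieves and struggles with love and loss"] * 3
            + ["two friends tell jokes on a funny road trip"] * 3,
    "Action": [1] * 6 + [0] * 6,
    "Drama": [0] * 3 + [1] * 6 + [0] * 3,
})

# toy.plot is the plotting accessor; toy["plot"] is the text column.
print(type(toy.plot).__name__)
print(type(toy["plot"]).__name__)


def accuracy_per_genre(df, genres, test_size=0.33, random_state=42):
    """Fit one pipeline per genre column and return {genre: test accuracy}."""
    train, test = train_test_split(
        df, random_state=random_state, test_size=test_size, shuffle=True
    )
    # Bracket indexing selects the column named "plot".
    x_train = train["plot"]
    x_test = test["plot"]

    scores = {}
    for genre in genres:
        pipeline = Pipeline([
            ("tfidf", TfidfVectorizer(stop_words="english")),
            ("clf", OneVsRestClassifier(
                MultinomialNB(fit_prior=True, class_prior=None))),
        ])
        pipeline.fit(x_train, train[genre])
        prediction = pipeline.predict(x_test)
        scores[genre] = accuracy_score(test[genre], prediction)
    return scores


# Genre columns are everything except the title and plot columns.
genres = [c for c in toy.columns if c not in ("title", "plot")]
scores = accuracy_per_genre(toy, genres)
print(scores)
```

With the real data, the same call would presumably be `accuracy_per_genre(df, categories)`, using the `categories` list built earlier in the question. As a general habit, bracket indexing is the safer choice whenever a column name matches a DataFrame attribute or method, such as `plot`, `count`, or `sum`.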