How do I calculate accuracy?

Problem description

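I am trying to calculate the accuracy of a Twitter sentiment analysis project. However, I get the error below. Could someone help me calculate the accuracy? Thanks.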

Error: ValueError: Classification metrics can't handle a mix of continuous and multiclass targets

My code:

import re
import pickle
import numpy as np
import pandas as pd


from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.metrics import accuracy_score

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
df = pd.read_csv("updated_tweet_info.csv")
data =  df.fillna(' ')

train,test = train_test_split(data, test_size = 0.2, random_state = 42)

train_clean_tweet=[]
for tweet in train['tweet']:
    train_clean_tweet.append(tweet)
test_clean_tweet=[]
for tweet in test['tweet']:
    test_clean_tweet.append(tweet)

v = CountVectorizer(analyzer = "word")
train_features= v.fit_transform(train_clean_tweet)
test_features=v.transform(test_clean_tweet)


lr = RandomForestRegressor(n_estimators=200)
fit1 = lr.fit(train_features, train['clean_polarity'])
pred = fit1.predict(test_features)
accuracy = accuracy_score(pred, test['clean_polarity'])

Tags: python, twitter, data-science, sentiment-analysis

Solution


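You are calling accuracy_score, but accuracy is a classification metric. RandomForestRegressor returns continuous predictions, so scikit-learn raises the ValueError when they are compared against the discrete clean_polarity labels.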
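If clean_polarity really holds discrete class labels (the "multiclass" in the error message suggests it does) and accuracy is the number you want, one option is to swap the regressor for a classifier. The following is only a minimal sketch under that assumption: the toy tweets and the labels -1, 0, 1 are hypothetical stand-ins for the 'tweet' and 'clean_polarity' columns, not data from the question.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score

# Hypothetical toy data standing in for the 'tweet' and 'clean_polarity' columns.
train_tweets = ["I love this phone", "what a great day", "this is terrible",
                "I hate waiting", "it is a phone", "the day is today"]
train_labels = [1, 1, -1, -1, 0, 0]
test_tweets = ["I love this day", "I hate this", "it is today"]
test_labels = [1, -1, 0]

# Same bag-of-words features as in the question.
vectorizer = CountVectorizer(analyzer="word")
train_features = vectorizer.fit_transform(train_tweets)
test_features = vectorizer.transform(test_tweets)

# A classifier predicts one of the discrete labels, so accuracy_score applies.
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(train_features, train_labels)
pred = clf.predict(test_features)

# accuracy_score expects the arguments in the order (y_true, y_pred).
accuracy = accuracy_score(test_labels, pred)
print(f"accuracy: {accuracy:.2f}")
```

If clean_polarity is instead a continuous score, a classifier is not appropriate and a regression metric is the better fit.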

In your case, use a regression metric instead, for example mean_squared_error() followed by np.sqrt(), which gives the root mean squared error (RMSE). The lower the value, the better.

Try this:

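import numpy as np
from sklearn.metrics import mean_squared_error
# Root mean squared error between the true and the predicted polarity.
rmse = np.sqrt(mean_squared_error(test['clean_polarity'], pred))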
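To make the meaning of that number concrete, here is a small worked example that computes the RMSE by hand. The values in y_true and y_pred are made up for illustration and do not come from the question's dataset.

```python
import math

# Made-up values for illustration only.
y_true = [1.0, 0.0, -1.0, 0.5]
y_pred = [0.8, 0.1, -0.6, 0.5]

# Square each prediction error, average the squares, then take the square root.
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(squared_errors)
rmse = math.sqrt(mse)

print(f"MSE:  {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
```

Because of the square root, the RMSE is in the same units as the polarity score itself, which makes it easier to interpret than the raw mean squared error.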


