When I run a GridSearchCV() classifier with parameters, I get this error: ValueError: could not convert string to float: 'text'

Problem description

How can I resolve this error? Any guidance would be appreciated.

X = df.iloc[:,:-2]
y = df.My_Labels 

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score,confusion_matrix,classification_report
from sklearn.model_selection import KFold
import numpy as np
from sklearn.model_selection import GridSearchCV

log_class=LogisticRegression()
#grid={'C':10.0 **np.arange(-2,3),'penalty':['l1','l2'],'solver': [ 'lbfgs', 'liblinear']}
grid={'C':10.0 **np.arange(-2,3),'penalty': ['l1'], 'solver': [ 'lbfgs', 'liblinear', 'sag', 'saga'],'penalty': ['l2'], 'solver': ['newton-cg']} 
cv=KFold(n_splits=5,random_state=None,shuffle=False)

from sklearn.model_selection import train_test_split
X_train,X_test, y_train,y_test = train_test_split(X,y,test_size=0.3,random_state=10)

clf=GridSearchCV(log_class,grid,cv=cv,n_jobs=-1,scoring='f1_macro')
clf.fit(X_train.astype(str),y_train)

This is the error I get when I run the code above for plain text classification:

ValueError                                Traceback (most recent call last)
<ipython-input-18-4d99cebd483c> in <module>
     17 
     18 clf=GridSearchCV(log_class,grid,cv=cv,n_jobs=-1,scoring='f1_macro')
---> 19 clf.fit(X_train.astype(str),y_train)

ValueError: could not convert string to float: 'great kindle text sucks cant use calling feature phone im connected wifi makes great calls'

Tags: machine-learning, logistic-regression, text-classification, gridsearchcv, k-fold

Solution


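It looks like your dataset contains text, and LogisticRegression only accepts numeric input. Calling astype(str) does not help, because the values are still strings. Use Scikit-Learn's LabelEncoder to encode the text as numeric values. If you are not sure how to use it, see the LabelEncoder page in the scikit-learn documentation.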
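As a rough illustration of what LabelEncoder does, here is a minimal sketch. The label values are made up, since the contents of the My_Labels column are not shown in the question. Note that the scikit-learn documentation describes LabelEncoder as a tool for target values (y). Applied to free-text reviews it would only assign one arbitrary integer per distinct sentence, so it is probably a better fit for the label column than for the review text.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label values; the real My_Labels column is not shown in the question.
labels = ["positive", "negative", "neutral", "negative"]

encoder = LabelEncoder()
# fit_transform learns the distinct classes (sorted) and maps each label to its index.
encoded = encoder.fit_transform(labels)

print(list(encoder.classes_))  # ['negative', 'neutral', 'positive']
print(list(encoded))           # [2, 0, 1, 0]

# inverse_transform maps the integers back to the original strings.
decoded = list(encoder.inverse_transform(encoded))
```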

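Judging by the error message, I think you have a column of review text. If you don't want to LabelEncode it, use NLP techniques to turn the text into numeric features, then perform sentiment analysis or a similar task on them.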
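One common way to apply an NLP technique here is to vectorize the text with TF-IDF inside a Pipeline, so that GridSearchCV fits the vectorizer on each training fold. The sketch below is only an illustration under stated assumptions: the toy reviews and labels are invented, and the step names "tfidf" and "clf" are arbitrary. It also writes the grid as a list of dicts, because the grid in the question repeats the 'penalty' and 'solver' keys in a single dict, and Python keeps only the last value of each.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline

# Toy stand-in data (invented for illustration; not the asker's dataset).
positive = [
    "great phone makes great calls",
    "love this kindle works well",
    "excellent screen and fast wifi",
    "good battery very happy",
]
negative = [
    "text feature sucks cant use it",
    "terrible phone keeps crashing",
    "bad wifi connection very slow",
    "awful battery waste of money",
]
texts = (positive + negative) * 3
labels = ([1] * len(positive) + [0] * len(negative)) * 3

# TF-IDF turns each string into a numeric vector before the classifier sees it.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# One dict per compatible penalty/solver combination.
c_values = 10.0 ** np.arange(-2, 3)
param_grid = [
    {"clf__C": c_values, "clf__penalty": ["l1"], "clf__solver": ["liblinear"]},
    {"clf__C": c_values, "clf__penalty": ["l2"], "clf__solver": ["lbfgs", "newton-cg"]},
]

cv = KFold(n_splits=5)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="f1_macro")
search.fit(texts, labels)  # raw strings go in; no astype(str) needed

print(search.best_params_)
prediction = search.predict(["great calls on wifi"])
```

With a DataFrame, the vectorizer expects a single column of strings (for example df["review"], a hypothetical column name) rather than the whole frame. Recent scikit-learn releases may emit deprecation warnings for the penalty parameter, so check the documentation for your installed version.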

