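python - Sklearn: TypeError when fitting data to a logistic regression model
Problem description
I get the following error when fitting a LogisticRegression model on TF-IDF features produced by fit_transform: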
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf_vectorizer = TfidfVectorizer()
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_train_tfidf.shape
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression(solver = 'lbfgs')
clf.fit(X_train_tfidf,y_train)
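I went through the thread "LabelEncoder: TypeError: '>' not supported between 'float' and 'str'", but it did not help. Any help would be appreciated.
TypeError: '<' not supported between instances of 'float' and 'str'
Following the thread above, I also checked that X_train has no null values: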
X_train.isnull().value_counts()
False 2584
Name: Headline, dtype: int64
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-65-e676010d2b44> in <module>
3 clf = LogisticRegression(solver = 'lbfgs')
4
----> 5 clf.fit(X_train_tfidf,y_train)
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/sklearn/linear_model/logistic.py in fit(self, X, y, sample_weight)
1284 X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order="C",
1285 accept_large_sparse=solver != 'liblinear')
-> 1286 check_classification_targets(y)
1287 self.classes_ = np.unique(y)
1288 n_samples, n_features = X.shape
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
166 y : array-like
167 """
--> 168 y_type = type_of_target(y)
169 if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
170 'multilabel-indicator', 'multilabel-sequences']:
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/sklearn/utils/multiclass.py in type_of_target(y)
285 return 'continuous' + suffix
286
--> 287 if (len(np.unique(y)) > 2) or (y.ndim >= 2 and len(y[0]) > 1):
288 return 'multiclass' + suffix # [1, 2, 3] or [[1., 2., 3]] or [[1, 2]]
289 else:
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
231 ar = np.asanyarray(ar)
232 if axis is None:
--> 233 ret = _unique1d(ar, return_index, return_inverse, return_counts)
234 return _unpack_tuple(ret)
235
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
279 aux = ar[perm]
280 else:
--> 281 ar.sort()
282 aux = ar
283 mask = np.empty(aux.shape, dtype=np.bool_)
TypeError: '<' not supported between instances of 'float' and 'str'
Solution
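The traceback ends in np.unique(y) inside type_of_target, so the sort that fails is on y_train, not on X_train (the null check in the question only covers X_train). A likely cause, stated here as an assumption because the label data is not shown, is that y_train mixes string labels with float values such as NaN. The sketch below uses a made-up label Series (the name "Category" and its values are hypothetical) to show one way to detect mixed types and drop the missing labels:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for y_train: string labels plus one missing value,
# which pandas stores as a float NaN.
y_train = pd.Series(["sports", "politics", np.nan, "sports"], name="Category")

# Count the Python types present in the labels. More than one type means
# np.unique(y) has to compare e.g. a float with a str while sorting.
types_present = y_train.map(type).value_counts()
print(types_present)

# Boolean mask of rows that actually have a label.
mask = y_train.notna()

# Keep only labelled rows and make every remaining label a str.
y_clean = y_train[mask].astype(str)

# With a single label type, sorting and np.unique work again.
classes = np.unique(y_clean)
print(classes)
```

With real data, the same mask would need to be applied to X_train (for example X_train[mask]) before calling fit_transform, so that features and labels stay aligned.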