python - Find optimal Lasso/L1 regularization strength using cross validation for logistic regression in scikit learn
Problem description
For my logistic regression model, I would like to find the optimal L1 regularization strength using cross-validation (e.g., 5-fold) instead of the single train/test split shown in my code below:
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

train_x, test_x, train_y, test_y = train_test_split(
    X_scaled, y, stratify=y, test_size=0.3, random_state=2)

# Evaluate L1 regularization strengths for reducing features in the final model
C = [10, 1, 0.1, 0.05, 0.01, 0.001]  # As C decreases, more coefficients go to zero
for c in C:
    clf = LogisticRegression(penalty='l1', C=c, solver='liblinear', class_weight="balanced")
    clf.fit(train_x, train_y)
    pred_y = clf.predict(test_x)
    print("Model performance with inverse regularization parameter, C = 1/λ VALUE: ", c)
    cr = metrics.classification_report(test_y, pred_y)
    print(cr)
    print('')
Can somebody show me how to do this over 5 distinct train/test sets using cross-validation (i.e., without copying the above code five times with different random states)?
Solution
Actually, classification_report is not available as a scoring metric inside sklearn.model_selection.cross_val_score. So I use f1_micro in the code below:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Evaluate L1 regularization strengths for reducing features in the final model
C = [10, 1, 0.1, 0.05, 0.01, 0.001]  # As C decreases, more coefficients go to zero
for c in C:
    clf = LogisticRegression(penalty='l1', C=c, solver='liblinear', class_weight="balanced")
    # using the data before splitting (X_scaled and y)
    scores = cross_val_score(clf, X_scaled, y, cv=5, scoring="f1_micro")  # <-- add this
    print(scores)  # <-- add this
The variable scores is now an array of five values: the f1_micro score of your classifier on each of five different splits of the original data.
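To turn the per-fold scores into a single choice of C, one option is to average them for each candidate and keep the best one. GridSearchCV does this bookkeeping for you. The following is a minimal sketch, not the answer's own method; the synthetic data from make_classification is only a stand-in for X_scaled and y, which are assumptions here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's X_scaled and y (assumption for illustration)
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=2)
X_scaled = StandardScaler().fit_transform(X)

C = [10, 1, 0.1, 0.05, 0.01, 0.001]
clf = LogisticRegression(penalty="l1", solver="liblinear", class_weight="balanced")

# 5-fold cross-validation for every candidate C, scored with f1_micro
search = GridSearchCV(clf, param_grid={"C": C}, cv=5, scoring="f1_micro")
search.fit(X_scaled, y)

best_c = search.best_params_["C"]
mean_scores = search.cv_results_["mean_test_score"]

# Mean cross-validated score for each candidate, then the selected value
for c, m in zip(search.cv_results_["param_C"], mean_scores):
    print("C =", c, "mean f1_micro =", round(float(m), 3))
print("Best C:", best_c)
```

After fitting, search.best_estimator_ holds a model refit on all of the data with the selected C, so its coef_ attribute shows which coefficients were driven to zero.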
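If you want to use a different scoring metric with sklearn.model_selection.cross_val_score, you can list all available scoring metrics with the following command (metrics.get_scorer_names() in scikit-learn 1.0 and later; the older metrics.SCORERS dictionary was removed in version 1.3):
from sklearn import metrics
print(metrics.get_scorer_names())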
In addition, you can use several scoring metrics at once; the following uses both f1_micro and f1_macro:
from sklearn.model_selection import cross_validate
cross_validate(clf, X_scaled, y, cv=5, scoring=["f1_micro", "f1_macro"])
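cross_validate returns a dictionary rather than a single array, with one entry per metric under the key "test_" plus the metric name. A small sketch of reading it back, again on synthetic data (X_demo and y_demo are hypothetical stand-ins, and C=0.1 is an arbitrary value chosen for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Hypothetical stand-in data; replace with your own X_scaled and y
X_demo, y_demo = make_classification(n_samples=300, n_features=20,
                                     n_informative=5, random_state=2)

clf = LogisticRegression(penalty="l1", C=0.1, solver="liblinear",
                         class_weight="balanced")
metric_names = ["f1_micro", "f1_macro"]
results = cross_validate(clf, X_demo, y_demo, cv=5, scoring=metric_names)

# Each metric gets one array of per-fold scores under "test_<metric name>"
summary = {name: float(np.mean(results["test_" + name])) for name in metric_names}
print(summary)
```

The dictionary also contains fit_time and score_time arrays, which can help when comparing many values of C.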