python - How do I get the parameter names for TargetEncoder in a grid search?
Question
I have the following setup:
preprocess = make_column_transformer(
(SimpleImputer(strategy='constant',fill_value = 0),numeric_cols),
(ce.TargetEncoder(),['country'])
)
pipeline = make_pipeline(preprocess,XGBClassifier())
pipeline[0].get_params().keys()
dict_keys(['n_jobs', 'remainder', 'sparse_threshold', 'transformer_weights', 'transformers', 'verbose', 'simpleimputer', 'targetencoder', 'simpleimputer__add_indicator', 'simpleimputer__copy', 'simpleimputer__fill_value', 'simpleimputer__missing_values', 'simpleimputer__strategy', 'simpleimputer__verbose', 'targetencoder__cols', 'targetencoder__drop_invariant', 'targetencoder__handle_missing', 'targetencoder__handle_unknown', 'targetencoder__min_samples_leaf', 'targetencoder__return_df', 'targetencoder__smoothing', 'targetencoder__verbose'])
I then want to run a grid search over the smoothing factor, so I tried:
param_grid = {
'xgbclassifier__learning_rate': [0.01,0.005,0.001],
'targetencoder__smoothing': [1, 10, 30, 50]
}
pipeline = make_pipeline(preprocess,XGBClassifier())
# Initialize the grid search model
clf = GridSearchCV(pipeline,param_grid = param_grid,scoring = 'neg_mean_squared_error',
verbose= 1,iid= True,
refit = True,cv = 3)
clf.fit(X_train,y_train)
But I get this error:
ValueError: Invalid parameter transformer_targetencoder for estimator Pipeline(steps=[('columntransformer', ColumnTransformer(transformers ...
How do I access the smoothing parameter?
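Answer
With your example, the parameter name is columntransformer__targetencoder__smoothing. The ColumnTransformer is itself a step of the pipeline, and make_pipeline names that step columntransformer, so every parameter inside it needs that prefix. To reproduce the pipeline, I first create an example dataset and define the columns: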
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
import category_encoders as ce
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV
X_train = pd.DataFrame({'x1':np.random.normal(0,1,50),
'x2':np.random.normal(0,1,50),
'country':np.random.choice(['A','B','C'],50)})
y_train = np.random.binomial(1,0.5,50)
numeric_cols = ['x1','x2']
preprocess = make_column_transformer(
(SimpleImputer(strategy='constant',fill_value = 0),numeric_cols),
(ce.TargetEncoder(),['country'])
)
pipeline = make_pipeline(preprocess,XGBClassifier())
You should look at the keys one level up, on the whole pipeline rather than on pipeline[0]:
pipeline.get_params().keys()
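The difference between the two levels can be seen with a small sketch. To keep it runnable with scikit-learn alone, it uses stand-ins that are assumptions and not part of the original example: OneHotEncoder takes the place of ce.TargetEncoder, LogisticRegression takes the place of XGBClassifier, and find_param_keys is a hypothetical helper. The naming rule it illustrates should be the same for the real pipeline.

```python
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Stand-ins (assumptions): OneHotEncoder replaces ce.TargetEncoder and
# LogisticRegression replaces XGBClassifier, so only scikit-learn is needed.
preprocess = make_column_transformer(
    (SimpleImputer(strategy="constant", fill_value=0), ["x1", "x2"]),
    (OneHotEncoder(handle_unknown="ignore"), ["country"]),
)
pipeline = make_pipeline(preprocess, LogisticRegression())


def find_param_keys(estimator, fragment):
    """Return the sorted parameter names of `estimator` that contain `fragment`."""
    return sorted(key for key in estimator.get_params() if fragment in key)


# Keys as seen by the ColumnTransformer alone (what pipeline[0] reports).
inner_keys = find_param_keys(pipeline[0], "handle_unknown")
# Keys as seen by the whole pipeline (what GridSearchCV needs).
outer_keys = find_param_keys(pipeline, "handle_unknown")

print(inner_keys)
print(outer_keys)
```

The names reported for the whole pipeline carry the extra columntransformer__ prefix, and those are the names a grid search over the pipeline expects.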
Then set up the grid, making sure the smoothing values are floats (see this question):
param_grid = { 'columntransformer__targetencoder__smoothing': [1.0, 10.0],
'xgbclassifier__learning_rate': [0.01,0.001]}
pipeline = make_pipeline(preprocess,XGBClassifier())
clf = GridSearchCV(pipeline,param_grid = param_grid,scoring = 'neg_mean_squared_error',
verbose= 1,refit = True,cv = 3)
clf.fit(X_train,y_train)
This should run without the error.
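A cheap way to catch a wrong key before starting a long search is to compare the grid against the pipeline's own parameter names. The following is a minimal sketch under the same assumptions as the pipeline in the answer, except that OneHotEncoder and LogisticRegression are used as stand-ins for ce.TargetEncoder and XGBClassifier so it runs with scikit-learn alone; invalid_grid_keys is a hypothetical helper, not a scikit-learn function.

```python
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Stand-ins (assumptions): OneHotEncoder replaces ce.TargetEncoder and
# LogisticRegression replaces XGBClassifier.
preprocess = make_column_transformer(
    (SimpleImputer(strategy="constant", fill_value=0), ["x1", "x2"]),
    (OneHotEncoder(handle_unknown="ignore"), ["country"]),
)
pipeline = make_pipeline(preprocess, LogisticRegression())


def invalid_grid_keys(estimator, param_grid):
    """Return the grid keys that `estimator` does not accept as parameters."""
    valid = estimator.get_params(deep=True)
    return [key for key in param_grid if key not in valid]


# Same mistake as in the question: the ColumnTransformer prefix is missing.
bad_grid = {
    "onehotencoder__handle_unknown": ["ignore", "error"],
    "logisticregression__C": [0.1, 1.0],
}
# Fully qualified names, as suggested in the answer.
good_grid = {
    "columntransformer__onehotencoder__handle_unknown": ["ignore", "error"],
    "logisticregression__C": [0.1, 1.0],
}

print(invalid_grid_keys(pipeline, bad_grid))
print(invalid_grid_keys(pipeline, good_grid))
```

With the real pipeline, the same check would flag targetencoder__smoothing and accept columntransformer__targetencoder__smoothing, assuming the step names shown in the question's get_params() output.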