Pycebox ICE plot does not work with XGBoost but works with Random Forest

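Question

When I run Pycebox with XGBoost I get the error below. Training runs without problems, but I don't understand why fields named f0, f1, f2 and f3 show up when I build the ICE plot. I have double-checked that they are not in the dataset.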

    ValueError: feature_names mismatch: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] ['f0', 'f1', 'f2', 'f3']
    expected petal width (cm), petal length (cm), sepal length (cm), sepal width (cm) in input data
    training data did not have the following fields: f3, f1, f0, f2

I created a reproducible example using the iris data.

XGBoost code:

    from sklearn.datasets import load_iris
    from pycebox.ice import ice, ice_plot
    from sklearn.model_selection import train_test_split
    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    import xgboost as xgb
    import matplotlib.pyplot as plt

    iris = load_iris()
    data1 = pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                         columns= iris['feature_names'] + ['target'])
    target = data1['target']
    training = data1.drop(['target'],axis=1)

    X_train, X_test, y_train, y_test = train_test_split(training, target, test_size=0.4)
    xg_reg = xgb.XGBRegressor(random_state=1234,eval_metric='rmse',n_jobs=-1)
    xg_reg.fit(X_train,y_train)
    forty_ice_df = ice(data=X_train, column='petal length (cm)', 
                   predict=xg_reg.predict)
    ice_plot(forty_ice_df, c='dimgray', linewidth=0.3)
    plt.ylabel('Pred. Target')
    plt.xlabel('petal length (cm)')

The same steps work with a random forest, though:

    rf = RandomForestRegressor(random_state = 1234, n_jobs=18)
    rf.fit(X_train, y_train)
    forty_ice_df = ice(data=X_train, column='petal length (cm)', 
                       predict=rf.predict)
    ice_plot(forty_ice_df, c='dimgray', linewidth=0.3)
    plt.ylabel('Pred. Target')
    plt.xlabel('petal length (cm)')

Tags: python, machine-learning, scikit-learn, xgboost

Answer


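Just change X_train to X_train.values when fitting the model, i.e. xg_reg.fit(X_train.values, y_train), and keep passing the DataFrame X_train to ice. The error message shows that the data reaching predict carries the names f0 to f3, which are the default names XGBoost assigns to input without column names, while the model was trained on a DataFrame with the iris column names. Training on the plain array makes both sides use the same default names. The random forest does not raise this error, which is why the same call works there.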
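If you would rather keep the column names inside the model, an alternative is to wrap the predict function so that any array it receives is turned back into a DataFrame with the training columns. The following is only a minimal sketch: with_feature_names is a hypothetical helper (not part of pycebox or XGBoost), and it assumes, as the error message suggests, that ice calls predict with an unnamed NumPy array.

```python
import numpy as np
import pandas as pd


def with_feature_names(predict, columns):
    """Wrap `predict` so that array input is relabelled with `columns`.

    Hypothetical helper for illustration; not part of pycebox or XGBoost.
    """
    columns = list(columns)

    def wrapped(X):
        # Rebuild a DataFrame so the model sees the names it was trained with.
        if not isinstance(X, pd.DataFrame):
            X = pd.DataFrame(np.asarray(X), columns=columns)
        return predict(X)

    return wrapped


# Assumed usage with the objects from the question (not executed here):
# forty_ice_df = ice(data=X_train, column='petal length (cm)',
#                    predict=with_feature_names(xg_reg.predict, X_train.columns))
```

With this wrapper the model can still be fitted on the DataFrame X_train, at the cost of building a DataFrame on every predict call.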