CatBoost classifier raises an error during fitting

Problem description

%%time
import numpy as np
from catboost import CatBoostClassifier, Pool
from catboost import cv

from sklearn.model_selection import StratifiedKFold

n_fold = 4 # amount of data folds
folds = StratifiedKFold(n_splits=n_fold, shuffle=True, random_state=SEED)

params = {'loss_function':'Logloss',
          'eval_metric':'AUC',
          'verbose': 200,
          'random_seed': SEED,
          'nan_mode' : 'Forbidden'
         }

X_test = test_df.drop(columns='attended').reset_index(drop=True)

cat_features1 = X_test.columns.to_list()
test_data = Pool(data=X_test, 
                 cat_features=cat_features1)

scores = []
prediction = np.zeros(test_data.shape[0])

for fold_n, (train_index, valid_index) in enumerate(folds.split(X, y)):
    
    X_train, X_valid = X.iloc[train_index], X.iloc[valid_index] #train and validation data splits
    y_train, y_valid = y.iloc[train_index], y.iloc[valid_index]
    
    train_data = Pool(data=X_train, 
                      label=y_train, 
                      cat_features=cat_features1)
    valid_data = Pool(data=X_valid, 
                      label=y_valid,
                      cat_features=cat_features1)
    model = CatBoostClassifier(**params)
    model.fit(train_data,
              eval_set=valid_data, 
              use_best_model=True,
              plot=True ) 
    
    score = model.get_best_score()['validation_0']['AUC']
    scores.append(score)

    y_pred = model.predict_proba(test_data)[:, 1]
    prediction += y_pred
    
    #y_pred = model.predict(test_data)[:, 1]
    #prediction += y_pred

prediction /= n_fold
print('\n','CV mean: {:.4f}, CV std: {:.4f}'.format(np.mean(scores), np.std(scores)))

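I get an error from the code above. It occurs at the point where the score is read and appended to `scores`. This code had been working for me for almost six months, but after downloading Anaconda I may have messed up my package versions, and now I get the error below. Can anyone spot the problem? Thanks in advance!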

KeyError Traceback (most recent call last)
KeyError: 'validation_0'

Tags: python, scikit-learn

Solution
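A likely cause, given that the traceback ends in `KeyError: 'validation_0'`, is that the installed CatBoost version names the entry differently. As far as I know, recent CatBoost releases return the key `'validation'` from `get_best_score()` when a single `eval_set` is passed, and only use `'validation_0'`, `'validation_1'`, and so on when several evaluation sets are given. Printing `model.get_best_score().keys()` and `catboost.__version__` should confirm which layout applies. The sketch below is one way to make the lookup tolerant of both layouts. The helper name `best_validation_score` is hypothetical and not part of CatBoost, and the example dictionaries contain made-up values that only mimic the shape of `get_best_score()` output.

```python
def best_validation_score(best_scores, metric="AUC"):
    """Return the best `metric` value from a dict shaped like the output
    of CatBoost's `model.get_best_score()`.

    Hypothetical helper (not a CatBoost API). It assumes a single
    evaluation set, stored under either 'validation' (assumed newer
    layout) or 'validation_0' (layout the question's code expects).
    """
    for key in ("validation", "validation_0"):
        if key in best_scores:
            return best_scores[key][metric]
    raise KeyError(
        f"no validation entry found; available keys: {sorted(best_scores)}"
    )


# Illustrative dicts only: the numbers are invented, not real results.
new_style = {"learn": {"Logloss": 0.31},
             "validation": {"Logloss": 0.35, "AUC": 0.87}}
old_style = {"learn": {"Logloss": 0.31},
             "validation_0": {"Logloss": 0.35, "AUC": 0.87}}

print(best_validation_score(new_style))
print(best_validation_score(old_style))
```

In the loop from the question, the failing line would then read `score = best_validation_score(model.get_best_score())`. If only one CatBoost version needs to be supported, replacing `'validation_0'` with `'validation'` directly is the simpler change.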