Is there a way to pass custom test data into an XGBoost prediction model?

Problem description

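I have fitted my models on my training and test data, and now I am trying to feed custom data to my XGBoost model. The data has the same format as my training and test data, but it is a 1-D array rather than a 2-D array. I keep getting the error TypeError: can not initialize DMatrix from Series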

Here is my code:

#create test data series
newcol = ["away", "home"]
#enter names of teams to get
testrow = pd.Series(["Seoul Dynasty", "Dallas Fuel"], index=newcol)

#get all stats for each team
awayrow = get_team_stats(testrow[0])
homerow = get_team_stats(testrow[1])

#convert columns to proper team placement
awayrow = awayrow.rename(lambda x: "Away " + x)
homerow = homerow.rename(lambda x: "Home " + x)

#turn name into id
for name, team in ID_TO_NAME.items():
    if team == testrow[0]:
        testrow[0] = name
    if team == testrow[1]:
        testrow[1] = name
testrow = pd.concat([testrow, awayrow, homerow])
testrow = testrow[~testrow.index.str.contains("Rank")]

#predictions
rfcprediction = rfc.predict([testrow])
lrprediction = lr.predict([testrow])
knnprediction = knn.predict([testrow])
svprediction = sv.predict([testrow])

#reshape into a separate variable so convert_prediction can still index testrow
xgrow = np.array([testrow]).reshape((1, -1))
xgprediction = xgboost.predict(xgrow)

#turn id back into name
def convert_prediction(prediction):
    if prediction[0] == 1:
        #Home Won
        return ID_TO_NAME.get(testrow[1])
    if prediction[0] == 0:
        #Away Won
        return ID_TO_NAME.get(testrow[0])


print("Random Forest Prediction: ", convert_prediction(rfcprediction))
print(" ")
print("Logistic Regression Prediction: ", convert_prediction(lrprediction))
print(" ")
print("K Nearest Neighbors Prediction: ", convert_prediction(knnprediction))
print(" ")
print("SVC Prediction: ", convert_prediction(svprediction))
print(" ")

I was told to reshape the array (it is a Series), but that gives me a feature_names mismatch. How can I run an XGBoost prediction on a single-row Series?

Tags: machine-learning, scikit-learn, xgboost

Solution
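
Both errors most likely come from the shape and labels of the input. XGBoost builds a DMatrix from 2-D data, so a 1-D Series is rejected, and a model fitted on a DataFrame remembers its column names, so a bare NumPy array (which has no names) triggers the feature_names mismatch. One possible approach is to turn the Series into a one-row DataFrame whose columns match the training columns in name and order. The sketch below assumes the training features are available as a list of column names; the helper series_to_model_row, the feature names, and the sample values are hypothetical and only illustrate the idea.

```python
import pandas as pd


def series_to_model_row(row: pd.Series, feature_names) -> pd.DataFrame:
    """Turn a 1-D Series into a 1-row DataFrame laid out like the training data.

    `feature_names` is assumed to be the training column order,
    e.g. list(X_train.columns) (X_train is a hypothetical name).
    """
    # to_frame().T turns the Series index into column labels: shape (1, n_features)
    frame = row.to_frame().T

    # Fail early if the row lacks a feature the model was trained on
    missing = [name for name in feature_names if name not in frame.columns]
    if missing:
        raise KeyError(f"row is missing features: {missing}")

    # Put the columns in the same order as during training
    frame = frame[list(feature_names)]

    # A Series built from mixed values has dtype object; convert each column to a numeric dtype
    return frame.apply(pd.to_numeric)


# Hypothetical stand-in for the question's testrow: two team ids plus two stats
ids = pd.Series([4, 9], index=["away", "home"], dtype=object)
stats = pd.Series({"Away Win Rate": 0.55, "Home Win Rate": 0.61})
testrow = pd.concat([ids, stats])

# Hypothetical training column order
feature_names = ["away", "home", "Away Win Rate", "Home Win Rate"]

X_row = series_to_model_row(testrow, feature_names)
print(X_row.shape)
print(X_row.dtypes)

# With the fitted model from the question (named `xgboost` there):
# xgprediction = xgboost.predict(X_row)
```

Because the original testrow is left untouched, convert_prediction can still look up the team ids with testrow[0] and testrow[1].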

