How do I handle a model dataset that has an ID column?

Problem description

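I'm trying to build a model for the probability of success of NFL draft prospects. I'm having trouble finding a way to print the player names alongside the corresponding model output. For example, it currently prints something like "[79 22 36 72 20 48 2 68 16 36 11 68 68 16 22 17 60 62 15 17 11 68 0 84 28 22 45 48 79 84 2 37 68]", and I would like the player associated with each of those outputs to be printed as well. I'm using some template code I found online for the type of model I want to build. I'll post it below.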

Data link: https://docs.google.com/spreadsheets/d/1BQa34rfq7oC3jOO65c4xUqKTuhDGKf46pPwGmjSS3ko/edit?usp=sharing

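The "Player" column really doesn't matter during training, since the data are historical drafts going back to 2004. But for the final output, when I ask the model to predict this year's prospects, I obviously need the names in the output too.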

    import pandas as pd
    import xgboost
    from sklearn import model_selection
    from sklearn.metrics import accuracy_score
    from sklearn.preprocessing import LabelEncoder
    
    # load data
    data = pd.read_csv(r"C:\Users\yanke\Documents\NFLDraft\QBDataSet.csv", index_col=0)
    dataset = data
    
    # split data into X and y
    X = dataset.iloc[:,0:4]
    Y = dataset.iloc[:,4]
    # encode string class values as integers
    label_encoder = LabelEncoder()
    label_encoder = label_encoder.fit(Y)
    label_encoded_y = label_encoder.transform(Y)
    
    seed = 7
    test_size = 0.33
    X_train, X_test, y_train, y_test = model_selection.train_test_split(X, label_encoded_y, test_size=test_size, random_state=seed)
    
    # fit model on training data
    model = xgboost.XGBClassifier()
    model.fit(X_train, y_train)
    print(model)
    
    # make predictions for test data
    y_pred = model.predict(X_test)
    predictions = [round(value) for value in y_pred]
    
    # evaluate predictions
    accuracy = accuracy_score(y_test, predictions)
    print("Accuracy: %.2f%%" % (accuracy * 100.0))
    print(y_pred)

Tags: python, machine-learning, scikit-learn, xgboost

Solution


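Does this work? Because the CSV is read with index_col=0, the player names become the DataFrame index, and train_test_split keeps that index on X_test, so each name can be paired with its prediction: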

    for player, prediction in zip(X_test.index, predictions):
        print(player, prediction)

Output:

Colin Kaepernick 3
Jeff Driskel 2
Dwayne Haskins 1
Colt McCoy 1
Ryan Lindley 2
Jameis Winston 2
Sam Darnold 1
Sam Bradford 1
Troy Smith 1
Johnny Manziel 1
Matthew Stafford 3
Kyler Murray 2
Daniel Jones 2
Gardner Minshew 1
Joe Webb 2
Curtis Painter 1
Andrew Luck 1
Josh Freeman 2
Landry Jones 1
Ryan Finley 1
Deshaun Watson 1
Marcus Mariota 1
Dan Orlovsky 1
Russell Wilson 2
Nathan Peterman 1
Kyle Orton 2
Paxton Lynch 2
Alex Smith 1
Brodie Croyle 1
Vince Young 2
Brandon Weeden 1
Teddy Bridgewater 1
Brett Hundley 1
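
If you would rather keep the names attached to the predictions than print them in a loop, one option is to wrap the predictions in a pandas Series that reuses the feature frame's index, and to decode the integers back to the original labels with the fitted LabelEncoder. The following is only a minimal sketch under stated assumptions: the toy data, the column names (f1, f2, outcome), the helper predictions_by_name, and the prospect names are all hypothetical, and a scikit-learn DecisionTreeClassifier stands in for XGBClassifier so the example runs without xgboost. Any classifier with the same fit/predict interface should slot in the same way.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for QBDataSet.csv. The player names are
# the index, as they would be after pd.read_csv(..., index_col=0).
data = pd.DataFrame(
    {
        "f1": [1, 2, 3, 4, 5, 6, 7, 8],
        "f2": [8, 7, 6, 5, 4, 3, 2, 1],
        "outcome": ["bust"] * 4 + ["starter"] * 4,
    },
    index=pd.Index([f"Player {c}" for c in "ABCDEFGH"], name="Player"),
)

X = data[["f1", "f2"]]
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(data["outcome"])

# train_test_split keeps the DataFrame index, so X_test still carries the names.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=7
)

# Stand-in model (assumption): swap in xgboost.XGBClassifier() if available.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)


def predictions_by_name(model, features, encoder):
    """Return decoded predictions as a Series indexed like `features`."""
    encoded = model.predict(features)
    decoded = encoder.inverse_transform(encoded)
    return pd.Series(decoded, index=features.index, name="prediction")


# Predictions for the held-out players, keyed by name.
test_results = predictions_by_name(model, X_test, label_encoder)
print(test_results)

# The same helper works for new prospects, as long as their names are the
# index and the feature columns match the training columns.
prospects = pd.DataFrame(
    {"f1": [2, 7], "f2": [7, 2]},
    index=pd.Index(["Prospect X", "Prospect Y"], name="Player"),
)
prospect_results = predictions_by_name(model, prospects, label_encoder)
print(prospect_results)
```

For this year's class, the same idea would apply to a prospects file loaded with index_col=0, provided it has the same feature columns as the training data.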
