Getting the total goals scored by each team

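Problem description

I am analyzing a football match dataset and want to answer one question: how many goals each team scored and conceded.

My dataset: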

date         home_team    away_team    home_score    away_score
1873-03-08   England      Scotland     0             1
1873-03-09   Scotland     England      1             0
...          ...          ...          ...           ...

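The function takes 2 parameters: the start year and the end year.

I first tried starting with an empty list and, while iterating over the whole dataset, appending each country name together with the goals it scored, but because there are many different teams my list comes out wrong.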

def total_goals(start, end):
        x = 0
        goals_scored = 0
        goals_scored_list = []
        goals_lost = 0
        goals_lost_list = []
        complete_list = []

        for item in range(len(data['home_team'])):
            date = int(data['date'][x][:4])
            if date >= start:
                if date <= end:
                    if int(data['home_score'][x]) > int(data['away_score'][x]):
                        goals_scored_list.append(data['home_team'])
                        goals_scored_list.append(data['home_score'])
                        x += 1
                    else:
                        x += 1

        return goals_scored_list

The output I want is a list containing one list per unique team, holding the country name, goals scored, and goals conceded:

[['England',1,1],['Scotland',0,2],[...]]

I think I need to create a list for each unique country, perhaps using something like

if country not in data['home_team']:
    goals_scored_list.append(data['home_team'][x])

But I am sure there is a more sophisticated way to achieve my goal.

Tags: python, list

Solution


I believe this should work:

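class Team:
    def __init__(self, name):
        self.name = name
        self.wins = 0    # goals scored in matches where pos was True
        self.losses = 0  # goals scored in matches where pos was False


    def addEl(self, pos, score):
        # pos: True if this team is treated as the winner of the match
        try:
            score = int(score)
        except (TypeError, ValueError) as e:
            print(e)
            return  # skip scores that cannot be converted
        if pos:
            self.wins += score
        else:
            self.losses += score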



def total_goals(start, end):
    d = {}  # maps team name -> Team
    for i in range(len(data['home_team'])):
        date = int(data['date'][i][:4])  # year part of 'YYYY-MM-DD'

        if start <= date <= end:  # make sure it's in the params
            # True if home wins, False otherwise (a draw gives False for home, True for away)
            pos = int(data['home_score'][i]) > int(data['away_score'][i])

            home = data['home_team'][i]
            if home not in d:  # create the team the first time it appears
                d[home] = Team(home)
            d[home].addEl(pos, data['home_score'][i])  # add score to wins/losses

            away = data['away_team'][i]
            if away not in d:
                d[away] = Team(away)
            d[away].addEl(not pos, data['away_score'][i])

    return d

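The advantage of using a custom class is that it lets you add more attributes, such as the number of matches won/lost and additional statistics.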
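
As an illustration of that idea, here is a minimal sketch of a per-team class that tracks goals scored and conceded alongside win/draw/loss counts, and returns the list-of-lists shape asked for in the question. The names `TeamStats`, `add_match`, and `goal_totals` are hypothetical, and the code assumes `data` is a mapping of column name to a sequence of values (a dict of lists, or something that iterates the same way) with dates formatted as 'YYYY-MM-DD'.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    """Hypothetical per-team record; field names are illustrative."""
    name: str
    goals_for: int = 0      # goals the team scored
    goals_against: int = 0  # goals the team conceded
    wins: int = 0
    draws: int = 0
    losses: int = 0

    def add_match(self, scored, conceded):
        # Update goal totals and the win/draw/loss counters for one match.
        self.goals_for += scored
        self.goals_against += conceded
        if scored > conceded:
            self.wins += 1
        elif scored == conceded:
            self.draws += 1
        else:
            self.losses += 1


def goal_totals(data, start, end):
    """Return [[team, goals_for, goals_against], ...] for start <= year <= end."""
    teams = {}  # team name -> TeamStats, in order of first appearance
    rows = zip(data['date'], data['home_team'], data['away_team'],
               data['home_score'], data['away_score'])
    for date, home, away, home_score, away_score in rows:
        year = int(str(date)[:4])  # assumes dates look like 'YYYY-MM-DD'
        if not start <= year <= end:
            continue
        home_score, away_score = int(home_score), int(away_score)
        # Each match counts once for the home side and once for the away side.
        teams.setdefault(home, TeamStats(home)).add_match(home_score, away_score)
        teams.setdefault(away, TeamStats(away)).add_match(away_score, home_score)
    return [[t.name, t.goals_for, t.goals_against] for t in teams.values()]


# The two sample rows from the question, as a dict of columns (assumed layout).
sample = {
    'date': ['1873-03-08', '1873-03-09'],
    'home_team': ['England', 'Scotland'],
    'away_team': ['Scotland', 'England'],
    'home_score': [0, 1],
    'away_score': [1, 0],
}
result = goal_totals(sample, 1873, 1873)
```

For the two sample rows, `result` is `[['England', 0, 2], ['Scotland', 2, 0]]`. Keeping the teams in a dict keyed by name avoids searching a list for each match, and the extra counters cost nothing if only the goal totals are needed.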

