python - Adding a column to a df using a function
Question
                  Date             Visitor  V_PTS                 Home  H_PTS
0  2012-10-30 19:00:00  Washington Wizards     84  Cleveland Cavaliers     94
1  2012-10-30 19:30:00    Dallas Mavericks     99   Los Angeles Lakers     91
2  2012-10-30 20:00:00      Boston Celtics    107           Miami Heat    120
3  2012-10-31 19:00:00    Sacramento Kings     87        Chicago Bulls     93
4  2012-10-31 19:30:00     Houston Rockets    105      Detroit Pistons     96
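I'm trying to add to a scraped dataset so I can analyze NBA game attendance. I want to add some columns, such as the arena each game was played in and its capacity. Below is the function I wrote to add the arena. Is there a better way? The date is stored as a datetime, so how would I correctly extract the year in order to assign the right arena to teams that have built a new arena in the past few years (the Sacramento Kings)? Also, is there a way to add the arena capacity at the same time and kill two birds with one stone, rather than writing another function?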
def label_arena(hometeam):
    if hometeam == 'Toronto Raptors':
        return 'Air Canada Centre'
    if hometeam == 'Miami Heat':
        return 'American Airlines Arena'
    if hometeam == 'Dallas Mavericks':
        return 'American Airlines Center'
    if hometeam == 'Orlando Magic':
        return 'Amway Center'
    if hometeam == 'San Antonio Spurs':
        return 'AT&T Center'
    if hometeam == 'Indiana Pacers':
        return 'Bankers Life Fieldhouse'
    if hometeam == 'Brooklyn Nets':
        return 'Barclays Center'
    if hometeam == 'Milwaukee Bucks':
        return 'Bradley Center'
    if hometeam == 'Washington Wizards':
        return 'Capital One Arena'
    if hometeam == 'Oklahoma City Thunder':
        return 'Chesapeake Energy Arena'
    if hometeam == 'Memphis Grizzlies':
        return 'FedExForum'
    # Not working as written: df['Date'] is the whole datetime column, not
    # this row's date, and it is compared against a bare integer year.
    if hometeam == 'Sacramento Kings' and df['Date'] < 2016:
        return 'Sleep Train Arena'
    if hometeam == 'Sacramento Kings' and df['Date'] > 2016:
        return 'Golden 1 Center'
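Solution
Here is one way to simplify your logic: map the home team to its arena with a dictionary, then handle the teams that changed arenas separately, using .dt.year to extract the year from the datetime column.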
import pandas as pd

df = pd.DataFrame({'Date': ['2012-10-30', '2012-10-30', '2012-10-30',
                            '2012-10-31', '2017-10-31'],
                   'Home': ['Toronto Raptors', 'Los Angeles Lakers', 'Miami Heat',
                            'Sacramento Kings', 'Sacramento Kings']})

df['Date'] = pd.to_datetime(df['Date'])

d = {'Toronto Raptors': 'Air Canada Centre',
     'Los Angeles Lakers': 'Staples Center',
     'Miami Heat': 'American Airlines Arena'}

# general criteria
df['Arena'] = df['Home'].map(d)

# custom criteria
df.loc[(df['Home'] == 'Sacramento Kings') &
       (df['Date'].dt.year < 2016), 'Arena'] = 'Sleep Train Arena'
df.loc[(df['Home'] == 'Sacramento Kings') &
       (df['Date'].dt.year >= 2016), 'Arena'] = 'Golden 1 Center'

print(df)
        Date                Home                    Arena
0 2012-10-30     Toronto Raptors        Air Canada Centre
1 2012-10-30  Los Angeles Lakers           Staples Center
2 2012-10-30          Miami Heat  American Airlines Arena
3 2012-10-31    Sacramento Kings        Sleep Train Arena
4 2017-10-31    Sacramento Kings          Golden 1 Center
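The question also asks about adding capacity in the same step, which the code above does not cover. One possible approach, offered only as a sketch, is to keep arena name, capacity, and the years each arena was in use in a small lookup table and merge it onto the games. The names games, arenas, FirstYear, and LastYear are made up for this illustration, the 1900/9999 bounds are just sentinels for "no limit", and the Capacity values are placeholders rather than real seating figures. Like the answer above, it splits on calendar year, which may not match season boundaries exactly.

```python
import pandas as pd

games = pd.DataFrame({
    'Date': pd.to_datetime(['2012-10-30', '2012-10-31', '2017-10-31']),
    'Home': ['Miami Heat', 'Sacramento Kings', 'Sacramento Kings'],
})

# Hypothetical lookup table: one row per (team, arena).
# Capacity values are PLACEHOLDERS, not real figures.
arenas = pd.DataFrame({
    'Home': ['Miami Heat', 'Sacramento Kings', 'Sacramento Kings'],
    'Arena': ['American Airlines Arena', 'Sleep Train Arena', 'Golden 1 Center'],
    'Capacity': [1000, 2000, 3000],
    'FirstYear': [1900, 1900, 2016],   # first calendar year the arena applies
    'LastYear': [9999, 2015, 9999],    # last calendar year the arena applies
})

# The left merge yields one row per (game, candidate arena) for each team.
merged = games.merge(arenas, on='Home', how='left')

# Keep the candidate whose year range contains the game's year; also keep
# games whose team is missing from the lookup (their arena columns are NaN).
year = merged['Date'].dt.year
in_range = (year >= merged['FirstYear']) & (year <= merged['LastYear'])
unmatched = merged['FirstYear'].isna()

result = (merged[in_range | unmatched]
          .drop(columns=['FirstYear', 'LastYear'])
          .reset_index(drop=True))
print(result)
```

With this layout, adding another team that moved only means adding rows to the lookup table, and both Arena and Capacity arrive in a single merge.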