python - Pandas left join returns a larger result than expected
Problem description
I have two DataFrames. The first, `station_anal`, looks like this:
count Start station number
index
31623 17105 31623
31258 11432 31258
31201 10194 31201
31200 9505 31200
31247 9145 31247
The second DataFrame, `vt`, is:
Start station number Start station
0 31214 17th & Corcoran St NW
1 31104 Adams Mill & Columbia Rd NW
2 31221 18th & M St NW
3 31111 10th & U St NW
4 31260 23rd & E St NW
station_anal has shape 486x2
vt has shape 8000x2
My left join command is:
lj = pd.merge(station_anal, vt, how = 'left', on = 'Start station number')
The dtypes of the two key columns are the same, namely int64.
But lj returns:
lj.head()
count Start station number Start station
0 17105 31623 Columbus Circle / Union Station
1 17105 31623 Columbus Circle / Union Station
2 17105 31623 Columbus Circle / Union Station
3 17105 31623 Columbus Circle / Union Station
4 17105 31623 Columbus Circle / Union Station
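with shape 8000x3.
This makes no sense to me: my understanding is that the result of a left join always has as many rows as the first DataFrame, which here is 486.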
Solution
Let's use map:
station_anal['Start station'] = station_anal['Start station number']\
.map(vt.set_index('Start station number')['Start station'])
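A lookup Series built with set_index is only unambiguous when its index has no repeated values, so it is worth checking the key column first. The following is a minimal sketch on made-up data: `vt_demo` and the name 'Placeholder name' are illustrative stand-ins, not rows from the real `vt`.

```python
import pandas as pd

# Toy stand-in for vt: station 31623 appears three times (illustrative rows).
vt_demo = pd.DataFrame({
    'Start station number': [31623, 31623, 31623, 31258],
    'Start station': ['Columbus Circle / Union Station'] * 3
                     + ['Placeholder name'],
})

keys = vt_demo['Start station number']

# True only if every station number occurs exactly once.
keys_unique = keys.is_unique

# Number of rows that repeat an earlier station number.
n_repeated_rows = int(keys.duplicated().sum())

print(keys_unique, n_repeated_rows)
```

If the key column turns out not to be unique, the lookup table needs to be de-duplicated before it is used for mapping or joining.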
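Update: vt evidently contains repeated station numbers (which is why the merge grows to 8000 rows), so drop the duplicates first and then map: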
mapper = vt.drop_duplicates('Start station number')\
.set_index('Start station number')['Start station']
station_anal['Start Station'] = station_anal['Start station number']\
.map(mapper)
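To see why the original merge returned more rows than `station_anal` has, here is a small sketch, assuming the cause is repeated station numbers in the right-hand frame. The frames `left` and `right` and the name 'Placeholder name' are made up for illustration. A left join keeps every left row, but each left row is repeated once per matching right row, so the result only has the left frame's row count when the right-hand keys are unique.

```python
import pandas as pd

# Toy stand-ins for the two frames in the question (illustrative values).
left = pd.DataFrame({
    'count': [17105, 11432],
    'Start station number': [31623, 31258],
})
right = pd.DataFrame({
    'Start station number': [31623, 31623, 31623, 31258],
    'Start station': ['Columbus Circle / Union Station'] * 3
                     + ['Placeholder name'],
})

# Left join against duplicated keys: one output row per matching pair.
expanded = pd.merge(left, right, how='left', on='Start station number')

# De-duplicate the lookup table first. validate='m:1' asks pandas to raise
# a MergeError if the right-hand keys are still not unique.
lookup = right.drop_duplicates('Start station number')
joined = pd.merge(left, lookup, how='left', on='Start station number',
                  validate='m:1')

print(len(expanded), len(joined))
```

With the de-duplicated lookup table, the merge and the map approach give the same station names, and the row count stays equal to that of the left frame.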