python - Pandas map - ValueError: Length mismatch
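Problem description
I have two CSVs loaded in memory as DataFrames: df1 and df2.
df1 has a column "OOSCUSTID"; df2 has a column "FORCUSTID".
For each row in df1:
where the OOSCUSTID value in df1 equals the FORCUSTID value in df2, take the value from df2['KKLM'] and store it in df1['FOREIGN-KKLM'].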
df1:
   NO.  OOSCUSTID  # TRADES  AVG PROFIT/LOSS
648500        -17       103       1305914.12
648483        -16       103       1305914.12
648502        -15       103       1305914.12
df2:
   NO.  FORCUSTID   KKLM  AVG PROFIT/LOSS
648495          0      6       1305914.12
648500        -17      3       1305914.12
648483        -16      5       1305914.12
648502        -15      6       1305914.12
648484        -14      7       1305914.12
648482        -13      8       1305914.12
648501        -12  20.34       1305914.12
648486         -9   4534       1305914.12
648487         -8    103       1305914.12
The code below raises this error:
ValueError: Length mismatch: Expected 9 rows, received array of length 1
checkstats = ["FOREIGN-KKLM"]
c = ["KKLM"]
ooscolfor = ["FORCUSTID"]
ooscolmain = ["OOSCUSTID"]
df1[checkstats] = df2.set_index([ooscolfor])[c].reindex(df1[ooscolmain]).array
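A likely cause of the error: ooscolfor is already a list, so set_index([ooscolfor]) receives a list nested inside a list. pandas then treats the inner list as an array of index values (length 1) rather than as a column label, and that length does not match the 9 rows of df2. A minimal sketch of the same lookup with plain string column names and Series.map follows. The frames are rebuilt from the sample data above with only the relevant columns, and the variable name lookup is illustrative.

```python
import pandas as pd

# Sample data from the question, trimmed to the columns used here.
df1 = pd.DataFrame({
    "NO.": [648500, 648483, 648502],
    "OOSCUSTID": [-17, -16, -15],
})
df2 = pd.DataFrame({
    "FORCUSTID": [0, -17, -16, -15, -14, -13, -12, -9, -8],
    "KKLM": [6, 3, 5, 6, 7, 8, 20.34, 4534, 103],
})

# Column names are plain strings, not one-element lists.
lookup = df2.set_index("FORCUSTID")["KKLM"]  # Series: FORCUSTID -> KKLM

# map looks up each OOSCUSTID in the Series index; unmatched keys give NaN.
df1["FOREIGN-KKLM"] = df1["OOSCUSTID"].map(lookup)
print(df1)
```

This approach expects the FORCUSTID values in df2 to be unique, since each key has to resolve to a single KKLM value.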
Edit 2: I modified df1 and df2 and used this code:
df1['FOREIGN-KKLM'] = df1.merge(df2, left_on='OOSCUSTID',
right_on='FORCUSTID')['KKLM']
This produces an inconsistent result: row 3 should be NaN and row 4 should be 4534:
      NO.  OOSCUSTID  # TRADES  AVG PROFIT/LOSS  FOREIGN-KKLM
0  648500        -17       103       1305914.12           3.0
1  648483        -16       103       1305914.12           5.0
2  648502        -15       103       1305914.12           6.0
3     545          4        44            44.00        4534.0
4      22         -9        22            22.00           NaN
Modified DataFrames:
df1:
   NO.  OOSCUSTID  # TRADES  AVG PROFIT/LOSS
648500        -17       103       1305914.12
648483        -16       103       1305914.12
648502        -15       103       1305914.12
   545          4        44               44
    22         -9        22               22
df2:
   NO.  FORCUSTID   KKLM  AVG PROFIT/LOSS
648495          0      6       1305914.12
648500        -17      3       1305914.12
648483        -16      5       1305914.12
648502        -15      6       1305914.12
648484        -14      7       1305914.12
648482        -13      8       1305914.12
648501        -12  20.34       1305914.12
648486         -9   4534       1305914.12
648487         -8    103       1305914.12
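The mismatch in Edit 2 most likely comes from two things acting together. merge defaults to how='inner', so the df1 row with OOSCUSTID 4 (which has no match in df2) is dropped and the result is renumbered from 0. Assigning that column back to df1 then aligns on index labels, so 4534 lands on label 3 and label 4 is left as NaN. The sketch below reproduces this with the modified data, trimmed to the relevant columns; the name inner is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "NO.": [648500, 648483, 648502, 545, 22],
    "OOSCUSTID": [-17, -16, -15, 4, -9],
})
df2 = pd.DataFrame({
    "FORCUSTID": [0, -17, -16, -15, -14, -13, -12, -9, -8],
    "KKLM": [6, 3, 5, 6, 7, 8, 20.34, 4534, 103],
})

# Default how="inner": only keys present in both frames survive.
inner = df1.merge(df2, left_on="OOSCUSTID", right_on="FORCUSTID")

print(len(df1), len(inner))          # the inner merge has fewer rows than df1
print(inner[["OOSCUSTID", "KKLM"]])  # fresh index, no row for OOSCUSTID 4

# Assignment aligns on index labels, not on OOSCUSTID, so values shift up.
df1["FOREIGN-KKLM"] = inner["KKLM"]
print(df1)
```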
Solution
Use merge() with how='left', so that every row of df1 is kept and rows without a match in df2 get NaN:
df1['FOREIGN-KKLM'] = df1.merge(df2, left_on='OOSCUSTID',
right_on='FORCUSTID',
how='left')['KKLM']
print(df1)
      NO.  OOSCUSTID  FOREIGN-KKLM
0  648500        -17           3.0
1  648483        -16           5.0
2  648502        -15           6.0
3     545          4           NaN
4      22         -9        4534.0
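For a self-contained check of the left merge, the sketch below rebuilds the modified frames with only the relevant columns. Two details go beyond the answer above and are assumptions rather than part of it: validate='many_to_one' asks pandas to raise if FORCUSTID is duplicated in df2 (duplicates would add rows and break the alignment), and .to_numpy() assigns by position so the result does not depend on df1 having a default 0-based index. The name merged is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "NO.": [648500, 648483, 648502, 545, 22],
    "OOSCUSTID": [-17, -16, -15, 4, -9],
})
df2 = pd.DataFrame({
    "FORCUSTID": [0, -17, -16, -15, -14, -13, -12, -9, -8],
    "KKLM": [6, 3, 5, 6, 7, 8, 20.34, 4534, 103],
})

merged = df1.merge(
    df2[["FORCUSTID", "KKLM"]],
    left_on="OOSCUSTID",
    right_on="FORCUSTID",
    how="left",              # keep every df1 row, in df1's order
    validate="many_to_one",  # assumption: FORCUSTID should be unique in df2
)

# Positional assignment: the left merge yields one row per df1 row, same order.
df1["FOREIGN-KKLM"] = merged["KKLM"].to_numpy()
print(df1)
```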