Fetching DataFrame values using row index and column names from a Series

Question

I have a pandas DataFrame, diff:

diff
Out[100]: 
         p0_4       p5_7       p8_9      p10_14        p15     p16_17  \
0     0.24337  -0.370463     0.3706 -0.00874218  -0.121252   0.104316   
1   -0.168109   0.242252 -0.0203263    0.128852  -0.232532   0.285443   
2      0.4673  -0.219911   0.127271    0.263905   0.369508  -0.393832   
3    0.341078  -0.440877   0.104191    0.498076   0.376636  0.0629645   
4   -0.317727   0.460738  -0.190723     0.49734   0.123088  -0.220683   
5   0.0754969  -0.170285  -0.311576   -0.430475   0.155444   0.475948   
6   -0.357219  -0.109822   0.193669    0.408703 -0.0321149   0.434733   
7    0.111809   0.377264  -0.495888   0.0323004  -0.266074  -0.189091   
8   -0.344382   0.422581  -0.180323   -0.483758   0.300689  -0.257307   
9    0.216073   0.201361  -0.460732    0.470577  0.0430313    0.23239   
10  -0.132846  -0.201377   0.262351    0.407778   0.260918  -0.385248   
11  -0.134866  -0.273007   0.229133     0.45127   0.128607  -0.473758   
12   0.455207 -0.0761055   0.103605    0.388977  -0.221584  -0.147312   

       p18_19     p20_24     p25_29     p30_44     p45_59    p60_64  \
0    0.496822  -0.101924   0.327643  -0.154493   0.250398  0.181304   
1    0.206437  -0.324619  0.0746302    0.41683   0.493975 -0.400399   
2   -0.399293   0.448077  0.0658686  -0.443242  -0.428764 -0.426445   
3   -0.404962    0.11792   0.033909  -0.329837  -0.236263 -0.377642   
4    0.444808  -0.239306  0.0750415   0.432904  -0.497133  0.357713   
5    0.251119  -0.103679  0.0216501   0.488226 -0.0499458 -0.186201   
6   -0.342739   -0.41421   0.270395  -0.124238  -0.497355  0.368779   
7    0.207716 -0.0974368  -0.134803  -0.379455  -0.369576  0.473811   
8   0.0780937  -0.436464  -0.394125   -0.47593   0.232315  0.362133   
9    -0.15832 -0.0207144   0.153792 -0.0770877   0.204598 -0.369145   
10   0.381227 -0.0933909  0.0629903  -0.447975  -0.377035 -0.452033   
11   0.242461  0.0371317  -0.166247  -0.446032  -0.330998   0.26982   
12  0.0813222   0.411294   0.416006  -0.186285   0.396582 -0.242761   

       p65_74     p75_84      p85_89    p90plus  
0   -0.104913   0.393034   -0.106004  -0.399696  
1   -0.432464  -0.255217   -0.471858   0.457105  
2   -0.457699   0.368346  0.00154132  -0.342632  
3    0.336074  -0.431123  0.00594774   0.343908  
4    0.113632  -0.402119   -0.459227  -0.178347  
5    -0.22454   0.357067   0.0985315  -0.446782  
6   0.0974457   0.337944  -0.0866055  -0.147366  
7    0.438395  0.0448273    0.150045  0.0961568  
8   0.0613137  -0.177715    0.168945   0.123933  
9   -0.142751  -0.134487   -0.137383 -0.0212019  
10 -0.0246049  -0.344517   -0.209677    0.29344  
11  -0.131337   0.144234   -0.273662  -0.272751  
12   0.451114  -0.462659   -0.264847  -0.102552  

One of the columns is labeled "p45_59", and the row index runs from 0 to 12.

I also have a Series object containing the following information:

2     p45_59
8     p45_59
10    p45_59
11    p45_59
dtype: object

Is there a way to get the value of diff at index 2, column p45_59; at index 8, column p45_59; and so on, so that I end up with:

2     p45_59  -0.428764
8     p45_59   0.232315
10    p45_59  -0.377035
11    p45_59  -0.330998

Thanks in advance.

Tags: python, pandas, dataframe, object, series

Answer


Use DataFrame.lookup together with Series.to_frame, and DataFrame.assign for the new column (df here is the question's diff, and s is the Series):

print(df.lookup(s.index, s))
[-0.428764  0.232315 -0.377035 -0.330998]

# Store the result under a new name so the original df is not overwritten
out = s.to_frame('a').assign(b=df.lookup(s.index, s))
print(out)
         a         b
2   p45_59 -0.428764
8   p45_59  0.232315
10  p45_59 -0.377035
11  p45_59 -0.330998
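
DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippet above only runs on older versions. The following is a minimal sketch of one possible replacement, assuming every label in s exists in df; it translates the labels to integer positions with Index.get_indexer and indexes the underlying NumPy array. The small frame is a stand-in built from a few of the question's values, and the names rows, cols and result are illustrative only.

```python
import pandas as pd

# Stand-in for the question's frame: a few rows and two columns copied from the post.
df = pd.DataFrame(
    {
        "p45_59": [0.250398, -0.428764, 0.232315, -0.377035, -0.330998],
        "p60_64": [0.181304, -0.426445, 0.362133, -0.452033, 0.26982],
    },
    index=[0, 2, 8, 10, 11],
)
# Row label -> column label pairs to look up.
s = pd.Series("p45_59", index=[2, 8, 10, 11])

# Translate labels into integer positions; get_indexer returns -1 for a missing label.
rows = df.index.get_indexer(s.index)
cols = df.columns.get_indexer(s.to_numpy())
if (rows < 0).any() or (cols < 0).any():
    # Without this check, -1 would silently select the last row or column.
    raise KeyError("some row or column labels in s are not present in df")

# Pairwise (row, column) selection on the underlying array.
values = df.to_numpy()[rows, cols]
result = s.to_frame("a").assign(b=values)
print(result)
```

Because get_indexer reports a missing label as -1 rather than raising, the explicit check matters: plain NumPy indexing would otherwise return a value from the wrong cell.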

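The solution above fails if some values in the Series have no match in the DataFrame. An alternative that handles this case uses DataFrame.join with DataFrame.stack: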

print (s)
2     p45_59
8     p45_59
10    p45_59
11       val <- no match
dtype: object

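# Turn the Series into a two-column frame: row label and column label
df1 = s.reset_index()
df1.columns = ['idx','a']

# df.stack() is a Series keyed by (row label, column label); join on both keys
df1 = df1.join(df.stack().rename('b'), on=['idx','a'])
print(df1)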

   idx       a         b
0    2  p45_59 -0.428764
1    8  p45_59  0.232315
2   10  p45_59 -0.377035
3   11     val       NaN
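
For reference, the join/stack approach can be put together as a self-contained sketch. The frame below is a small stand-in using a few of the question's values, and the 'val' label is the deliberately non-matching entry from the example above; the intermediate name stacked is illustrative only.

```python
import pandas as pd

# Stand-in for the question's frame: a few rows and two columns copied from the post.
df = pd.DataFrame(
    {
        "p45_59": [0.250398, -0.428764, 0.232315, -0.377035, -0.330998],
        "p60_64": [0.181304, -0.426445, 0.362133, -0.452033, 0.26982],
    },
    index=[0, 2, 8, 10, 11],
)
# The last entry names a column that does not exist in df.
s = pd.Series(["p45_59", "p45_59", "p45_59", "val"], index=[2, 8, 10, 11])

# Two-column frame of (row label, column label) pairs.
df1 = s.rename_axis("idx").reset_index(name="a")

# Long-format values keyed by a (row label, column label) MultiIndex.
stacked = df.stack().rename("b")

# Left join on both keys: pairs without a match get NaN instead of raising.
df1 = df1.join(stacked, on=["idx", "a"])
print(df1)
```

Since DataFrame.join defaults to a left join, every row of df1 is kept and unmatched pairs simply show up as NaN in column b.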
