Slicing dataframe by comparing all values of one column to all values of column of another dataframe

Problem description

I have a question about slicing dataframes. I have two dataframes: halo_field, with index values 3447, 4024...

           H_masa  N_subs      ...                 H_z             rh
3447  1.066437e+11       1      ...        88419.632812  160354.430049
4024  4.423280e+11       1      ...        49013.289062   65239.433084
4958  3.171903e+11       1      ...        23239.701172   48248.401956
5749  2.817211e+11       1      ...        46585.765625   65032.216212
6512  2.471275e+11       1      ...        93403.398438  123058.838527

I also have a dataframe subhalo. One of its columns, 'halo_index', indexes into the dataframe halo, of which halo_field is a slice (which is why halo_field has those index values). This is a printout of subhalo.halo_index (the values are in the right-hand column):

0                0
1                0
2                0
3                0
4                0
            ...   
4366516    7713551
4366517    7713552
4366518    7713553

I would like to slice the subhalo dataframe into a dataframe subhalo_field that contains only the rows whose halo_index value also appears in halo_field.index. The problem is that the two have different lengths, so I can't do it like this (this compares row to row, rather than comparing every value of one column against all values of the other):

subhalo_field=subhalo[subhalo.halo_index==halo_field.index].copy()

I get this error:

File "group_sh.py", line 139, in <module>
subhalo_field=subhalo[subhalo.halo_index==halo_field.index].copy()
File "/usr/local/lib/python2.7/dist-packages/pandas/core/ops.py", line 1223, in wrapper
raise ValueError('Lengths must match to compare')
ValueError: Lengths must match to compare

How can I slice my subhalo dataframe so that subhalo.halo_index is compared against halo_field.index, and only the subhalos whose halo_index matches a value in halo_field.index are copied into subhalo_field?

Tags: python, pandas, dataframe, slice

Solution


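If I understand you correctly, a merge on the index of halo_field and the halo_index column of subhalo may be what you are looking for (this defaults to inner-join behavior):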

halo_field.merge(subhalo, left_index=True, right_index=False, right_on='halo_index')
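
To see what the merge returns, here is a minimal sketch on toy data. The frames below are made-up stand-ins for the real ones, and the sub_mass column is hypothetical. It also shows an alternative that is not part of the answer above: a boolean mask built with Series.isin, which may be closer to what the question asks for, since it keeps only subhalo's own columns and row labels.

```python
import pandas as pd

# Toy stand-ins for the real frames; all values are invented for illustration.
halo_field = pd.DataFrame(
    {"H_masa": [1.0e11, 4.4e11, 3.2e11]},
    index=[3447, 4024, 4958],
)
subhalo = pd.DataFrame(
    {
        "halo_index": [0, 3447, 3447, 4024, 5000, 4958],
        "sub_mass": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # hypothetical column
    }
)

# Option 1: the merge from the answer. It is an inner join by default,
# and the result carries the columns of both frames.
merged = halo_field.merge(subhalo, left_index=True, right_on="halo_index")

# Option 2 (alternative, not from the answer): test each halo_index value
# for membership in halo_field.index, then filter subhalo with the mask.
mask = subhalo["halo_index"].isin(halo_field.index)
subhalo_field = subhalo[mask].copy()

print(merged)
print(subhalo_field)
```

Unlike ==, which compares element by element and therefore needs equal lengths, isin checks every value of the column against the whole index, so the two can differ in length.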
