Python, Pandas: Using isin()-like functionality without ignoring duplicates in the input list

Problem description

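I am trying to filter an input dataframe (df_in) against a list of indices. The list of indices contains duplicates, and I want my output df_out to contain every occurrence of a given index. As expected, isin() gives me only a single row per index.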

How can I keep the duplicates instead of ignoring them and get an output like df_out_desired?

import pandas as pd
import numpy as np

df_in = pd.DataFrame(index=np.arange(4), data={'A':[1,2,3,4],'B':[10,20,30,40]})

indices_needed_list = pd.Series([1,2,3,3,3])

# In the output df, I do not particularly care about the 'index' from the df_in
df_out = df_in[df_in.index.isin(indices_needed_list)].reset_index()
# With isin, as expected, I get only a single row per index, however often it occurs in indices_needed_list

# What I want is an output df in which each df_in index appears as many times as it does in indices_needed_list
temp = df_out[df_out['index'] == 3]

# This is what I would like to try and get
df_out_desired = pd.concat([df_out, df_out[df_out['index']==3], df_out[df_out['index']==3]])

Thanks!

Tags: python, pandas, numpy

Solution


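Take a look at reindex. Given a list of labels, it returns one row per label, so repeated labels produce repeated rows: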

df_out_desired = df_in.reindex(indices_needed_list)
df_out_desired 
Out[177]: 
   A   B
1  2  20
2  3  30
3  4  40
3  4  40
3  4  40
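One thing to watch for: reindex fills rows with NaN for any label that is not in df_in, whereas isin() would simply drop it. If the list may contain such labels, a minimal sketch of one way to handle it is to filter the list with isin() first and then select with .loc, which also returns one row per requested label. The name wanted and the missing label 7 are made up for this illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

df_in = pd.DataFrame(index=np.arange(4), data={'A': [1, 2, 3, 4], 'B': [10, 20, 30, 40]})

# Hypothetical request list: 3 is repeated, and 7 does not exist in df_in.
wanted = pd.Series([1, 2, 3, 3, 3, 7])

# reindex keeps the unknown label and fills its row with NaN.
with_nan = df_in.reindex(wanted.tolist())

# Keep only the labels that exist in df_in, preserving order and duplicates.
present = wanted[wanted.isin(df_in.index)].tolist()

# .loc with a list of labels returns one row per label, duplicates included.
df_out = df_in.loc[present].reset_index()
print(df_out)
```

Both reindex and this .loc variant assume that the index of df_in itself is unique.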
