How to cross-check two columns of different lengths between dataframes and create a new dataframe?

Problem description

I have two dataframes,

df1:

pno | bno | report
1 | 12 | somereport2.pdf
11 | 12 | somereporter.pdf
12 | 12 | somereportf.pdf
11 | 12 | somereportwee.pdf
1 | 12 | somereport22.pdf
11 | 12 | somereport22.pdf

df2:

pno
11
12

I want to create a new df based on the pno column of df1 and df2, so that df3 is:

pno | bno | report
11 | 12 | somereporter.pdf
12 | 12 | somereportf.pdf
11 | 12 | somereportwee.pdf
11 | 12 | somereport22.pdf

That is, the new df should contain only the rows whose pno value appears in df2's pno column. I tried using the merge function:

newdf = pd.merge(df1, df2, how="inner", on=["pno","pno"])

But it produced an unexpected shape with a lot of missing values. I then tried a left join:

newdf = pd.merge(df1, df2, how="left", on=["pno","pno"])

But it kept all the values without cross-checking.

Is there a way to cross-check one column against another and keep only the matching rows in the new df?

Tags: python, pandas, dataframe

Solution


Use isin to build a boolean mask and keep only the rows you want:

df1[df1['pno'].isin(df2['pno'])]
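
For reference, here is a self-contained sketch that rebuilds the two dataframes from the question and applies the isin filter. The names mask, df3 and merged are only illustrative, and the sketch assumes both pno columns hold the same dtype (integers here); if one side held strings, isin would likely match nothing.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "pno": [1, 11, 12, 11, 1, 11],
    "bno": [12, 12, 12, 12, 12, 12],
    "report": [
        "somereport2.pdf", "somereporter.pdf", "somereportf.pdf",
        "somereportwee.pdf", "somereport22.pdf", "somereport22.pdf",
    ],
})
df2 = pd.DataFrame({"pno": [11, 12]})

# Boolean Series: True where df1's pno value appears anywhere in df2's pno.
mask = df1["pno"].isin(df2["pno"])

# Keep only the matching rows; reset_index is optional and just renumbers them.
df3 = df1[mask].reset_index(drop=True)
print(df3)

# For comparison, an inner merge on the single key "pno" selects the same rows.
# drop_duplicates guards against repeated keys in df2 multiplying rows.
merged = pd.merge(df1, df2.drop_duplicates(), how="inner", on="pno")
```

Because isin only tests membership, the two columns do not need to have the same length, and duplicate values in df2 do not duplicate rows in the result.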
