Python Pandas: Extend operation of a column if a condition matches

Problem description

I have two DataFrames:

import pandas as pd

firstDF = pd.DataFrame([{'mac':1,'location':['kitchen']}])
predictedDF = pd.DataFrame([{'mac':1,'location':['lab']}])

If a mac value in predictedDF also appears in the mac column of firstDF, the location list in firstDF should be extended with the location list from predictedDF, so that firstDF becomes:

firstDF
      mac    location
0     1      ['kitchen','lab']

I have tried:

firstDF.loc[firstDF['mac'] == predictedDF['mac'], 'mac'] = firstDF.loc[firstDF['location'].extend(predictedDF['location']), 'location']

but this raises:

AttributeError: 'Series' object has no attribute 'extend'

Tags: python-3.x, pandas

Solution


If the location columns contain lists, first use DataFrame.merge to combine the two DataFrames on mac. Then concatenate the lists with +, using DataFrame.pop to extract each suffixed column and drop it in the same step:

df = firstDF.merge(predictedDF, on='mac', how='left')
df['location'] = df.pop('location_x') + df.pop('location_y')
print (df)
   mac        location
0    1  [kitchen, lab]
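
The question asks for firstDF itself to end up with the extended lists, so a variant that skips the merge may also be useful. The following is only a sketch, assuming each mac appears at most once in predictedDF; the name lookup is introduced here for illustration and is not from the original answer.

```python
import pandas as pd

firstDF = pd.DataFrame([{'mac': 1, 'location': ['kitchen']}])
predictedDF = pd.DataFrame([{'mac': 1, 'location': ['lab']}])

# Plain dict from mac to predicted location list
# (assumption: mac values are unique in predictedDF).
lookup = dict(zip(predictedDF['mac'], predictedDF['location']))

# Build a new list per row; rows without a match keep their own list.
firstDF['location'] = [
    loc + lookup.get(mac, [])
    for mac, loc in zip(firstDF['mac'], firstDF['location'])
]
print(firstDF)
```

Using + rather than list.extend builds a new list for each row, so the lists stored in predictedDF are left untouched.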

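Test with more rows. A left merge leaves NaN in location_y for any mac that has no match in predictedDF, so replace missing values with [] before concatenating (NaN is the only value for which x == x is False):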

firstDF = pd.DataFrame({'mac':[1, 2],'location':[['kitchen'],['kitchen']]})
predictedDF = pd.DataFrame([{'mac':1,'location':['lab']}])

df = firstDF.merge(predictedDF, on='mac', how='left').applymap(lambda x: x if x == x else [])
df['location'] = df.pop('location_x') + df.pop('location_y')
print (df)
   mac        location
0    1  [kitchen, lab]
1    2       [kitchen]
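
DataFrame.applymap is deprecated since pandas 2.1 in favor of DataFrame.map, so the missing-value step can instead be handled while the lists are joined. The sketch below wraps that in a function; extend_locations and the "_pred" suffix are hypothetical names chosen for this example, and it assumes mac is unique in predictedDF so the left merge does not duplicate rows.

```python
import pandas as pd

def extend_locations(first, predicted, key='mac', col='location'):
    """Hypothetical helper: append predicted lists to the lists in `first`."""
    # Keep the left column name as-is and suffix only the right one.
    merged = first.merge(predicted, on=key, how='left', suffixes=('', '_pred'))
    extra = merged.pop(col + '_pred')
    # Unmatched rows hold NaN (a float) in `extra`, so only join real lists.
    merged[col] = [
        base + add if isinstance(add, list) else list(base)
        for base, add in zip(merged[col], extra)
    ]
    return merged

firstDF = pd.DataFrame({'mac': [1, 2], 'location': [['kitchen'], ['kitchen']]})
predictedDF = pd.DataFrame([{'mac': 1, 'location': ['lab']}])

result = extend_locations(firstDF, predictedDF)
print(result)
```

The function returns a new DataFrame and leaves firstDF unchanged; assign the result back to firstDF if the update should replace the original.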
