How to insert a row into a DataFrame based on an if statement

Problem description

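I have a DataFrame df and I want to produce df1 (both shown below). Every ID value should be represented with both drink types (Beer and Wine). If an ID has no row for one of them, a row should be inserted with the missing drink type and "Not Stated" in the Drink column.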

df:    

ID    DrinkType    Drink

130   Beer         Fosters
130   Wine         Rose
130   Beer         Budweiser 
102   Beer         Fosters
120   Wine         Pinot Grigot
120   Beer         Budweiser 
99    Wine         Coke
75    Beer         Carling
75    Beer         Fosters


df1:    

ID    DrinkType    Drink

130   Beer         Fosters
130   Wine         Rose
130   Beer         Budweiser 
102   Beer         Fosters   
102   Wine         Not Stated
120   Wine         Pinot Grigot
120   Beer         Budweiser 
99    Wine         Coke   
99    Beer         Not Stated
75    Beer         Carling
75    Beer         Fosters
75    Wine         Not Stated

Tags: dataframe

Solution


I think this is the solution you need:

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID': [130, 130, 130, 102, 120, 120, 99, 75, 75],
    'DrinkType': ['Beer', 'Wine', 'Beer', 'Beer', 'Wine', 'Beer', 'Wine', 'Beer', 'Beer'],
    'Drink': ['Fosters', 'Rose', 'Budweiser', 'Fosters', 'Pinot Grigot',
              'Budweiser', 'Coke', 'Carling', 'Fosters'],
})

all_cate = ['Beer', 'Wine']  # every ID should have both drink types

# Collect one filler row for each (ID, DrinkType) pair that is missing
missing = []
for i in df['ID'].unique():
    present = set(df.loc[df['ID'] == i, 'DrinkType'])
    for j in all_cate:
        if j not in present:
            missing.append({'ID': i, 'DrinkType': j, 'Drink': 'Not Stated'})

# Append all filler rows in a single concat (they land at the end of the
# frame) and renumber the index
df = pd.concat([df, pd.DataFrame(missing)], ignore_index=True)
print(df)
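
The loop above appends the filler rows at the end of the frame, whereas df1 in the question shows each one directly after the other rows for its ID. Below is a minimal sketch of one way to get that ordering without an explicit loop. It assumes the columns are named ID, DrinkType and Drink as in the question and that pandas is version 1.1 or later (for the key argument of sort_values). The names drink_types, wanted, have, filler and order are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [130, 130, 130, 102, 120, 120, 99, 75, 75],
    "DrinkType": ["Beer", "Wine", "Beer", "Beer", "Wine", "Beer", "Wine", "Beer", "Beer"],
    "Drink": ["Fosters", "Rose", "Budweiser", "Fosters", "Pinot Grigot",
              "Budweiser", "Coke", "Carling", "Fosters"],
})

drink_types = ["Beer", "Wine"]  # assumption: these are the only required types

# Every (ID, DrinkType) pair that should exist
wanted = pd.MultiIndex.from_product(
    [df["ID"].unique(), drink_types], names=["ID", "DrinkType"]
)
# Pairs that already occur in df
have = pd.MultiIndex.from_frame(df[["ID", "DrinkType"]].drop_duplicates())

# One filler row per pair that is absent
filler = wanted.difference(have).to_frame(index=False).assign(Drink="Not Stated")

# Rank each ID by where it first appears, then stable-sort on that rank so
# the original rows keep their order and each filler follows its ID group
order = {id_: pos for pos, id_ in enumerate(df["ID"].unique())}
df1 = (
    pd.concat([df, filler], ignore_index=True)
    .sort_values("ID", key=lambda s: s.map(order), kind="stable")
    .reset_index(drop=True)
)
print(df1)
```

The stable sort is what keeps the existing rows in their original relative order. Because the fillers are concatenated after the real rows, each one ends up last within its ID group.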
