How to add to a pandas column based on another column

Problem description

I currently have a table that looks like this:

ID       Previous_Injuries    Currently_Injured      Injury_Type
1            Nan                      0                  Nan
1            Nan                      1                  Ankle
1            Nan                      0                  Nan
1            Nan                      1                  Wrist
1            Nan                      0                  Nan
1            Nan                      1                  Leg
1            Nan                      0                  Nan
2            Nan                      1                  Leg
2            Nan                      0                  Nan

I want to fill in the Previous_Injuries column so that my table looks like this:

ID       Previous_Injuries    Currently_Injured      Injury_Type
1            Nan                      0                  Nan
1            Nan                      1                  Ankle
1            [Ankle]                  0                  Nan
1            [Ankle]                  1                  Wrist
1            [Ankle,Wrist]            0                  Nan
1            [Ankle,Wrist]            1                  Leg
1            [Ankle,Wrist,Leg]        0                  Nan
2            Nan                      1                  Leg
2            [Leg]                    0                  Nan

How can I build this kind of column in pandas? Is a list the best form for it?

Thanks!

Tags: python, pandas, dataframe

Solution


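We can combine shift with cumsum and then split the resulting string: shift moves each injury down one row so a row only sees earlier injuries, cumsum concatenates the comma-terminated values into a running string, and split turns that string back into a list. Note that the placeholder you are using here is 'Nan' (a string), not np.nan.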

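# Running comma-joined string of the injuries on earlier rows, split into lists
s = df.Injury_Type.shift().fillna('Nan').add(',').cumsum().str[:-1].str.split(',')
# Drop the 'Nan' placeholders from each list
df['new'] = [[y for y in x if y != 'Nan'] for x in s]
df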
Out[322]: 
   ID Previous_Injuries  Currently_Injured Injury_Type                  new
0   1               Nan                  0         Nan                   []
1   1               Nan                  1       Ankle                   []
2   1               Nan                  0         Nan              [Ankle]
3   1               Nan                  1       Wrist              [Ankle]
4   1               Nan                  0         Nan       [Ankle, Wrist]
5   1               Nan                  1         Leg       [Ankle, Wrist]
6   1               Nan                  0         Nan  [Ankle, Wrist, Leg]
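
The note about 'Nan' being a string rather than np.nan can be checked with a small self-contained sketch. The frame below is typed in from the question's table (ID 1 rows only), and the variable names are illustrative assumptions, not part of the original answer. It also builds the same history column with a plain loop instead of string concatenation, which is a swapped-in technique rather than the answer's method.

```python
import pandas as pd

# Sample data typed in from the question (rows for ID 1 only).
df = pd.DataFrame({
    "ID": [1] * 7,
    "Currently_Injured": [0, 1, 0, 1, 0, 1, 0],
    "Injury_Type": ["Nan", "Ankle", "Nan", "Wrist", "Nan", "Leg", "Nan"],
})

# The placeholder is ordinary text, so pandas does not count it as missing.
n_missing_as_text = int(df["Injury_Type"].isna().sum())

# Masking the placeholder turns it into a real missing value.
injury = df["Injury_Type"].mask(df["Injury_Type"] == "Nan")
n_missing_as_nan = int(injury.isna().sum())

# Running history of injuries seen on earlier rows, built without string joins.
history = []
seen = []
for value in injury:
    history.append(list(seen))  # snapshot taken before this row's injury is added
    if pd.notna(value):
        seen.append(value)
df["new"] = history
```

With real missing values in place, the usual isna, fillna and dropna methods apply, and injury names that themselves contain a comma no longer break the split step.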

Since the question was changed again, here is a version that handles each ID separately:

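import pandas as pd

parts = []
for name, dfx in df.groupby('ID'):
    dfx = dfx.copy()  # work on a copy to avoid SettingWithCopyWarning
    # Shift within the group so each ID starts with an empty history
    s = dfx.Injury_Type.shift().fillna('Nan').add(',').cumsum().str[:-1].str.split(',')
    dfx['new'] = [[y for y in x if y != 'Nan'] for x in s]
    parts.append(dfx)

pd.concat(parts)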
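
As an alternative sketch of the per-ID version, the same result can be produced in a single pass that keeps one running list per ID, with no string concatenation and no concat step. The frame is typed in from the question's table, and the names seen_by_id and history are hypothetical. Rows with no earlier injury get an empty list here, as in the answer's output, rather than the 'Nan' shown in the question.

```python
import pandas as pd

# Sample data typed in from the question, including the rows for ID 2.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 1, 2, 2],
    "Currently_Injured": [0, 1, 0, 1, 0, 1, 0, 1, 0],
    "Injury_Type": ["Nan", "Ankle", "Nan", "Wrist", "Nan",
                    "Leg", "Nan", "Leg", "Nan"],
})

seen_by_id = {}  # ID -> injuries recorded so far for that ID
history = []     # one list per row, in row order

for id_, injury in zip(df["ID"], df["Injury_Type"]):
    seen = seen_by_id.setdefault(id_, [])
    history.append(list(seen))  # copy, so later appends do not change this row
    if injury != "Nan":         # 'Nan' is the string placeholder from the question
        seen.append(injury)

df["Previous_Injuries"] = history
```

Because the running list is looked up by ID on every row, this version does not require the rows of one ID to be adjacent, only that they appear in chronological order.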
