pandas: create a row for each element when a cell contains a list, NaN, or a string

Problem description

Hi, I have a df like the one below:

index a  b  c  d
0     xx aa av NaN
1     pp as ka [1,2,3,4]
2     pa aj q  1234
3     xq aq aq NaN
4     pn an kn [10,20,30,40]
5     px ax kx "00012" 

I would like to convert it into the following:

index a  b  c  d              d-separated
0     xx aa av NaN            NaN
1     pp as ka [1,2,3,4]      1
2     pp as ka [1,2,3,4]      2
3     pp as ka [1,2,3,4]      3
4     pp as ka [1,2,3,4]      4
5     pa aj q  1234           1234
6     xq aq aq NaN            NaN
7     pn an kn [10,20,30,40]  10
8     pn an kn [10,20,30,40]  20
9     pn an kn [10,20,30,40]  30
10    pn an kn [10,20,30,40]  40
11    px ax kx "00012"        "00012"

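I have looked at

pandas: When cell contents are lists, create a row for each element in the list, and

Split (explode) pandas dataframe string entry to separate rows

However, my case differs from theirs, so those solutions do not work on my example. Thank you for your help.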

Tags: python, python-3.x, pandas

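Solution

First expand the dataframe to the required size, repeating each row as many times as needed (once per list element, or once if the cell in d is not a list):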

df1 = df.loc[df.index.repeat([len(x) if isinstance(x,list) else 1 for x in df.d])]
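To make the repeat step easier to follow, here is a minimal, self-contained sketch. It rebuilds the example df from the question (the constructor call and the helper name counts are my own additions, not part of the original answer) and shows which index labels end up in df1.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "a": ["xx", "pp", "pa", "xq", "pn", "px"],
    "b": ["aa", "as", "aj", "aq", "an", "ax"],
    "c": ["av", "ka", "q", "aq", "kn", "kx"],
    "d": [np.nan, [1, 2, 3, 4], 1234, np.nan, [10, 20, 30, 40], "00012"],
})

# One repeat per list element; non-list cells (NaN, int, str) count once.
counts = [len(x) if isinstance(x, list) else 1 for x in df["d"]]

# Index.repeat duplicates each label counts[i] times; .loc then selects
# the corresponding rows, duplicates included.
df1 = df.loc[df.index.repeat(counts)]

print(counts)
print(df1.index.tolist())
```

Note that df1 still carries the original index labels with duplicates, which is why the answer calls reset_index(drop=True) before concatenating.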

Now flatten column d and concatenate it with the expanded frame df1 from above:

d_sep = pd.DataFrame({'d_Sep': sum([x if isinstance(x, list) else [x] for x in df.d], [])})

df2 = pd.concat([df1.reset_index(drop=True), d_sep], axis=1)

   a   b   c                 d  d_Sep
0   xx  aa  av               NaN    NaN
1   pp  as  ka      [1, 2, 3, 4]      1
2   pp  as  ka      [1, 2, 3, 4]      2
3   pp  as  ka      [1, 2, 3, 4]      3
4   pp  as  ka      [1, 2, 3, 4]      4
5   pa  aj   q              1234   1234
6   xq  aq  aq               NaN    NaN
7   pn  an  kn  [10, 20, 30, 40]     10
8   pn  an  kn  [10, 20, 30, 40]     20
9   pn  an  kn  [10, 20, 30, 40]     30
10  pn  an  kn  [10, 20, 30, 40]     40
11  px  ax  kx             00012  00012
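As a possible alternative, not part of the original answer: on pandas 0.25 or later, DataFrame.explode should give the same table in one step, because it expands list-like cells and leaves scalars, strings, and NaN untouched. The sketch below assumes the same example df as in the question; the name out is hypothetical.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question (assumed construction).
df = pd.DataFrame({
    "a": ["xx", "pp", "pa", "xq", "pn", "px"],
    "b": ["aa", "as", "aj", "aq", "an", "ax"],
    "c": ["av", "ka", "q", "aq", "kn", "kx"],
    "d": [np.nan, [1, 2, 3, 4], 1234, np.nan, [10, 20, 30, 40], "00012"],
})

# Copy d into a new column, then explode only the copy so that the
# original d column keeps the full list on every repeated row.
out = (
    df.assign(d_Sep=df["d"])
      .explode("d_Sep")
      .reset_index(drop=True)
)

print(out)
```

One difference to be aware of: for an empty list in d, explode keeps the row with NaN in d_Sep, whereas the repeat-based approach above drops that row because len([]) is 0.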
