python - pandas: create a row for each element when cell contents are a list / NaN / string
Question
Hi, I have a df like the one below:
index a b c d
0 xx aa av NaN
1 pp as ka [1,2,3,4]
2 pa aj q 1234
3 xq aq aq NaN
4 pn an kn [10,20,30,40]
5 px ax kx "00012"
I want to convert it into the following:
index a b c d d-separated
0 xx aa av NaN NaN
1 pp as ka [1,2,3,4] 1
2 pp as ka [1,2,3,4] 2
3 pp as ka [1,2,3,4] 3
4 pp as ka [1,2,3,4] 4
5 pa aj q 1234 1234
6 xq aq aq NaN NaN
7 pn an kn [10,20,30,40] 10
8 pn an kn [10,20,30,40] 20
9 pn an kn [10,20,30,40] 30
10 pn an kn [10,20,30,40] 40
11 px ax kx "00012" "00012"
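I looked at
pandas: When cell contents are lists, create a row for each element in the list
but my case is different from that one, so the solution does not work on my example. Thanks for your help.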
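Solution
First expand the DataFrame to the required size by repeating each row as many times as needed (once per list element, and once for any non-list value):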
df1 = df.loc[df.index.repeat([len(x) if isinstance(x,list) else 1 for x in df.d])]
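To make the repeat step concrete, here is a minimal sketch that rebuilds the sample frame and prints the repeat counts. The construction of `df` is an assumption based on the table in the question (the question does not show how the frame was built), and `counts` is a name introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample frame from the question.
df = pd.DataFrame({
    "a": ["xx", "pp", "pa", "xq", "pn", "px"],
    "b": ["aa", "as", "aj", "aq", "an", "ax"],
    "c": ["av", "ka", "q", "aq", "kn", "kx"],
    "d": [np.nan, [1, 2, 3, 4], 1234, np.nan, [10, 20, 30, 40], "00012"],
})

# One repeat per list element; scalars, strings and NaN count as one row.
counts = [len(x) if isinstance(x, list) else 1 for x in df["d"]]

# Repeat each index label by its count, then select those rows.
df1 = df.loc[df.index.repeat(counts)]

print(counts)                # [1, 4, 1, 1, 4, 1]
print(df1.index.tolist())    # [0, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5]
```

Note that an empty list has length 0, so a row holding `[]` would be dropped by this step.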
Now flatten column d into one value per row and concatenate it with the expanded frame above:
d_sep = pd.DataFrame({'d_Sep': sum([x if isinstance(x, list) else [x] for x in df.d], [])})
df2 = pd.concat([df1.reset_index(drop=True), d_sep], axis=1)
a b c d d_Sep
0 xx aa av NaN NaN
1 pp as ka [1, 2, 3, 4] 1
2 pp as ka [1, 2, 3, 4] 2
3 pp as ka [1, 2, 3, 4] 3
4 pp as ka [1, 2, 3, 4] 4
5 pa aj q 1234 1234
6 xq aq aq NaN NaN
7 pn an kn [10, 20, 30, 40] 10
8 pn an kn [10, 20, 30, 40] 20
9 pn an kn [10, 20, 30, 40] 30
10 pn an kn [10, 20, 30, 40] 40
11 px ax kx 00012 00012
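On pandas 0.25 or newer, `DataFrame.explode` may give the same result with less code. The sketch below is an alternative technique, not the method used in the answer above, and the sample `df` is an assumed reconstruction of the frame in the question. `explode` turns each list element into its own row and leaves scalars, strings and NaN as they are.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample frame from the question.
df = pd.DataFrame({
    "a": ["xx", "pp", "pa", "xq", "pn", "px"],
    "b": ["aa", "as", "aj", "aq", "an", "ax"],
    "c": ["av", "ka", "q", "aq", "kn", "kx"],
    "d": [np.nan, [1, 2, 3, 4], 1234, np.nan, [10, 20, 30, 40], "00012"],
})

# Copy d into a new column and explode only the copy,
# so the original d column keeps its lists.
out = (
    df.assign(d_Sep=df["d"])
      .explode("d_Sep")
      .reset_index(drop=True)
)

print(out)
```

One difference to be aware of: `explode` turns an empty list into a single NaN row, whereas the repeat-based approach drops that row entirely.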