python - Python Pandas - Convert a list to a Series
Problem description
I have an Excel dataset that looks like this.
For reproduction purposes:
ID buffer
LocalHub@3c183d50 [intraCity_Simulator.Parcel@55078545, intraCity_Simulator.Parcel@75b895dd, intraCity_Simulator.Parcel@44227899, intraCity_Simulator.Parcel@696b0129, intraCity_Simulator.Parcel@86ec871, intraCity_Simulator.Parcel@7a0d8542, intraCity_Simulator.Parcel@67a58fba]
LocalHub@d3a0fbe [intraCity_Simulator.Parcel@61b9a28c, intraCity_Simulator.Parcel@1b5d2e8b, intraCity_Simulator.Parcel@65911201, intraCity_Simulator.Parcel@2e53ab95, intraCity_Simulator.Parcel@464b73fa, intraCity_Simulator.Parcel@640ff28a, intraCity_Simulator.Parcel@77fc8d6c, intraCity_Simulator.Parcel@609051b0, intraCity_Simulator.Parcel@25e0c299, intraCity_Simulator.Parcel@436af74b, intraCity_Simulator.Parcel@24c3fb2, intraCity_Simulator.Parcel@130592c8, intraCity_Simulator.Parcel@444d20b1, intraCity_Simulator.Parcel@6d59d5b2, intraCity_Simulator.Parcel@764a25d3, intraCity_Simulator.Parcel@4bdd2c62]
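I want to rearrange the list values so that each one appears on its own row in the buffer column, next to its corresponding ID, for example: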
ID buffer
LocalHub@3c183d50 intraCity_Simulator.Parcel@55078545
LocalHub@3c183d50 intraCity_Simulator.Parcel@75b895dd
... ...
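To experiment with this shape of data without the Excel file, the input can be rebuilt by hand. This is only a sketch: it assumes each buffer cell arrives as a single string with literal brackets (which is how such a cell would typically be read from a spreadsheet), and it shortens both rows to their first three parcels.

```python
import pandas as pd

# Assumption: every buffer cell is one plain string, brackets included.
# Both rows are shortened to their first three parcels from the question.
df = pd.DataFrame({
    "ID": ["LocalHub@3c183d50", "LocalHub@d3a0fbe"],
    "buffer": [
        "[intraCity_Simulator.Parcel@55078545, "
        "intraCity_Simulator.Parcel@75b895dd, "
        "intraCity_Simulator.Parcel@44227899]",
        "[intraCity_Simulator.Parcel@61b9a28c, "
        "intraCity_Simulator.Parcel@1b5d2e8b, "
        "intraCity_Simulator.Parcel@65911201]",
    ],
})
```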
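Solution
For pandas 0.25+, strip the surrounding [] with Series.str.strip, split the remaining string into a list of values with Series.str.split, then expand the lists into rows with DataFrame.explode, and finally call DataFrame.reset_index with drop=True to get a default RangeIndex: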
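# Strip the brackets and split on ', ' so the items carry no leading space
df = (df.assign(buffer=df['buffer'].str.strip('[]').str.split(', '))
        # Expand each list item into its own row; ID is repeated automatically
        .explode('buffer')
        # Discard the duplicated original index in favour of a default RangeIndex
        .reset_index(drop=True))
print(df)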
ID buffer
0 LocalHub@3c183d50 intraCity_Simulator.Parcel@55078545
1 LocalHub@3c183d50 intraCity_Simulator.Parcel@75b895dd
2 LocalHub@3c183d50 intraCity_Simulator.Parcel@44227899
3 LocalHub@3c183d50 intraCity_Simulator.Parcel@696b0129
4 LocalHub@3c183d50 intraCity_Simulator.Parcel@86ec871
5 LocalHub@3c183d50 intraCity_Simulator.Parcel@7a0d8542
6 LocalHub@3c183d50 intraCity_Simulator.Parcel@67a58fba
7 LocalHub@d3a0fbe intraCity_Simulator.Parcel@61b9a28c
8 LocalHub@d3a0fbe intraCity_Simulator.Parcel@1b5d2e8b
9 LocalHub@d3a0fbe intraCity_Simulator.Parcel@65911201
10 LocalHub@d3a0fbe intraCity_Simulator.Parcel@2e53ab95
11 LocalHub@d3a0fbe intraCity_Simulator.Parcel@464b73fa
12 LocalHub@d3a0fbe intraCity_Simulator.Parcel@640ff28a
13 LocalHub@d3a0fbe intraCity_Simulator.Parcel@77fc8d6c
14 LocalHub@d3a0fbe intraCity_Simulator.Parcel@609051b0
15 LocalHub@d3a0fbe intraCity_Simulator.Parcel@25e0c299
16 LocalHub@d3a0fbe intraCity_Simulator.Parcel@436af74b
17 LocalHub@d3a0fbe intraCity_Simulator.Parcel@24c3fb2
18 LocalHub@d3a0fbe intraCity_Simulator.Parcel@130592c8
19 LocalHub@d3a0fbe intraCity_Simulator.Parcel@444d20b1
20 LocalHub@d3a0fbe intraCity_Simulator.Parcel@6d59d5b2
21 LocalHub@d3a0fbe intraCity_Simulator.Parcel@764a25d3
22 LocalHub@d3a0fbe intraCity_Simulator.Parcel@4bdd2c62
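The explode approach can be wrapped in a small function and checked on a reduced input. This is a sketch rather than a definitive implementation: the name `explode_buffer` is hypothetical, the sample uses only a few parcels from the question, and the extra `str.strip()` step is an added precaution in case the separator is a comma with or without a following space.

```python
import pandas as pd


def explode_buffer(frame, column="buffer"):
    """Hypothetical helper: return one row per item of a bracketed string column."""
    items = (
        frame[column]
        .str.strip("[]")   # drop the surrounding brackets
        .str.split(",")    # split on commas -> one list of strings per row
    )
    out = (
        frame.assign(**{column: items})
        .explode(column)           # requires pandas 0.25+
        .reset_index(drop=True)    # default RangeIndex
    )
    # Remove whitespace that followed each comma in the original string.
    out[column] = out[column].str.strip()
    return out


sample = pd.DataFrame({
    "ID": ["LocalHub@3c183d50", "LocalHub@d3a0fbe"],
    "buffer": [
        "[intraCity_Simulator.Parcel@55078545, intraCity_Simulator.Parcel@75b895dd]",
        "[intraCity_Simulator.Parcel@61b9a28c]",
    ],
})
result = explode_buffer(sample)
print(result)
```

The input frame is left unchanged because `assign` returns a copy, so the function can be called repeatedly on the same data.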
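For pandas versions below 0.25, repeat the ID values by the length of each list, obtained with Series.str.len, and flatten the lists into a single column: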
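from itertools import chain
import pandas as pd

# Strip the brackets and split on ', ' to get one list of parcels per row
splitted = df['buffer'].str.strip('[]').str.split(', ')
df = pd.DataFrame({
    # Repeat each ID once per item in its list
    'ID': df['ID'].values.repeat(splitted.str.len()),
    # Flatten the lists in row order so they line up with the repeated IDs
    'buffer': list(chain.from_iterable(splitted.tolist()))
})
print(df)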
ID buffer
0 LocalHub@3c183d50 intraCity_Simulator.Parcel@55078545
1 LocalHub@3c183d50 intraCity_Simulator.Parcel@75b895dd
2 LocalHub@3c183d50 intraCity_Simulator.Parcel@44227899
3 LocalHub@3c183d50 intraCity_Simulator.Parcel@696b0129
4 LocalHub@3c183d50 intraCity_Simulator.Parcel@86ec871
5 LocalHub@3c183d50 intraCity_Simulator.Parcel@7a0d8542
6 LocalHub@3c183d50 intraCity_Simulator.Parcel@67a58fba
7 LocalHub@d3a0fbe intraCity_Simulator.Parcel@61b9a28c
8 LocalHub@d3a0fbe intraCity_Simulator.Parcel@1b5d2e8b
9 LocalHub@d3a0fbe intraCity_Simulator.Parcel@65911201
10 LocalHub@d3a0fbe intraCity_Simulator.Parcel@2e53ab95
11 LocalHub@d3a0fbe intraCity_Simulator.Parcel@464b73fa
12 LocalHub@d3a0fbe intraCity_Simulator.Parcel@640ff28a
13 LocalHub@d3a0fbe intraCity_Simulator.Parcel@77fc8d6c
14 LocalHub@d3a0fbe intraCity_Simulator.Parcel@609051b0
15 LocalHub@d3a0fbe intraCity_Simulator.Parcel@25e0c299
16 LocalHub@d3a0fbe intraCity_Simulator.Parcel@436af74b
17 LocalHub@d3a0fbe intraCity_Simulator.Parcel@24c3fb2
18 LocalHub@d3a0fbe intraCity_Simulator.Parcel@130592c8
19 LocalHub@d3a0fbe intraCity_Simulator.Parcel@444d20b1
20 LocalHub@d3a0fbe intraCity_Simulator.Parcel@6d59d5b2
21 LocalHub@d3a0fbe intraCity_Simulator.Parcel@764a25d3
22 LocalHub@d3a0fbe intraCity_Simulator.Parcel@4bdd2c62
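When the frame has more columns than ID and buffer, a variant of the repeat idea is to repeat whole rows with Index.repeat and DataFrame.loc instead of rebuilding the frame column by column. This is a sketch under several assumptions: the helper name `explode_by_repeat` and the `zone` column are hypothetical, the index is unique, the buffer column has no missing values, and items are separated by a comma followed by a space.

```python
import pandas as pd


def explode_by_repeat(frame, column="buffer"):
    """Hypothetical helper for pandas versions without DataFrame.explode."""
    lists = frame[column].str.strip("[]").str.split(", ")
    lengths = lists.str.len()
    # Repeat every original row once per list item, keeping all other columns.
    out = frame.loc[frame.index.repeat(lengths)].reset_index(drop=True)
    # Flatten the lists in row order so items line up with the repeated rows.
    out[column] = [item for sub in lists for item in sub]
    return out


sample = pd.DataFrame({
    "ID": ["LocalHub@3c183d50", "LocalHub@d3a0fbe"],
    "zone": ["A", "B"],  # hypothetical extra column, not in the question
    "buffer": [
        "[intraCity_Simulator.Parcel@55078545, intraCity_Simulator.Parcel@75b895dd]",
        "[intraCity_Simulator.Parcel@61b9a28c]",
    ],
})
repeated = explode_by_repeat(sample)
print(repeated)
```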