Creating a list by selecting data from a pandas Series with a condition

Problem description

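I want to locate the string "AD" in the first element of my pandas.core.series.Series, then take the value at the same position in each of the following elements of the Series, so that I can build a list of those values.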

# This is my pandas.core.series.Series
df.iloc[0, 8:]

FORMAT                 GT:AD:DP:GQ:PL
1165684808    0/1:7,7:14:99:131,0,131
369966783     0/1:6,8:14:99:141,0,107
373977569     0/0:18,0:18:54:0,54,442
373977829       0/0:6,0:6:18:0,18,148
373977873     0/0:12,0:12:36:0,36,297
373978069     0/0:14,0:14:42:0,42,346
373978077       0/0:8,0:8:24:0,24,198
373978079     0/0:14,0:14:42:0,42,346
373978129          0/0:2,0:2:6:0,6,48
373978131     0/0:14,0:14:42:0,42,346
373978148     0/0:13,0:13:39:0,39,321
373978159       0/0:8,0:8:24:0,24,198
373978276       0/0:8,0:8:24:0,24,198
373978284     0/0:12,0:12:36:0,36,297
373978296           0/0:0,0:0:.:0,0,0
373978307       0/0:6,0:6:18:0,18,148
373978311     0/0:12,0:12:36:0,36,297
373978317       0/0:6,0:6:18:0,18,148
373978320     0/0:16,0:16:48:0,48,387
373978323     0/0:18,0:18:48:0,48,720
373978346       0/0:8,0:8:24:0,24,198
373978353     0/0:12,0:12:30:0,30,450
373978581     0/0:14,0:14:36:0,36,540
373978640     0/0:19,0:19:57:0,57,470

Expected output

"7,7", "6,8", "18,0", ... , "19,0"

Tags: python, pandas, list

Solution


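Is this what you want? Note that it assumes the sample values are stored in a column named 'GT:AD:DP:GQ:PL' and that AD is always the second colon-separated field.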

result = df['GT:AD:DP:GQ:PL'].str.strip().str.split(':').str[1].values

Output:

array(['7,7', '6,8', '18,0', '6,0', '12,0', '14,0', '8,0', '14,0', '2,0',
       '14,0', '13,0', '8,0', '8,0', '12,0', '0,0', '6,0', '12,0', '6,0',
       '16,0', '18,0', '8,0', '12,0', '14,0', '19,0'], dtype=object)
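
If the data is a row Series as in the question (the result of df.iloc[0, 8:], with FORMAT as its first element), the position of "AD" can be looked up rather than hard-coded. The following is a minimal sketch under that assumption. The names row and extract_field are hypothetical, and the sample Series contains only the first few entries from the question.

```python
import pandas as pd

# Hypothetical stand-in for df.iloc[0, 8:] from the question (first few entries only).
row = pd.Series(
    {
        "FORMAT": "GT:AD:DP:GQ:PL",
        "1165684808": "0/1:7,7:14:99:131,0,131",
        "369966783": "0/1:6,8:14:99:141,0,107",
        "373977569": "0/0:18,0:18:54:0,54,442",
    }
)


def extract_field(series, field="AD"):
    """Return the values of `field` from every element after the first."""
    # Position of the requested field within the FORMAT entry (first element).
    position = series.iloc[0].split(":").index(field)
    # Split the remaining elements on ':' and take the value at that position.
    return series.iloc[1:].str.split(":").str[position].tolist()


ad_values = extract_field(row)
print(ad_values)  # ['7,7', '6,8', '18,0']
```

Because the position comes from the FORMAT entry, the same helper would also work if the fields appeared in a different order, and list.index raises a ValueError if the requested field is missing.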
