re.findall produces list-like arrays in Python

Problem description

I am writing a Python script to extract several features from a text file.

The input file has the following structure:

ENTRY       M00001            Pathway   Module
NAME        Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate
CLASS       Pathway modules; Carbohydrate metabolism; Central carbohydrate metabolism
PATHWAY     map00010  Glycolysis / Gluconeogenesis
            map01200  Carbon metabolism
            map01100  Metabolic pathways

///

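I am extracting the values from the "ENTRY" and "PATHWAY" fields. However, when I write them to a PostgreSQL 11.0 table, I get the following output. The column type is "character varying".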

id          map_id
{M00001}    {map00010,map01200,map01100}
{M00002}    {map00010,map01200,map01230,map01100}
{M00003}    {map00010,map00020,map01100}
{M00004}    {map00030,map01200,map01100,map01120}
{M00005}    {map00030,map00230,map01200,map01230,map01100}

The code below produces the output above:

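import re

# Assumes `conn` is an open database connection created earlier.
cursor = conn.cursor()
record = {}
with open('file') as f:
    for line in f:
        if re.search(r"^[A-Z]", line):
            # Field line: the first token is the field name, the rest is its value.
            key, value = re.split(r"\s+", line, maxsplit=1)
            record[key] = value
        elif re.search(r"^\s+", line):
            # Continuation line: append it to the current field.
            record[key] = record[key] + line
        elif re.search(r"^///", line):
            # End of record: extract the values and insert one row.
            entry = record['ENTRY']
            entry_id = re.findall(r"(^[A-Za-z+]\d+)", entry)
            map_id = re.findall(r"(map\d+)\s+.*", record['PATHWAY'])
            cursor.execute("INSERT INTO tbl (id, map_id) VALUES (%s, %s)", (entry_id, map_id))

conn.commit()
cursor.close()
conn.close()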

The expected output is:

id          map_id
M00001  map00010,map01200,map01100
M00002  map00010,map01200,map01230,map01100
M00003  map00010,map00020,map01100
M00004  map00030,map01200,map01100,map01120
M00005  map00030,map00230,map01200,map01230,map01100

Any help is greatly appreciated.

Tags: python, regex, postgresql, findall

Solution
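One likely explanation for the braces: re.findall always returns a list, and a driver such as psycopg2 (an assumption here, since the question does not name the driver) adapts a Python list to a PostgreSQL array, which is rendered as {…} when stored in a character varying column. A minimal sketch of one way around this is to convert each list to a plain string before inserting. The helper name parse_records is hypothetical, and the regular expressions are slightly simplified from the ones in the question.

```python
import re


def parse_records(lines):
    """Yield (entry_id, map_ids) string pairs, one per '///'-terminated record.

    Hypothetical helper: not part of the original question's code.
    """
    record = {}
    key = None
    for line in lines:
        if re.match(r"[A-Z]", line):
            # Field line: the first token is the field name, the rest its value.
            key, value = re.split(r"\s+", line, maxsplit=1)
            record[key] = value
        elif line.startswith("///"):
            # re.search yields a single match, so group(1) is a plain string.
            entry = re.search(r"^([A-Za-z]\d+)", record.get("ENTRY", ""))
            # re.findall returns a list; join it into one comma-separated string.
            map_ids = ",".join(re.findall(r"\bmap\d+", record.get("PATHWAY", "")))
            if entry:
                yield entry.group(1), map_ids
            record, key = {}, None
        elif key is not None and re.match(r"\s+\S", line):
            # Indented continuation line: append it to the current field.
            record[key] += line
```

Each yielded pair holds two ordinary strings, so passing it as the parameter tuple to the INSERT statement from the question, for example cursor.execute("INSERT INTO tbl (id, map_id) VALUES (%s, %s)", (entry_id, map_ids)), should store plain text rather than an array literal.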

