Removing parenthesized content under multiple conditions in Python

Problem description

Given the following list:

l = ['hydrogenated benzene (purity: 99.9 density (g/cm3), produced in ZB): SD', 
    'Car board price (tax included): JT Port', 
    'Ex-factory price (low-end price): Triethanolamine (85% commercial grade): North'
    ]

I would like to get the following expected result:

['hydrogenated benzene: SD', 'Car board price: JT Port', 'Ex-factory price: Triethanolamine: North']

Using the following code:

import re

def remove_extra(content):
    pat1 = r'[\s]'    # whitespace pattern (not applied below)
    pat2 = r'\(.*\)'  # greedy: removes everything from the first '(' to the last ')'
    return re.sub(pat2, '', str(content))

[remove_extra(item) for item in l]

It produces:

['hydrogenated benzene : SD',
 'Car board price : JT Port',
 'Ex-factory price : North']

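As you may notice, the last element of the result, 'Ex-factory price : North', is not what I expected: 'Triethanolamine' has been removed along with the parentheses, and a space is left before each colon. How can I get the result I need? Thanks.

Tags: python-3.x, pandas, re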

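Solution

You can modify the linked solution to also remove the optional whitespace before each ( by adding \s* in front of the pattern: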

# https://stackoverflow.com/a/37538815/2901002
import re

def remove_text_between_parens(text):
    n = 1  # run at least once
    while n:
        # remove non-nested (flat) balanced groups plus any whitespace before them;
        # repeat until no substitution is made, so nested groups are peeled off inside-out
        text, n = re.subn(r'\s*\([^()]*\)', '', text)
    return text

a = [remove_text_between_parens(item) for item in l]
print(a)

['hydrogenated benzene: SD', 
 'Car board price: JT Port', 
 'Ex-factory price: Triethanolamine: North']
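
To see why the loop is needed, here is a minimal sketch that records the string after each pass and compares it with the greedy pattern from the question. The helper name trace_removal and the constant FLAT_PARENS are hypothetical, introduced only for this illustration; the regular expression is the same one used in the solution above.

```python
import re

# Same pattern as in the solution: optional whitespace, then a pair of
# parentheses that contains no other parentheses (an innermost group).
FLAT_PARENS = re.compile(r'\s*\([^()]*\)')

def trace_removal(text):
    """Hypothetical helper: return the string as it looks after each pass."""
    steps = [text]
    while True:
        text, n = FLAT_PARENS.subn('', text)
        if n == 0:  # nothing left to remove
            break
        steps.append(text)
    return steps

nested = 'hydrogenated benzene (purity: 99.9 density (g/cm3), produced in ZB): SD'
steps = trace_removal(nested)
for step in steps:
    print(step)

# For comparison, the greedy pattern from the question spans from the
# first '(' to the last ')' and swallows the text in between.
two_groups = 'Ex-factory price (low-end price): Triethanolamine (85% commercial grade): North'
greedy = re.sub(r'\(.*\)', '', two_groups)
print(greedy)
```

The first pass removes only the inner (g/cm3) group, the second pass removes the now-flat outer group, and the third pass finds nothing and stops. The greedy pattern, by contrast, treats the two separate groups in the last string as one match, which is why 'Triethanolamine' disappeared in the question.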
