Using regex to remove acronyms, based on the uppercase characters after a parenthesis

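Problem description

How do I remove the following:

NOT words between parentheses that start with an uppercase letter followed by lowercase letters, such as "(Bobby)" or "(Bob went to the beach...)". This is the part I am struggling with.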


import re

text = ['(ABC went to the beach', 'The girl (ABC-2A) is walking', 'The dog (Bobby) is being walked', 'They are there (ABC)' ]
for string in text:
  # Attempt: "(" followed by any run of uppercase letters and an optional ")"
  cleaned_acronyms = re.sub(r'\([A-Z]*\)?', '', string)
  print(cleaned_acronyms)

# Current output:
>> 'went to the beach' # Correct
>> 'The girl -2A) is walking' # Not correct
>> 'The dog obby) is being walked' # Not correct
>> 'They are there' # Correct


# Desired & correct output:
>> 'went to the beach'
>> 'The girl is walking'
>> 'The dog (Bobby) is being walked' # (Bobby) is NOT an acronym (uppercase + lowercase)
>> 'They are there'
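
To see where the attempt goes wrong, it can help to list exactly what the pattern matches in each string. The following is only a diagnostic sketch; the name `matched` is introduced here for illustration and is not part of the original post.

```python
import re

# The pattern from the attempt above: "(" then zero or more uppercase
# letters, then an optional ")".
pattern = re.compile(r'\([A-Z]*\)?')

text = ['(ABC went to the beach', 'The girl (ABC-2A) is walking',
        'The dog (Bobby) is being walked', 'They are there (ABC)']

# Collect the exact substrings that re.sub would delete from each string.
matched = [pattern.findall(s) for s in text]
for s, m in zip(text, matched):
    print(repr(s), '->', m)
```

Because `[A-Z]*` stops at the first character that is not an uppercase letter and `\)?` is optional, the match still succeeds after "(ABC" in "(ABC-2A)" and after "(B" in "(Bobby)", which is why the remainders "-2A)" and "obby)" are left behind.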

Tags: python, regex, string, uppercase, re

Solution


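The pattern \([A-Z\-0-9]{2,}\)? requires at least two characters from the set of uppercase letters, digits, and hyphens right after the opening parenthesis, so "(Bobby)" no longer matches. Using it in the following context: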

import re

text = ['(ABC went to the beach', 'The girl (ABC-2A) is walking', 'The dog (Bobby) is being walked', 'They are there (ABC)' ]
for string in text:
  cleaned_acronyms = re.sub(r'\([A-Z\-0-9]{2,}\)?', '', string)
  print(cleaned_acronyms)

I get these results:

' went to the beach'
'The girl  is walking'
'The dog (Bobby) is being walked'
'They are there '
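
These results still contain the spaces that surrounded each removed acronym, and the pattern can cut a mixed-case token such as "(ABCd)" in half. The sketch below is one possible refinement, not the pattern from the answer: the helper `remove_acronyms` and the extra lookahead are assumptions added for illustration, and the whitespace cleanup assumes that collapsing every run of spaces in the string is acceptable.

```python
import re

# Hypothetical stricter variant (not the answer's pattern): the lookahead
# (?![A-Za-z0-9-]) requires the run of capitals, digits, and hyphens to end
# at something other than a letter, digit, or hyphen, so the match cannot
# stop partway through a token like "(ABCd)".
ACRONYM = re.compile(r'\([A-Z0-9-]{2,}(?![A-Za-z0-9-])\)?')

def remove_acronyms(s):
    """Remove parenthesised acronyms, then tidy the leftover whitespace."""
    without = ACRONYM.sub('', s)
    # Collapse runs of spaces (anywhere in the string) and trim both ends.
    return re.sub(r' {2,}', ' ', without).strip()

text = ['(ABC went to the beach', 'The girl (ABC-2A) is walking',
        'The dog (Bobby) is being walked', 'They are there (ABC)']

cleaned = [remove_acronyms(s) for s in text]
for line in cleaned:
    print(line)
```

On the four sample strings this yields the desired output listed in the question, without leading, trailing, or doubled spaces.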
