Regex to match words that do not contain a symbol

Problem description

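I am trying to capture the words that follow the pattern \d+". I want to capture every word after that pattern, stopping at the first word that contains "&" or a digit.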

This is what I have so far:

import re
tests = ['the size will be 12" QTR', 'size is 7" gnh H&M','size is 12" GNH.M gm H&M', 'sizes are 12" QTR gm Best&SAP for all ages', 'size is 14" qtr 14GM']
for i in tests:
    temp = re.search(r'\d+"\s+([A-Za-z.\s]+)', i).groups()[0]
    print(temp)

The expected output is:

QTR
gnh
GNH.M gm
QTR gm
qtr

Tags: python, regex

Solution


First, replace every character of each word that contains & with &, then run your code like this:

import re
tests = ['the size will be 12" QTR', 'size is 7" gnh H&M','size is 12" GNH.M gm H&M', 'sizes are 12" QTR gm Best&SAP for all ages', 'size is 14" qtr 14GM']
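for j, i in enumerate(tests):
    # Mask every word that contains "&" so the character class below stops there.
    for word in i.split():
        if "&" in word:
            i = i.replace(word, '&' * len(word))
    tests[j] = i
    # Digits are not in the character class either, so "14GM" also ends the capture.
    temp = re.search(r'\d+"\s+([A-Za-z.\s]+)', i).groups()[0]
    print(temp)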

print(tests)

Output:

QTR
gnh 
GNH.M gm 
QTR gm 
qtr 

After running this code, the tests list becomes:

['the size will be 12" QTR',
 'size is 7" gnh &&&',
 'size is 12" GNH.M gm &&&',
 'sizes are 12" QTR gm &&&&&&&& for all ages',
 'size is 14" qtr 14GM']
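
As an alternative sketch (not part of the original answer), the masking step can be skipped by letting the pattern itself require whole words. The helper name words_after_size and the compiled pattern below are illustrative assumptions. Each word must contain only letters and dots and must be followed by whitespace or the end of the string, so a word such as H&M, Best&SAP or 14GM ends the capture.

```python
import re

# Hypothetical helper, not from the original answer.
# (?:\s+[A-Za-z.]+(?!\S))+ matches one or more whole words made of letters
# and dots; the negative lookahead (?!\S) rejects a word that is directly
# followed by any non-whitespace character such as "&" or a digit.
WORDS_AFTER_SIZE = re.compile(r'\d+"((?:\s+[A-Za-z.]+(?!\S))+)')


def words_after_size(text):
    """Return the qualifying words after the size, or None if there are none."""
    match = WORDS_AFTER_SIZE.search(text)
    return match.group(1).strip() if match else None


tests = ['the size will be 12" QTR', 'size is 7" gnh H&M',
         'size is 12" GNH.M gm H&M',
         'sizes are 12" QTR gm Best&SAP for all ages',
         'size is 14" qtr 14GM']

# Prints: QTR, gnh, GNH.M gm, QTR gm, qtr (one per line)
for text in tests:
    print(words_after_size(text))
```

Unlike the masking approach, this leaves the tests list unchanged and returns each value without trailing whitespace. It returns None when no qualifying word follows the size.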
