Find the last occurrence of a word in a sentence using a regular expression when the word also appears earlier

Problem description

content = "This is the sentence I have written at Page 1 but I wrote in Page 2"

Expected answer: Page 2

import re

content = "This is the sentence I have written at Page 1 but I wrote in Page 2"
res1 = re.search(r"\.?([^.]*Page[^.]*)", content)  # search for the words after Page
if res1 is not None:
    a = res1.group(1)
    extracted_page_sequence = re.sub(r"^(.*)(?=Page)", "", a)  # intended to get Page 2
    print("Extracted sequence", extracted_page_sequence)

With this code I get "Page 1 but I wrote in Page 2". Is there a way to use a regular expression in Python to get only "Page 2"? In short, I need the last page reference in the sentence.

Tags: python

Solution


Try this:

all_last_occurrences = re.findall(r"[^.]*(Page\s+\w+)\s*[^.]*", content)

If content consists of a single sentence, such as "This is the sentence I have written at Page 1 but I wrote in Page 2", you get only the occurrence at its end:

['Page 2']
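
The reason this picks the last occurrence is that the leading `[^.]*` is greedy: it consumes as much of the sentence as it can and backtracks only until `Page\s+\w+` can match, which happens first at the rightmost "Page". A minimal, self-contained sketch of the single-sentence case (the name `pattern` is only illustrative and not part of the original answer):

```python
import re

content = "This is the sentence I have written at Page 1 but I wrote in Page 2"

# Greedy [^.]* runs to the end of the sentence, then backtracks to the
# rightmost position where "Page <word>" can still match.
pattern = re.compile(r"[^.]*(Page\s+\w+)\s*[^.]*")

all_last_occurrences = pattern.findall(content)
print(all_last_occurrences)
```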

If content contains several sentences (each separated from the next by a `.`), for example:

content = (
    "This is the sentence I have written at Page 1 but I wrote in Page 2. "
    "This is the sentence I have written at Page 3 but I wrote in Page 4. "
    "This is the sentence I have written at Page 5 but I wrote in Page 6. "
    "Page 7. "
    "ghghPage 8."
    "Page 9 Page 10 "
)

you get:

['Page 2', 'Page 4', 'Page 6', 'Page 7', 'Page 8', 'Page 10']
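
If only the very last page reference in the whole text is needed, rather than one per sentence, a simpler alternative is to collect every match and keep the final one. This is a sketch of a different approach, not the method from the answer above, and the helper name `last_page` is hypothetical:

```python
import re

def last_page(text):
    """Return the last 'Page <word>' found in text, or None if there is none."""
    matches = re.findall(r"Page\s+\w+", text)
    return matches[-1] if matches else None

print(last_page("This is the sentence I have written at Page 1 but I wrote in Page 2"))
print(last_page("no page reference here"))
```

Because this version ignores sentence boundaries, it returns a single result for the whole string.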
