Python regular expressions: how to ignore irrelevant matches?

Question

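I have a text in which one sentence contains the word "since". I am trying to use a regular expression to extract the text around the word "since", back to the previous period and forward to the next one. For example, the text is: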

text = "I like to live in a big city. Today is Monday, since yesterday was Sunday."

My regular expression is:

rule = re.compile(r'([a-zA-Z0-9\,\.\s\']*)\bsince\b([a-zA-Z0-9\,\.\s\']*)', re.IGNORECASE)
patterns = rule.match(text)

However, patterns.group(1) returns "I like to live in a big city. Today is Monday, ", which includes a sentence I do not want. I only want "Today is Monday, ". How can I do this with a regular expression?

Tags: python, regex, re

Solution


You can use this regular expression:

[^.]*? since [^.]*?\.


Code:

import re

text = "I like to live in a big city. Today is Monday, since yesterday was Sunday."
print(re.findall(r'[^.]*? since [^.]*?\.', text))

Output:

[' Today is Monday, since yesterday was Sunday.']
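The result starts with a space because [^.]*? begins matching immediately after the previous period. If that leading space is unwanted, one option is to strip each match afterwards. This is a minimal sketch using the same pattern; the variable name sentences is only illustrative.

```python
import re

text = "I like to live in a big city. Today is Monday, since yesterday was Sunday."

# Same pattern as above; strip() removes the whitespace left over
# after the period that ends the previous sentence.
sentences = [m.strip() for m in re.findall(r'[^.]*? since [^.]*?\.', text)]
print(sentences)
```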

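Regex details:

  • [^.]*?: matches 0 or more characters that are not a period, lazily (as few as possible)
  • since: matches the literal text " since ", including the space on each side
  • [^.]*?: matches 0 or more characters that are not a period, lazily
  • \.: matches a literal period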
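The question asks for the parts before and after "since" separately. The same idea can be sketched with capturing groups. This variant is an assumption rather than part of the original answer: it keeps the asker's \b word boundaries and re.IGNORECASE flag, and uses search() instead of match() because match() only tries the start of the string.

```python
import re

text = "I like to live in a big city. Today is Monday, since yesterday was Sunday."

# Group 1: text before "since", restricted to the same sentence by [^.].
# Group 2: text after "since", up to (not including) the next period.
rule = re.compile(r'([^.]*?)\bsince\b([^.]*)\.', re.IGNORECASE)

match = rule.search(text)  # search() scans the whole string for a match
if match:
    before = match.group(1).strip()
    after = match.group(2).strip()
    print(before)  # Today is Monday,
    print(after)   # yesterday was Sunday
```

Because [^.] cannot cross a period, the first sentence can never become part of group 1, which is what removes the unwanted text from the original attempt.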
