Regex: pattern fails to match

Problem description

I am trying to match a pattern with re in Python, but no matter what I try, I cannot get it to match.

Here is my matching code:

def get_report_date(report):
    report_data = {}
    with open(report, 'r') as f:
        report_date = re.findall(f'([Q\d \d\d\d\d\s])', f.read())[0]
        pprint(report_date)
        report_data.update({f"{report_date.replace(' ', '_')}": report})
        return report_data

And here is part of the file I want to match against:

(In millions, except number of shares which are reflected in thousands and per share amounts) 

See accompanying Notes to Condensed Consolidated Financial Statements. 

Apple Inc. | Q2 2018 Form 10-Q | 1 Apple Inc. CONDENSED CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME (Unaudited)

I am trying to extract Q2 2018

but I keep getting an empty string.

Tags: python, regex

Solution


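Regex: r'(Q\d\s\d+\s)'

Explanation:

  1. r is the raw-string prefix, so the backslashes reach the regex engine unchanged
  2. Q matches the literal letter Q (for quarter)
  3. \d matches the single digit of the quarter number that follows
  4. \s matches the whitespace after the quarter
  5. \d+ matches one or more digits, i.e. the year
  6. \s matches the whitespace after the year

Example: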
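For contrast, the pattern in the question wraps everything in square brackets, which turns it into a character class: it matches exactly one character that is Q, a digit, or whitespace, rather than the sequence Q2 2018. That is likely why the first result looked empty, since it was a single whitespace character. A small sketch of the difference on a shortened sample string (the variable names here are illustrative, not from the question):

```python
import re

sample = "Apple Inc. | Q2 2018 Form 10-Q | 1"

# Square brackets define a character class: every match is ONE character
# that is 'Q', a digit, or whitespace.
char_class_hits = re.findall(r'([Q\d \d\d\d\d\s])', sample)

# Without the brackets, the tokens are matched one after another as a sequence.
sequence_hits = re.findall(r'(Q\d\s\d+\s)', sample)

# repr() makes whitespace visible in the output.
print(repr(char_class_hits[0]))
print(sequence_hits)
```

Here the first character-class hit is the space after "Apple", while the sequence pattern returns the whole quarter and year.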

import re

text = """(In millions, except number of shares which are reflected in thousands and per share amounts)

See accompanying Notes to Condensed Consolidated Financial Statements.

Apple Inc. | Q2 2018 Form 10-Q | 1 Apple Inc. CONDENSED CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME (Unaudited)"""


x = re.findall(r'(Q\d\s\d+\s)', text)[0]

# prints "Q2 2018 " (the trailing space is part of the match)
print(x)

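Fixed code:

import re
from pprint import pprint

def get_report_date(report):
    report_data = {}
    with open(report, 'r') as f:
        # strip() drops the trailing whitespace captured by the final \s
        report_date = re.findall(r'(Q\d\s\d+\s)', f.read())[0].strip()
        pprint(report_date)
        # map e.g. 'Q2_2018' to the report path
        report_data.update({report_date.replace(' ', '_'): report})
        return report_data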
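Note that findall(...)[0] raises IndexError when the text contains no match. A minimal sketch of a variant that returns None instead, assuming the year is always four digits; the helper name extract_report_date and the word-boundary pattern are illustrative choices, not part of the original answer:

```python
import re

# Hypothetical pattern: capture 'Q<digit>' and a 4-digit year as separate groups,
# so no surrounding whitespace ends up in the result.
QUARTER_RE = re.compile(r'\b(Q\d)\s(\d{4})\b')

def extract_report_date(text):
    """Return a key such as 'Q2_2018', or None when no quarter/year pair is found."""
    match = QUARTER_RE.search(text)
    if match is None:
        return None
    quarter, year = match.groups()
    return f"{quarter}_{year}"

print(extract_report_date("Apple Inc. | Q2 2018 Form 10-Q | 1"))
print(extract_report_date("no date here"))
```

Taking the text as an argument rather than a file path keeps the helper easy to test; the caller can still pass f.read() from the opened report.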
