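python - Regex: pattern fails to match
Problem description
I'm trying to match a pattern with re in Python, but no matter what I try, I can't get it to match.
Here is my matching code: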
def get_report_date(report):
    report_data = {}
    with open(report, 'r') as f:
        report_date = re.findall(f'([Q\d \d\d\d\d\s])', f.read())[0]
        pprint(report_date)
        report_data.update({f"{report_date.replace(' ', '_')}": report})
    return report_data
And here is part of the file I'm trying to match against:
(In millions, except number of shares which are reflected in thousands and per share amounts)
See accompanying Notes to Condensed Consolidated Financial Statements.
Apple Inc. | Q2 2018 Form 10-Q | 1 Apple Inc. CONDENSED CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME (Unaudited)
I'm trying to scrape Q2 2018,
but I keep getting an empty string.
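Solution
Regex: r'(Q\d\s\d+\s)'
Explanation:
r: prefix for a raw string
Q: matches the literal Q (for "quarter")
\d: matches the quarter number that follows
\s: matches a whitespace character
\d+: matches one or more digits (the year)
\s: matches a whitespace character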
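It may help to see why the original pattern did not work. Square brackets create a character class, so `[Q\d \d\d\d\d\s]` matches one single character that is a Q, a digit, or whitespace, not the sequence "Q2 2018". The first such character in the file is a whitespace character, which is probably why the result looked empty. The sketch below compares the two patterns on a shortened sample line (the `sample` string and variable names are assumptions for illustration):

```python
import re

# Shortened sample line, assumed for illustration.
sample = "Apple Inc. | Q2 2018 Form 10-Q | 1"

# Character class: each match is ONE character that is 'Q', a digit, or whitespace.
char_class_matches = re.findall(r'([Q\d \d\d\d\d\s])', sample)
print(repr(char_class_matches[0]))  # the first match is the space after "Apple"

# Sequence: Q, one digit, whitespace, one or more digits, whitespace.
sequence_matches = re.findall(r'(Q\d\s\d+\s)', sample)
print(sequence_matches)
```

The original also used an f-string prefix (`f'...'`) where a raw string (`r'...'`) is the usual choice for regex patterns, since it leaves backslashes untouched.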
Example:
import re
text = """(In millions, except number of shares which are reflected in thousands and per share amounts)
See accompanying Notes to Condensed Consolidated Financial Statements.
Apple Inc. | Q2 2018 Form 10-Q | 1 Apple Inc. CONDENSED CONSOLIDATED STATEMENTS OF COMPREHENSIVE INCOME (Unaudited)"""
x = re.findall(r'(Q\d\s\d+\s)', text)[0]
# Prints "Q2 2018 " (the trailing \s is part of the captured group)
print(x)
Code fix:
import re
from pprint import pprint

def get_report_date(report):
    report_data = {}
    with open(report, 'r') as f:
        # The match includes the whitespace character captured by the final \s
        report_date = re.findall(r'(Q\d\s\d+\s)', f.read())[0]
        pprint(report_date)
        report_data.update({f"{report_date.replace(' ', '_')}": report})
    return report_data
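One thing to watch with the fix above: `[0]` raises an IndexError when nothing matches, and the captured trailing whitespace ends up in the dictionary key. A possible variant, offered only as a sketch, uses `re.search` with word boundaries and two groups. The names `QUARTER_RE` and `find_report_quarter` are hypothetical and not from the original post, and the helper takes text rather than a file path so it can be tried directly. Restricting the quarter to 1 through 4 and the year to four digits are also assumptions.

```python
import re

# Hypothetical pattern: Q1..Q4, whitespace, then a four-digit year.
QUARTER_RE = re.compile(r'\b(Q[1-4])\s+(\d{4})\b')

def find_report_quarter(text):
    """Return a key like 'Q2_2018', or None if no quarter label is found."""
    match = QUARTER_RE.search(text)
    if match is None:
        return None
    quarter, year = match.groups()
    return f"{quarter}_{year}"

print(find_report_quarter("Apple Inc. | Q2 2018 Form 10-Q | 1"))
print(find_report_quarter("no quarter label here"))
```

Because the whitespace sits outside the groups, the key needs no cleanup, and the caller can decide what to do when the function returns None.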