Python multiline pattern search

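Question

I have the following text, and I need to parse it to extract every group of three values. For this particular example, I need output like this: [1,1,1],[2,2,2],[3,2,3],[4,2,4]. I tried this regular expression: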

re.findall(r'measId \d+,[\n\r]measObjectId \d+[\n\r],reportConfigId \d+',output)

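but it always returns zero results. I have tried several combinations, with and without the re.MULTILINE flag, and it made no difference. What am I doing wrong? Any suggestions?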

measIdToAddModList {
          {
            measId 1,
            measObjectId 1,
            reportConfigId 1
          },
          {
            measId 2,
            measObjectId 2,
            reportConfigId 2
          },
          {
            measId 3,
            measObjectId 2,
            reportConfigId 3
          },
          {
            measId 4,
            measObjectId 2,
            reportConfigId 4
          }

Tags: python, multiline, re

Solution


Here is the most naive solution. It works only if exactly three fields are present:

import re

# s holds the text shown in the question
re.findall(r'\{\s+(\w+\s+\d+),\s+(\w+\s+\d+),\s+(\w+\s+\d+)\s+}', s)
#[('measId 1', 'measObjectId 1', 'reportConfigId 1'),
# ('measId 2', 'measObjectId 2', 'reportConfigId 2'),
# ('measId 3', 'measObjectId 2', 'reportConfigId 3'),
# ('measId 4', 'measObjectId 2', 'reportConfigId 4')]

Explanation:

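\{          # Opening curly brace
\s+         # One or more whitespace characters (spaces, tabs, newlines)
(\w+\s+\d+) # Word, whitespace, digits (first field)
,\s+        # Comma, whitespace
(\w+\s+\d+) # Second field
,\s+        # Comma, whitespace
(\w+\s+\d+) # Third field
\s+         # Whitespace
}           # Closing curly brace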
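
The pattern in the question most likely returns nothing for two reasons. First, [\n\r] consumes only the line break itself, not the indentation in front of the next field. Second, it expects the comma after measObjectId \d+ to come after the line break rather than before it. re.MULTILINE only changes how ^ and $ behave, so it does not help here. Below is a minimal sketch that returns the integer triples the question asks for. The variable name text and the helper names extract_triples and extract_groups are chosen here for illustration and are not part of the original answer.

```python
import re

# Sample text from the question (the outer closing brace is missing there,
# which does not affect the matching below).
text = """measIdToAddModList {
          {
            measId 1,
            measObjectId 1,
            reportConfigId 1
          },
          {
            measId 2,
            measObjectId 2,
            reportConfigId 2
          },
          {
            measId 3,
            measObjectId 2,
            reportConfigId 3
          },
          {
            measId 4,
            measObjectId 2,
            reportConfigId 4
          }
"""

# Fixed layout: capture only the digits of the three named fields.
# \s* absorbs the line break and the indentation between fields.
TRIPLE = re.compile(
    r'measId\s+(\d+),\s*measObjectId\s+(\d+),\s*reportConfigId\s+(\d+)'
)


def extract_triples(s):
    """Return one [measId, measObjectId, reportConfigId] list per entry."""
    return [[int(n) for n in groups] for groups in TRIPLE.findall(s)]


def extract_groups(s):
    """Looser variant: any number of 'name value' fields per inner block."""
    # [^{}]* keeps each match inside one innermost pair of braces.
    blocks = re.findall(r'\{([^{}]*)\}', s)
    return [[int(n) for n in re.findall(r'\w+\s+(\d+)', b)] for b in blocks]


print(extract_triples(text))
# [[1, 1, 1], [2, 2, 2], [3, 2, 3], [4, 2, 4]]
```

extract_groups gives the same result for this input and is not limited to exactly three fields per block, but it does not check the field names.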
