Regex captures only one group or the other, not both

Problem description

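I'm trying to use a regular expression to capture two named groups. My code seems correct when I capture a single group, but for some reason, when I add the second group to my finditer call, it returns no results.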

https://regex101.com/r/FDpAuU/1

Sample text:

146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701

First capture group:

import re

text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
    197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
    156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""
# Raw string, so \d and \. reach the regex engine unchanged
item = re.findall(r"(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})", text)
item

Second capture group:

import re

text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
    197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
    156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""
# Raw string keeps the pattern consistent with the first snippet
item = re.findall(r"(?P<user_name>[a-zA-Z]+[0-9]+)", text)
item

How can I combine the two capture groups into a single findall (or finditer) call?

Tags: python, regex

Solution


Join the two groups with .*? so that the text between the host and the user name is skipped:

(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?(?P<user_name>[a-zA-Z]+[0-9]+)
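As a rough sketch of how the combined pattern might be used from Python, the snippet below runs it through finditer over the sample text from the question and collects each match with groupdict(). The names `pattern` and `records` are only illustrative and are not part of the original answer.

```python
import re

# Sample text from the question (leading indentation removed).
text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""

# The two named groups joined by a lazy .*? (illustrative variable name).
pattern = re.compile(
    r"(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
    r".*?"
    r"(?P<user_name>[a-zA-Z]+[0-9]+)"
)

# groupdict() maps each group name to the text it captured.
records = [m.groupdict() for m in pattern.finditer(text)]
for record in records:
    print(record)
# {'host': '146.204.224.152', 'user_name': 'feest6811'}
# {'host': '197.109.77.178', 'user_name': 'kertzmann3129'}
# {'host': '156.127.178.177', 'user_name': 'okuneva5222'}
```

Each match now carries both values, so there is no need to run two separate searches and pair the results up afterwards.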

Explanation

--------------------------------------------------------------------------------
  (?P<host>                  group and capture to \k<host>:
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \k<host>
--------------------------------------------------------------------------------
  .*?                      any character except \n (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (?P<user_name>             group and capture to \k<user_name>:
--------------------------------------------------------------------------------
    [a-zA-Z]+                any character of: 'a' to 'z', 'A' to 'Z'
                             (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [0-9]+                   any character of: '0' to '9' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<user_name>
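
Since the question also mentions findall, here is a small sketch of what that call would probably look like with the same pattern. When a pattern contains more than one group, re.findall returns a list of tuples, one element per group, in the order the groups appear. The names `PATTERN`, `sample` and `pairs` are made up for this example, and only two lines of the sample text are used to keep it short.

```python
import re

# Same combined pattern as in the answer.
PATTERN = (
    r"(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
    r".*?"
    r"(?P<user_name>[a-zA-Z]+[0-9]+)"
)

# Two lines taken from the sample text in the question.
sample = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] '
    '"POST /incentivize HTTP/1.1" 302 4622\n'
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554'
)

# With two groups, findall yields (host, user_name) tuples.
pairs = re.findall(PATTERN, sample)
print(pairs)
# [('146.204.224.152', 'feest6811'), ('197.109.77.178', 'kertzmann3129')]
```

Because . does not match a newline by default, the .*? part stays within a single log line, so a host is never paired with a user name from the following line.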
