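python - Regex captures only one group or the other, not both
Problem description
I am trying to use a regular expression to capture two named groups. My code seems correct when I capture a single group, but for some reason, as soon as I add the second group to my finditer call, it returns no results.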
https://regex101.com/r/FDpAuU/1
Sample text:
146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701
First capture group:
import re

text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""
# Raw string so that \d and \. reach the regex engine unchanged
item = re.findall(r"(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})", text)
print(item)  # ['146.204.224.152', '197.109.77.178', '156.127.178.177']
Second capture group:
import re

text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""
# Letters followed by digits, e.g. feest6811
item = re.findall(r"(?P<user_name>[a-zA-Z]+[0-9]+)", text)
print(item)  # ['feest6811', 'kertzmann3129', 'okuneva5222']
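How can I combine the two capture groups into a single findall (or finditer) call?
Solution
Join the two groups with .*? so that the text between the host and the user name is skipped: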
(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?(?P<user_name>[a-zA-Z]+[0-9]+)
See proof.
Explanation
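As a minimal sketch of how the combined pattern could be used from Python, the snippet below runs it through finditer on the sample text from the question and collects both named groups per match. The variable names `pattern` and `records` are illustrative choices, not part of the original post.

```python
import re

# Sample log lines from the question
text = """146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701"""

# Both named groups, joined by a lazy .*? that skips the " - " separator
pattern = re.compile(
    r"(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?(?P<user_name>[a-zA-Z]+[0-9]+)"
)

# groupdict() maps each group name to the text it captured
records = [m.groupdict() for m in pattern.finditer(text)]
for record in records:
    print(record)
```

Calling re.findall with the same pattern returns a list of (host, user_name) tuples instead of match objects, so finditer with groupdict() is the more convenient choice when the group names matter.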
--------------------------------------------------------------------------------
  (?P<host>                group and capture to \k<host>:
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \k<host>
--------------------------------------------------------------------------------
  .*?                      any character except \n (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (?P<user_name>           group and capture to \k<user_name>:
--------------------------------------------------------------------------------
    [a-zA-Z]+                any character of: 'a' to 'z', 'A' to 'Z'
                             (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [0-9]+                   any character of: '0' to '9' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<user_name>
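The breakdown above can also be kept next to the code by writing the same pattern with the re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. This is only a sketch of one possible layout; the name `LOG_PATTERN` and the single-line `line` sample are illustrative assumptions.

```python
import re

# Same pattern as in the solution, spread out with re.VERBOSE so each
# piece carries its own comment
LOG_PATTERN = re.compile(
    r"""
    (?P<host>            # named group: host
        \d{1,3} \.       # 1 to 3 digits, then a literal dot
        \d{1,3} \.
        \d{1,3} \.
        \d{1,3}
    )
    .*?                  # as few characters as possible, never a newline
    (?P<user_name>       # named group: user_name
        [a-zA-Z]+        # one or more letters
        [0-9]+           # one or more digits
    )
    """,
    re.VERBOSE,
)

# First sample line from the question
line = '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622'

match = LOG_PATTERN.search(line)
# group() with several names returns a tuple of the captured strings
host, user_name = match.group("host", "user_name")
print(host, user_name)
```

Because .*? is lazy and does not match a newline without re.DOTALL, each match stays within a single log line.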