python - Regex similar to a Hearst Pattern in Python
Problem description
I'm trying to come up with a regex similar to the ones listed here for Hearst Patterns in order to extract results from the following sentences:
NP_The_Eleventh_Air_Force is NP_a_Numbered_Air_Force of NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF).
NP_The_Eleventh_Air_Force (NP_11_AF) is NP_a_Numbered_Air_Force of NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF).
Running re.search(regex, sentence) on each of these sentences, I want to capture these 2 groups: NP_The_Eleventh_Air_Force and NP_a_Numbered_Air_Force.
This is my attempt, but it doesn't get any matches:
(NP_\\w+ (, )?is (NP_\\w+ ?))
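Solution
In both sentences I think the (, )? part is not present, but the second sentence does have a parenthesized part before " is", so you could make that part optional instead.
Also move the last closing parenthesis from the )) at the end of the pattern to right after the first NP_\w+, giving (NP_\w+), to create the first group.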
The pattern including the optional comma and space could be:
(NP_\w+)(?: \([^()]+\))? (?:, )?is (NP_\w+ ?)
If you don't need the space at the end and the comma and space are not present, your pattern could be:
(NP_\w+)(?: \([^()]+\))? is (NP_\w+)
The pattern in parts:
- (NP_\w+) : Capture group 1, match NP_ and 1+ word chars
- (?: \([^()]+\))? : Optionally match a space and a part between parentheses
- " is " : Match literally, including the surrounding spaces
- (NP_\w+) : Capture group 2, match NP_ and 1+ word chars
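To make the difference between the two patterns concrete, here is a small sketch that runs both against the second sentence from the question. The variable names (with_space, without_space, loose, strict) are only illustrative and are not part of the original answer.

```python
import re

# First variant: optional ", " before "is" and an optional trailing space in group 2
with_space = re.compile(r"(NP_\w+)(?: \([^()]+\))? (?:, )?is (NP_\w+ ?)")
# Second variant: no comma handling, no trailing space in group 2
without_space = re.compile(r"(NP_\w+)(?: \([^()]+\))? is (NP_\w+)")

sentence = (
    "NP_The_Eleventh_Air_Force (NP_11_AF) is NP_a_Numbered_Air_Force "
    "of NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF)."
)

loose = with_space.search(sentence)
strict = without_space.search(sentence)

# repr() makes a trailing space visible in the printed value
print(repr(loose.group(2)))   # 'NP_a_Numbered_Air_Force '
print(repr(strict.group(2)))  # 'NP_a_Numbered_Air_Force'
```

Both variants skip the parenthesized (NP_11_AF) part, and the first one keeps the trailing space inside group 2, which is why the second pattern is usually the more convenient one here.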
For example
import re
regex = r"(NP_\w+)(?: \([^()]+\))? is (NP_\w+)"
test_str = "NP_The_Eleventh_Air_Force is NP_a_Numbered_Air_Force of NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF)."
matches = re.search(regex, test_str)
if matches:
    print(matches.group(1))
    print(matches.group(2))
Output
NP_The_Eleventh_Air_Force
NP_a_Numbered_Air_Force
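The same pattern can also be checked against both sentences from the question at once. The following is a minimal sketch; the helper name extract_pair and the non-matching sample sentence are assumptions made for illustration only.

```python
import re

# Compile once so the pattern can be reused for every sentence
PATTERN = re.compile(r"(NP_\w+)(?: \([^()]+\))? is (NP_\w+)")

def extract_pair(sentence):
    """Return (group 1, group 2) for the first match, or None if nothing matches."""
    match = PATTERN.search(sentence)
    return match.groups() if match else None

sentences = [
    "NP_The_Eleventh_Air_Force is NP_a_Numbered_Air_Force of "
    "NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF).",
    "NP_The_Eleventh_Air_Force (NP_11_AF) is NP_a_Numbered_Air_Force of "
    "NP_the_United_States_Air_Force_Pacific_Air_Forces (NP_PACAF).",
    "This sentence has no tagged noun phrases.",  # hypothetical non-matching input
]

results = [extract_pair(s) for s in sentences]
for result in results:
    print(result)
```

The first two sentences give the same pair of groups, and the last one gives None, so callers should check for None before unpacking the result.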