python - regex, on match strip and capture?
Problem description
I have a working block of code, but something tells me it's not the most efficient.
- start with a few strings
- if it has DBA or ATTN followed by at least any 2 characters, capture DBA or ATTN to the end of line, don't look at the next string
- strip out what was just captured
What I have below seems to do that just fine.
import re
alt_name = ""
name1 = "JUST A NAME"
name2 = "UNITED STATES STORE DBA USA INC"
name3 = "ANOTHER FIELD"
regex = re.compile(r"\b(DBA\b.{2,})|\b(ATTN\b.{2,})")
if re.search(regex, name1):
    match = re.search(regex, name1)
    alt_name = match.group(0)
    name1 = re.sub(regex, "", name1)
elif re.search(regex, name2):
    match = re.search(regex, name2)
    alt_name = match.group(0)
    name2 = re.sub(regex, "", name2)
elif re.search(regex, name3):
    match = re.search(regex, name3)
    alt_name = match.group(0)
    name3 = re.sub(regex, "", name3)
print(name1)
print(name2)
print(name3)
print(alt_name)
Is there a way to capture and strip with just 1 line instead of searching, matching and then subbing? I'm looking for efficiency and readability. Just making it short to be clever isn't what I'm going for. Maybe this is just the way to do it?
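Solution
You can pass a method as the replacement argument to re.sub.
Save the matched text to a variable inside that method, and if you want to remove the match, just return an empty string.
However, you should rewrite your pattern to make it more efficient:
r"\s*\b(?:DBA|ATTN)\b.{2,}"
See the regex demo.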
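- \s* - 0+ whitespace characters
- \b - a word boundary
- (?:DBA|ATTN) - a DBA or ATTN substring
- \b - a word boundary
- .{2,} - 2 or more characters other than a line feed (LF), as many as possible.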
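To make the breakdown above concrete, here is a small sketch that runs the pattern against a few strings. The sample strings "STORE DBA" and "ATTN JOHN" are made up for illustration and are not from the question.

```python
import re

# Pattern from the answer: optional leading whitespace, DBA or ATTN as a
# whole word, then at least 2 more characters up to the end of the line.
rx = re.compile(r"\s*\b(?:DBA|ATTN)\b.{2,}")

# "STORE DBA" and "ATTN JOHN" are illustrative inputs (assumptions).
samples = ["JUST A NAME", "UNITED STATES STORE DBA USA INC", "STORE DBA", "ATTN JOHN"]

results = {}
for s in samples:
    m = rx.search(s)
    # Store the matched text, or None when the pattern does not match.
    results[s] = m.group(0) if m else None

for s, matched in results.items():
    print(repr(s), "->", repr(matched))
```

Note that the match includes the whitespace before DBA or ATTN, which is why the answer's example calls .lstrip() on the captured text. "STORE DBA" does not match because nothing follows DBA, so .{2,} cannot be satisfied.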
Here is an example:
import re

class RegexMatcher:
    val = ''
    rx = re.compile(r"\s*\b(?:DBA|ATTN)\b.{2,}")

    def runsub(self, m):
        self.val = m.group(0).lstrip()
        return ""

    def process(self, s):
        return self.rx.sub(self.runsub, s)

rm = RegexMatcher()
name = "UNITED STATES STORE DBA USA INC"
print(rm.process(name))
print(rm.val)
See the Python demo.
It may make more sense to make val a list variable and then call .append(m.group(0).lstrip()) on it.
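A minimal sketch of that list-based variant might look like the following. The class name RegexCollector and the attribute vals are hypothetical names chosen here, and unlike the code in the question this version processes every string rather than stopping at the first match.

```python
import re

class RegexCollector:
    """Hypothetical variant of RegexMatcher that keeps every match in a list."""
    rx = re.compile(r"\s*\b(?:DBA|ATTN)\b.{2,}")

    def __init__(self):
        # Per-instance list, so separate collectors do not share matches.
        self.vals = []

    def runsub(self, m):
        # Record the match without its leading whitespace, then delete it.
        self.vals.append(m.group(0).lstrip())
        return ""

    def process(self, s):
        return self.rx.sub(self.runsub, s)

rc = RegexCollector()
names = ["JUST A NAME", "UNITED STATES STORE DBA USA INC", "ANOTHER FIELD"]
cleaned = [rc.process(n) for n in names]
print(cleaned)  # ['JUST A NAME', 'UNITED STATES STORE', 'ANOTHER FIELD']
print(rc.vals)  # ['DBA USA INC']
```

Creating the list in __init__ rather than as a class attribute is a deliberate choice here: a class-level list would be shared by all instances.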