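python - Get a list of struct names outside the optional 'package' and 'endpackage' strings
Problem description
I am trying to get the names of the structs that lie outside the optional package and endpackage strings. If there are no package and endpackage strings, the script should return all struct names.
Here is my script: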
import re
a = """
package new;
typedef struct packed
{
logic a;
logic b;
} abc_y;
typedef struct packed
{
logic a;
logic b;
} abc_t;
endpackage
typedef struct packed
{
logic a;
logic b;
} abc_x;
"""
print(re.findall(r'(?!package)*.*?typedef\s+struct\s+packed\s*{.*?}\s*(\w+);.*?(?!endpackage)*', a, re.MULTILINE|re.DOTALL))
This is the output:
['abc_y', 'abc_t', 'abc_x']
Expected output:
['abc_x']
I am missing something in my regex, but I can't figure out what. Can anyone help me with this? Thanks in advance.
Solution
Use
\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);
See the regex proof.
Explanation
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
package 'package'
--------------------------------------------------------------------------------
.*? any character, including \n since re.DOTALL
is set (0 or more times (matching the
least amount possible))
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
endpackage 'endpackage'
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
typedef 'typedef'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
struct 'struct'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
packed 'packed'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
{ '{'
--------------------------------------------------------------------------------
[^{}]* any character except: '{', '}' (0 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
} '}'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
; ';'
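One detail the breakdown does not spell out: the pattern has a single capture group, so re.findall returns only that group for each match. When the first alternative (the whole package ... endpackage block) is what matched, the group did not take part and findall reports an empty string for it. A minimal sketch of that behaviour on a shortened sample (the names PATTERN, sample and raw are only for illustration):

```python
import re

# Same pattern as in the answer: branch 1 swallows a whole package block,
# branch 2 matches a packed struct typedef and captures its name.
PATTERN = r'\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);'

# Shortened version of the input from the question (illustrative only).
sample = """
package new;
typedef struct packed
{
logic a;
} abc_y;
endpackage
typedef struct packed
{
logic a;
} abc_x;
"""

# re.DOTALL lets .*? run across newlines inside the package block.
raw = re.findall(PATTERN, sample, re.DOTALL)
print(raw)  # ['', 'abc_x']
```

The empty string comes from the package block that was matched and skipped, which is why the raw list has to be cleaned up before use.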
Python code:
print(list(filter(None,re.findall(r'\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);', a, re.DOTALL))))
Result: ['abc_x']
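The same idea can be wrapped in a small helper so that both cases from the question (with and without a package block) are easy to check. This is only a sketch: the function name struct_names_outside_packages and the constant STRUCT_OR_PACKAGE are made up here, and it uses re.finditer with an explicit None check rather than filter(None, ...). It assumes every package has a matching endpackage and that the word package does not show up in comments or string literals.

```python
import re

# Hypothetical helper built around the pattern from the answer.
STRUCT_OR_PACKAGE = re.compile(
    r'\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);',
    re.DOTALL,
)


def struct_names_outside_packages(source):
    """Return names of packed structs that are not inside a package block."""
    names = []
    for match in STRUCT_OR_PACKAGE.finditer(source):
        # group(1) is None when the package...endpackage branch matched.
        if match.group(1) is not None:
            names.append(match.group(1))
    return names


with_package = """
package new;
typedef struct packed
{
logic a;
logic b;
} abc_y;
typedef struct packed
{
logic a;
logic b;
} abc_t;
endpackage
typedef struct packed
{
logic a;
logic b;
} abc_x;
"""

without_package = """
typedef struct packed
{
logic a;
} abc_y;
typedef struct packed
{
logic a;
} abc_x;
"""

print(struct_names_outside_packages(with_package))     # ['abc_x']
print(struct_names_outside_packages(without_package))  # ['abc_y', 'abc_x']
```

With no package block in the input, the first alternative never matches, so every struct name is returned, which is the fallback behaviour the question asks for.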