python - regex works fine on the limited input but script hangs when whole input file is given
Problem description
I am writing a script that uses a regex to extract the desired text from different blocks of input, but the regex doesn't handle the whole input file correctly and the script hangs. Can someone please help me fix the regex?
This is my script:
import re
a = """
abc # (.C (1),
.H (1)
)
xyz [M-1:0]
(.a (a),
.y (y),
.e (e)
);
por
chk [N-1:0] (/*AUTOINST*/
// Outputs
elem line[E-1:0] (.en (en [E-1:0]),
generate
for(i = 0) begin: check
check
#(
.F (1::t),
.C (1),
.H (1)
)
data_check
(// Outputs
except_check
#(
.a (m),
.b (w),
.e (1)
)
data_check
(// Outputs
block1
#(/*AUTOINSTPARAM*/
// Parameters
.THREE (3), // comment
.TWO (2), // comment
.ONE (1)) // comment
inst1
(/*AUTOINST*/
// extra
// output
block2
#(/*AUTOINSTPARAM*/
// Parameters
.THREE (3), // comment
.TWO (2), // comment
.ONE (1)) // comment
inst2
(/*AUTOINST*/
// extra
// output
"""
op = re.findall(r'^\s*(\w+)\s*\n*(?:\s*[^\w\s].*\n*)*\s*(\w+)\s*(?:\[.*\])*\s*\(', a, re.MULTILINE)
for i in op:
print(i)
This is the output:
('abc', 'xyz')
('por', 'chk')
('elem', 'line')
('generate', 'for')
('check', 'data_check')
('except_check', 'data_check')
('block1', 'inst1')
('block2', 'inst2')
Now, if I add the following lines at the end of the input string a in the script, the script just hangs and I need to kill it with Ctrl+C.
a = """
abc # (.C (1),
.H (1)
)
< copy same as above and add following at the end >
output [`X-1:0] o, // o
////////////////////////////////////
"""
After I kill it, I see this traceback:
^CTraceback (most recent call last):
File "1.py", line 66, in <module>
op = re.findall(r'^\s*(\w+)\s*\n*(?:\s*[^\w\s].*\n*)*\s*(\w+)\s*(?:\[.*\])*\s*\(', a, re.MULTILINE)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 181, in findall
return _compile(pattern, flags).findall(string)
KeyboardInterrupt
I am not sure what is so special about those two lines. I cannot figure out how to handle this scenario in the regex, and the whole input file can contain lines of that type, which the regex currently does not handle. The regex contains .* in several places, which might be causing the problem, but I am not sure. It would be great if someone could help me fix it.
Solution
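The hang is most likely catastrophic backtracking rather than anything special about the two new lines themselves. In the group (?:\s*[^\w\s].*\n*)* the trailing newline is optional, so a run of punctuation such as the long line of slashes can be split between iterations in exponentially many ways: each iteration may take one slash plus any number of following characters. Because no "(" follows those lines, the overall match has to fail, and the engine tries every split before giving up. The sketch below is one possible fix, not a definitive one: it assumes every continuation line ends with a newline and requires exactly one newline per iteration (.*\n instead of .*\n*), so each line can be consumed in only one way. The names PATTERN, SAMPLE and find_pairs are made up for illustration.

```python
import re

# Hypothetical variant of the pattern from the question. The only change is
# inside the repeated group: `.*\n` instead of `.*\n*`, so one iteration
# always consumes exactly one whole line and cannot be split differently.
PATTERN = re.compile(
    r'^\s*(\w+)\s*\n*(?:\s*[^\w\s].*\n)*\s*(\w+)\s*(?:\[.*\])*\s*\(',
    re.MULTILINE,
)

# Shortened input from the question, ending with the two lines that
# triggered the hang with the original pattern.
SAMPLE = """
abc # (.C (1),
.H (1)
)
xyz [M-1:0]
(.a (a),
.y (y),
.e (e)
);
por
chk [N-1:0] (/*AUTOINST*/
// Outputs
output [`X-1:0] o, // o
////////////////////////////////////
"""


def find_pairs(text):
    """Return (first_word, second_word) tuples, like re.findall in the question."""
    return PATTERN.findall(text)


for pair in find_pairs(SAMPLE):
    print(pair)
# prints:
# ('abc', 'xyz')
# ('por', 'chk')
```

With this change the search over the two extra lines fails quickly instead of hanging, and the shortened sample still yields the same pairs as before. One caveat of the assumption: a final continuation line without a trailing newline would no longer be consumed by the group, so it is worth checking the result against the full input file.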