Regex to load an integer number of "key value" pairs, e.g. 3 key1 "value1" key2 "value2" key3 "value3"

Problem description

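I'm struggling to create a regex that parses lines containing an integer followed by that integer number of values. I can get it mostly working, but not for the case where the integer is zero and no values follow.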

For example:

..... 2 "value1" "value2" "someother non-related text"
..... 0 "someother non-related text"

The same applies to an integer followed by that integer number of space-separated key-value pairs:

..... 3 key1 "value1" key2 "value2" key3 "value3"......

I'm happy to have them populate a single named group, but it may be useful later to have them in separate named groups.

3 "value1" "value2" "value3" "someother non-related text"

(?<my_named_group>([0])|[0-9] (?<my_values>(".*"?)?))

my_named_group = 3
my_values = '"value1" "value2" "value3"'

When the integer is zero:

my_named_group = 0
my_values = ""

For the second problem/regex:

3 key1 "value1" key2 "value2" key3 "value3" "someother non-related text"

my_named_group = 3
my_values = 'key1 "value1" key2 "value2" key3 "value3"'

Tags: regex

Solution


If I understand correctly, we have a number followed by some text in quotes, and we might start solving it with a simple expression:

([0-9]+).+?(\".*\")

Here the desired number is in the first capture group, ([0-9]+), and the other desired substring is in the second capture group, (\".*\").
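To see what each group holds, the following minimal sketch applies the expression above to the sample lines from the question, using Python's `re` module. The variable names are illustrative. Because `.*` is greedy, the second group runs to the last quote on the line, so for the `0` line it captures the unrelated text instead of an empty string, and for the key-value line the first key is consumed by `.+?`.

```python
import re

# Same expression as above; quotes need no escaping inside a raw string.
pattern = re.compile(r'([0-9]+).+?(".*")')

lines = [
    '2 "value1" "value2" "someother non-related text"',
    '0 "someother non-related text"',
    '3 key1 "value1" key2 "value2" key3 "value3"',
]

groups = []
for line in lines:
    m = pattern.search(line)
    groups.append((m.group(1), m.group(2)))
    print(m.group(1), "->", m.group(2))
```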

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([0-9]+).+?(\".*\")"

test_str = ("2 \"value1\" \"value2\" \"someother non-related text\"\n"
    "0 \"someother non-related text\"\n"
    "3 key1 \"value1\" key2 \"value2\" key3 \"value3\"")

# Replace each match with the number and the quoted text on separate lines
subst = "\\1\\n\\2"

# You can limit the number of replacements by changing the count argument (0 means no limit).
# count and flags are passed by keyword; passing them positionally is deprecated in recent Python 3 releases.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
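The expression above does not use the integer to limit how many values are captured, so the zero case still picks up the trailing text. One possible way around that is a two-step parse: read the integer first, then take exactly that many items from what follows. This is only a sketch under stated assumptions: the count is the first run of digits on the line, values contain no escaped quotes, and keys are plain word characters. The names `parse_counted`, `COUNT`, and `ITEM` are hypothetical and not part of the original answer.

```python
import re

# The count is assumed to be the first run of digits on the line.
COUNT = re.compile(r'(?P<count>[0-9]+)')
# An item is an optional bare key followed by a double-quoted value.
ITEM = re.compile(r'\s*(?:(?P<key>\w+)\s+)?"(?P<value>[^"]*)"')

def parse_counted(line):
    """Return (count, items); items is a list of (key_or_None, value)."""
    m = COUNT.search(line)
    if m is None:
        return None
    count = int(m.group("count"))
    pos = m.end()
    items = []
    for _ in range(count):
        item = ITEM.match(line, pos)  # match starting at offset pos
        if item is None:
            break  # fewer items than the count announced
        items.append((item.group("key"), item.group("value")))
        pos = item.end()
    return count, items

print(parse_counted('2 "value1" "value2" "someother non-related text"'))
print(parse_counted('0 "someother non-related text"'))
print(parse_counted('3 key1 "value1" key2 "value2" key3 "value3" "someother non-related text"'))
```

With a count of zero the loop never runs, so the item list is empty and the trailing quoted text is left alone. Returning the items as a list also makes it easy to place them in separate fields later, which a single regex with a fixed set of named groups cannot do for a variable count.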

