How do I add quotes only around the substrings of interest?

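Problem description

I wrote a parser that extracts information from a string. I can't work out how to add quotes around the substrings of interest. Let me illustrate with an example:

I receive this message: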

message = 'I have two variables: -mass: 12 --vel= 18 OR this is just another descriptor AND that new thing OR that newfangled thing' 

I need to add quotes around specific substrings (those that follow a Boolean operator), like this:

message = 'I have two variables: -mass: 12 --vel= 18 OR "this is just another descriptor" AND "that new thing" OR "that newfangled thing"' 

This is what I have tried:

attributes = ['OR', 'AND', 'NOT']
message = 'I have two variables: -mass: 12 --vel= 18 OR this is just another descriptor AND that new thing OR that new fangled thing'
for attribute in attributes:
        modified_attribute = ' '+attribute+' '
        message = modified_attribute.join('"{}"'.format(s.strip()) for s in message.split(attribute))
        if attributes.index(attribute)>0: message = message[1:-1]

print(message)

However, it returns this, which is not what I want:

"I have two variables: -mass: 12 --vel= 18" OR "this is just another descriptor" AND "that new thing" OR "that new fangled thing"

The first segment above should not be quoted, because it is not preceded by a Boolean operator. What should I do?

Edit: I am looking for a scalable solution that can quote any number of substrings in a string.

Tags: python, string

Solution


You can use a regular expression with a lookahead, like this:

import re
# attributes and message are the variables defined in the question
message = re.sub(r'(\b(?:{0})\b) (.*?)(?=\s*\b(?:{0}|$)\b)'.format('|'.join(map(re.escape, attributes))), r'\1 "\2"', message)

message then becomes:

I have two variables: -mass: 12 --vel= 18 OR "this is just another descriptor" AND "that new thing" OR "that new fangled thing"
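For reuse with any number of operators, the same idea can be wrapped in a function. The following is only a sketch under some assumptions: the name quote_after_operators and its operators parameter are hypothetical, and the lookahead is rearranged slightly so that the end-of-string test sits outside the word-boundary check, which lets a message ending in punctuation still get its last segment quoted.

```python
import re


def quote_after_operators(message, operators=("OR", "AND", "NOT")):
    """Wrap the text that follows each Boolean operator in double quotes.

    Hypothetical helper: the name and the default operator list are
    illustrative choices, not part of the original answer.
    """
    # Escape each operator so it is matched literally, then build an alternation.
    ops = "|".join(map(re.escape, operators))
    pattern = re.compile(
        r"\b(" + ops + r")\b\s+"                # group 1: an operator as a whole word
        r"(.*?)"                                 # group 2: shortest run of text after it
        r"(?=\s*(?:\b(?:" + ops + r")\b|$))"    # stop before the next operator or the end
    )
    # Re-emit the operator, then the captured text inside double quotes.
    return pattern.sub(r'\1 "\2"', message)


message = ('I have two variables: -mass: 12 --vel= 18 OR this is just another '
           'descriptor AND that new thing OR that new fangled thing')
print(quote_after_operators(message))
```

Text before the first operator is never matched, so it stays unquoted. One limitation of this sketch: two operators in a row, such as AND NOT, leave an empty pair of quotes after the first one.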
