How do I replace strings in a sentence with the appropriate dictionary values?

Problem description

I have a dictionary as follows:

dict_ = { 
        'the school in USA' : 'some_text_1',
        'school' : 'some_text_1',
        'the holy church in brisbane' : 'some_text_2',
        'holy church' : 'some_text_2'
}

And a list of sentences as follows:

text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the school in USA",
             "Jennifer is going to the school"]

I want to replace every occurrence of a dict_ key in text_sent with its corresponding value. I do this as follows:

import re

for ind, text in enumerate(text_sent):
    for iterator in dict_.keys():
        if iterator in text:
            text_sent[ind] = re.sub(iterator, dict_[iterator], text)

for i in text_sent:
    print(i)

The output I get is:

Ram is going to the some_text_2 in brisbane
John is going to some_text_2
shena is going to the some_text_1 in USA
Jennifer is going to the some_text_1

The expected output is:

Ram is going to some_text_2
John is going to some_text_2
shena is going to some_text_1
Jennifer is going to some_text_1

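What I need is for the longer string (for example, 'the holy church in brisbane') to be replaced whenever it appears. Only if the full string is not present in the sentence should the shorter version (for example, 'holy church') be matched and replaced with its corresponding value in the sentences of text_sent.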
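The rule described above (prefer the longest key, fall back to a shorter one) can be sketched without regular expressions. This is only an illustration: `replace_longest_first` is a hypothetical helper name, and it assumes at most one phrase per sentence needs replacing.

```python
def replace_longest_first(sentence, mapping):
    """Replace the longest key of `mapping` found in `sentence` with its value.

    Hypothetical helper: it stops after the first (longest) matching key,
    assuming at most one phrase per sentence needs replacing.
    """
    for key in sorted(mapping, key=len, reverse=True):
        if key in sentence:
            return sentence.replace(key, mapping[key])
    return sentence  # no key occurs in the sentence


dict_ = {
    'the school in USA': 'some_text_1',
    'school': 'some_text_1',
    'the holy church in brisbane': 'some_text_2',
    'holy church': 'some_text_2',
}
text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the school in USA",
             "Jennifer is going to the school"]

result = [replace_longest_first(s, dict_) for s in text_sent]
for line in result:
    print(line)
```

Note that the last sentence still comes out as "Jennifer is going to the some_text_1", because the dictionary has no key that covers "the school".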

Tags: python, string, list, dictionary

Solution


You can use re.sub for the replacement, with str.join building a regular expression (an alternation of the dictionary keys):

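import re

d = {'the school in USA': 'some_text_1', 'school': 'some_text_1', 'the holy church in brisbane': 'some_text_2', 'holy church': 'some_text_2'}
text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the School in USA",
             "Jennifer is going to the school"]

# Lowercase the keys so that case-insensitive matches can still be looked up.
lookup = {k.lower(): v for k, v in d.items()}
# Try longer keys first so a full phrase wins over its shorter variant.
# re.I must be passed as flags=...; the fourth positional argument of re.sub is count.
pattern = re.compile('|'.join(re.escape(k) for k in sorted(lookup, key=len, reverse=True)), flags=re.I)
r = [pattern.sub(lambda m: lookup[m.group().lower()], s) for s in text_sent]
print(r)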

Output (the last sentence keeps "the" because d has no key that covers "the school"):

['Ram is going to some_text_2', 'John is going to some_text_2', 'shena is going to some_text_1', 'Jennifer is going to the some_text_1']
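As a side note on why key order can matter: Python's `re` module tries the alternatives of `a|b` from left to right at each position, so when one key is a prefix of another, the shorter key can win if it comes first. The following minimal sketch illustrates this with made-up keys and placeholder values (`SHORT`, `LONG`) that are not part of the original question.

```python
import re

# Illustrative keys where one is a prefix of the other (not from the question).
keys = {'holy church': 'SHORT', 'holy church in brisbane': 'LONG'}
sentence = "Ram is going to holy church in brisbane"


def substitute(ordered_keys):
    # Build an alternation in the given order; re.escape guards special characters.
    pattern = '|'.join(re.escape(k) for k in ordered_keys)
    return re.sub(pattern, lambda m: keys[m.group()], sentence)


# Shorter key listed first: it matches before the longer one is tried.
unsorted_result = substitute(['holy church', 'holy church in brisbane'])
# Longest key first: the full phrase is replaced.
sorted_result = substitute(sorted(keys, key=len, reverse=True))

print(unsorted_result)  # Ram is going to SHORT in brisbane
print(sorted_result)    # Ram is going to LONG
```

Sorting the keys by length in descending order before joining them is one simple way to make the longest phrase take priority.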
