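Search for a letter in the words of a first list and replace it with the correct word from a second list in Python

Problem description

I want to write a program that takes two arguments from input: the first is a sentence (a string) and the second is a list of words. In one part of the problem, if a word in the sentence can be made to match one of the words in the list by substituting, adding, or deleting a character, that word must be replaced with the correct word from the list. I don't know how to handle the case where the letter is in the middle of the word!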

def listToString(s):
    str1 = " "
    return str1.join(s)

str_True = input("please enter true string :")
lst_True = []
main_lst = []
while True:
    word = input("please enter a word:")
    if word == "end":
        break
    else:
        lst_True.append(word)
lst_str = list(str_True.split(" "))
print(lst_str)
for i in range(len(lst_str)):
    for j in range(len(lst_True)):
        if (len(lst_str)) > 2 and lst_str[i] != lst_True[j]:
            # The last two conditions are the attempt at the middle-letter case;
            # i and j are list indices, not character positions, so they do not work in general.
            if (lst_str[i] == lst_True[j][:-1] or lst_str[i] == lst_True[j][1:] or
                    lst_True[j] == lst_str[i][:-1] or lst_True[j] == lst_str[i][1:] or
                    lst_str[i] == lst_True[j][:j] + lst_True[j][j+1:] or
                    lst_True[j] == lst_str[i][:i] + lst_str[i][i+1:]):
                lst_str[i] = lst_True[j]
print(lst_str)
print(listToString(lst_str))

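Tags: python

Solution


What you are describing is the Levenshtein distance. The Levenshtein distance is a measure of similarity between words: given two words, it counts the minimum number of insertions, deletions, or substitutions needed to turn one word into the other.

So you can do the following: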

def levenshteinDistanceDP(token1, token2):
    distances = [[0]*(len(token2)+1) for _ in range(len(token1)+1)]

    for t1 in range(len(token1) + 1):
        distances[t1][0] = t1

    for t2 in range(len(token2) + 1):
        distances[0][t2] = t2

    a = 0
    b = 0
    c = 0

    for t1 in range(1, len(token1) + 1):
        for t2 in range(1, len(token2) + 1):
            if (token1[t1-1] == token2[t2-1]):
                distances[t1][t2] = distances[t1 - 1][t2 - 1]
            else:
                a = distances[t1][t2 - 1]
                b = distances[t1 - 1][t2]
                c = distances[t1 - 1][t2 - 1]

                if (a <= b and a <= c):
                    distances[t1][t2] = a + 1
                elif (b <= a and b <= c):
                    distances[t1][t2] = b + 1
                else:
                    distances[t1][t2] = c + 1

    return distances[len(token1)][len(token2)]
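
To see that this kind of distance covers a letter in the middle of a word, a quick sanity check may help. The sketch below is not the function above but a compact variant that keeps only one previous row of the table; the name `edit_distance` is hypothetical and chosen here for illustration so that the snippet runs on its own.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, keeping only one previous row."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


print(edit_distance("helo", "hello"))      # letter missing in the middle -> 1
print(edit_distance("wprld", "world"))     # wrong letter in the middle -> 1
print(edit_distance("kitten", "sitting"))  # -> 3
```

The position of the edit does not matter: a missing or wrong letter in the middle costs exactly as much as one at either end.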

Now your code:

lst_str = input("please enter true string :").split()  # words of the sentence, in order
lst_True = []  # the list of correct words

while True:
    word = input("please enter a word:")
    if word == "end":
        break
    else:
        lst_True.append(word)

original = lst_str.copy()


for i in range(len(lst_str)):
    if lst_str[i] not in lst_True:  # a word that is already in the list needs no correction
        for word in lst_True:
            if levenshteinDistanceDP(lst_str[i], word) <= 1:
                lst_str[i] = word
                break  # stop at the first match so the corrected word is not compared again

print("Original:", *original)
print("Result:", *lst_str)
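
Because the threshold here is a single edit, the full distance table is not strictly required. As a rough sketch of an alternative (not the method used above), the two words can be compared directly: skip their common prefix, then check that the remainders match after one substitution or one extra character. The names `is_one_edit_away` and `correct_sentence` are hypothetical, and taking the first matching word from the list is an assumption about the desired behaviour.

```python
def is_one_edit_away(a, b):
    """Return True if a and b differ by at most one insert, delete or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # substitution at position i (or identical words)
    return a[i:] == b[i + 1:]  # b has one extra character at position i


def correct_sentence(sentence, words):
    """Replace each word of sentence by the first entry of words one edit away."""
    result = []
    for token in sentence.split():
        if token not in words:
            token = next((w for w in words if is_one_edit_away(token, w)), token)
        result.append(token)
    return " ".join(result)


print(correct_sentence("helo wrld this is fine", ["hello", "world"]))
# -> hello world this is fine
```

Wrapping the logic in a function that takes the sentence and the word list as parameters also makes it easier to test than a script driven by `input()`.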

You can learn more here: https://blog.paperspace.com/implementing-levenshtein-distance-word-autocomplete-autocorrect/

