Offline dictionary program: finding similar words and words that start with the same letters

Question

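I wrote this offline dictionary program. When the user presses a key, I want the program to look in the database and find a word close to what the user has typed so far. Or, when the user has typed a complete word and that word is in the database, the program should show its meaning.

That part works fine. Next, I want the program to show all words in the database that start with "a" when, for example, the user types "a".

Here is an example of my problem: when we type "a", all the words starting with "a" and their meanings should be displayed. Instead, the program shows this: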

[screenshot of the program output]

Here is part of my database, in JSON format:

{"apple": ["Apple", "apple", "Sib", "Apfel", "Des pommes"], "average": ["Average", "average", "Miangin", "Durchschnitt", "Des pommes"], "acknowledge": ["Acknowledge", "acknowledge", "Tasdigh Kardan", "Zu bestatigen", "Pour reconnaître"], "book": ["Book", "book", "Ketab", "Buch", "Livre"], "banana": ["Banana", "banana", "Mouz", "Bananen", "Bananes"], "beach grass": ["Beach Grass", "beach grass", "Chamane Sahel", "Strandhafer", "herbe de plage"], "cat": ["Cat", "cat", "Gorbe", "Katzen", "chatte"], "certificate": ["Certificate", "certificate", "Govahi Name", "Zertifikat", "certificat"], "declaration of conformity": ["Declaration Of Conformity", "declaration of conformity", "Elamie Entebagh", "Konformitatserklarung", "déclaration de conformité"], "database": ["Database", "database", "Paygah Dade", "Datenbank", "base de données"], "dear colleagues": ["Dear Colleagues", "dear colleagues", "Hamkarane Aziz", "Liebe Mitarbeiterinnen und Mitarbeiter", "Chers collègues"]}

In this dictionary, every word has its meaning in English, Persian, French and German.

You can see my code below:

import json
import msvcrt
import os 
from difflib import get_close_matches

DataBase = json.load(open("DataBase.json"))

def getMeaning(w):

    w = w.lower()
    n = len(w)

    if w in DataBase:
        return DataBase[w]

    elif len(get_close_matches(w,DataBase.keys(),1,0.3)) > 0:
        close_match = get_close_matches(w,DataBase.keys(),1,0.3)[0]
        print("Not Found!\nCheck The Close Match:\n")
        return DataBase[close_match]

    else:
        print ("Not Found!\n")
        res = [value for key, value in DataBase.items()]
        for i in res:
            for j in i:
                if w in j[0:n].lower(): 
                     print(j)
        return ''

word = '' 
while True:
    if msvcrt.kbhit():
        temp = msvcrt.getwch()
        word += temp
        os.system('cls')
        print(word)
        print("\n")
        meaning = getMeaning(word)
        for item in meaning:
            print(item)

Note that you have to run this program in CMD for it to work properly, because of msvcrt.kbhit().

Tags: python, dictionary, difflib

Answer


If someone enters "a", you call getMeaning, which in turn calls get_close_matches. You then check whether that call returned a non-empty list, and if it did, you return DataBase[close_match]. That is where getMeaning ends.

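If get_close_matches yields a result, you never reach the else part. In the screenshot in your question we can see the result of getMeaning for the user input "a", which makes sense because get_close_matches finds "cat" similar to "a".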
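To see why "cat" wins for the input "a", the similarity scores can be computed directly. This is only a sketch: it assumes get_close_matches ranks candidates by the ratio of difflib.SequenceMatcher, as the difflib documentation describes, and it uses a shortened sample of the database keys. The helper name similarity is made up for illustration.

```python
from difflib import SequenceMatcher

# Shortened sample of the database keys (assumption: enough to show the effect)
keys = ["apple", "average", "book", "cat"]


def similarity(candidate, word):
    # Hypothetical helper: ratio is 2 * matches / total length of both strings
    return SequenceMatcher(None, candidate, word).ratio()


# Score every key against the single-letter input "a"
scores = {key: similarity(key, "a") for key in keys}
best = max(scores, key=scores.get)

for key, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{key}: {score:.2f}")
print("best match:", best)
```

Because the ratio is normalised by the combined length of both strings, a short key that contains the letter, such as "cat", scores higher than a longer key that actually starts with it, such as "apple".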

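Setting that aside, you should use startswith if you want to test whether a string starts with another string. Also, you don't need elif or else after a preceding if or elif branch that ends in a return, and I have changed the names according to the PEP 8 section "Descriptive: Naming Styles".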
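The two points about startswith and early returns can be sketched as follows. The function name classify and the sample word list are invented for this illustration and do not appear in the question's code.

```python
# Sample keys for illustration only (assumption: a subset of the real database)
known_words = ["apple", "average", "book", "cat"]


def classify(word, words):
    # Hypothetical helper showing early returns without elif/else
    word = word.lower()

    # An exact hit returns immediately, so nothing below needs an elif
    if word in words:
        return "exact"

    # str.startswith states the intent more directly than slicing plus "in"
    if any(entry.startswith(word) for entry in words):
        return "prefix"

    # Reached only when neither branch above returned
    return "unknown"


print(classify("cat", known_words))
print(classify("A", known_words))
print(classify("zz", known_words))
```

Each branch that returns makes the following code an implicit else, which keeps the function flat.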

Here is a possible solution, using a filter that accepts a close match only if it consists of the same letters as word:

from difflib import get_close_matches

database = {"apple": ["Apple", "apple", "Sib", "Apfel", "Des pommes"], "average": ["Average", "average", "Miangin", "Durchschnitt", "Des pommes"], "acknowledge": ["Acknowledge", "acknowledge", "Tasdigh Kardan", "Zu bestatigen", "Pour reconnaître"], "book": ["Book", "book", "Ketab", "Buch", "Livre"], "banana": ["Banana", "banana", "Mouz", "Bananen", "Bananes"], "beach grass": ["Beach Grass", "beach grass", "Chamane Sahel", "Strandhafer", "herbe de plage"], "cat": ["Cat", "cat", "Gorbe", "Katzen", "chatte"], "certificate": ["Certificate", "certificate", "Govahi Name", "Zertifikat", "certificat"], "declaration of conformity": ["Declaration Of Conformity", "declaration of conformity", "Elamie Entebagh", "Konformitatserklarung", "déclaration de conformité"], "database": ["Database", "database", "Paygah Dade", "Datenbank", "base de données"], "dear colleagues": ["Dear Colleagues", "dear colleagues", "Hamkarane Aziz", "Liebe Mitarbeiterinnen und Mitarbeiter", "Chers collègues"]}

def get_meaning(word):

    # Make word case-insensitive
    word = word.lower()

    # Check if word already in database
    if word in database:
        return {word: database[word]}

    # Find possible close matches
    close_matches = get_close_matches(word, database.keys(), 1, 0.3)
    # Filter matches: keep only those which contain the same letters
    close_matches = [
        close_match
        for close_match in close_matches
        if set(close_match) == set(word)
    ]
    # Return close matches if any left
    if close_matches:
        return {
            close_match: database[close_match]
            for close_match in close_matches
        }

    # Return all dictionary entries which start with the word
    return {
        entry: database[entry]
        for entry in database
        if entry.startswith(word)
    }

Now "a" no longer produces "cat":

>>> get_meaning("a")
{'apple': ['Apple', 'apple', 'Sib', 'Apfel', 'Des pommes'], 'average': ['Average', 'average', 'Miangin', 'Durchschnitt', 'Des pommes'], 'acknowledge': ['Acknowledge', 'acknowledge', 'Tasdigh Kardan', 'Zu bestatigen', 'Pour reconnaître']}

"applle" is still recognized as "apple":

>>> get_meaning("applle")
{'apple': ['Apple', 'apple', 'Sib', 'Apfel', 'Des pommes']}

Alternatively, you could change the cutoff argument of the get_close_matches call to get different results.
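As a rough illustration of how cutoff changes the outcome, the sketch below compares the question's value of 0.3 with 0.6, which is the default of get_close_matches. The word list is a shortened sample of the database keys, and the variable names are made up for this example.

```python
from difflib import get_close_matches

# Shortened sample of the database keys (assumption)
words = ["apple", "average", "book", "cat"]

# cutoff=0.3 as in the question: loose enough that "a" matches "cat"
loose = get_close_matches("a", words, n=1, cutoff=0.3)

# cutoff=0.6 (the library default): "a" has no close match any more
strict = get_close_matches("a", words, n=1, cutoff=0.6)

# A real typo still clears the stricter threshold
typo = get_close_matches("applle", words, n=1, cutoff=0.6)

print(loose, strict, typo)
```

With a stricter cutoff, a single letter falls through to the prefix search while typos such as "applle" are still corrected, so the letter-set filter may not be needed at all. Whether 0.6 suits the full database is something to check against the real data.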

