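python - Levenshtein function to find the closest name
Question
I need help with the code below. I need to find the word closest to an input word. To test it, I set word_0 to "pikaru", which should return "pikachu". The levenshtein_distance function returns the distance between the two words passed to it. When I run the code below, the answer I get is "charmander", which is far off. Any help would be appreciated.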
import backend

name_to_stats, id_to_name, names, pokemon_by_type = backend.get_pokemon_stats()
words = names
word_0 = 'pikaru'
def find_closest_word(word_0, words):
    """Finds the closest word in the list to word_0 as measured by the
    Levenshtein distance

    Args:
        word_0: a str
        words: a list of str
    Returns:
        The closest word in words to word_0 as a str.
    """
    # Hint: use the levenshtein_distance() function to help you out here.
    closest_word = words[0]
    #closest_distance = levenshtein_distance(word_0, words[0])
    for i in words:
        distance = levenshtein_distance(word_0, closest_word)
        new_distance = levenshtein_distance(word_0, i)
        if distance < new_distance:
            return i
def levenshtein_distance(s1, s2):
    """Returns the Levenshtein distance between strs s1 and s2

    Args:
        s1: a str
        s2: a str
    """
    # This function has already been implemented for you.
    # Source of the implementation:
    # https://stackoverflow.com/questions/2460177/edit-distance-in-python
    # If you'd like to know more about this algorithm, you can study it in
    # CSCC73 Algorithms. It applies an advanced technique called dynamic
    # programming.
    # For more information:
    # https://en.wikipedia.org/wiki/Levenshtein_distance
    # https://en.wikipedia.org/wiki/Dynamic_programming
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        distances_ = [i2+1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                distances_.append(distances[i1])
            else:
                distances_.append(1 + min((distances[i1], distances[i1 + 1],
                                           distances_[-1])))
        distances = distances_
    return distances[-1]
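Answer
It looks like the error is in the return statement of your find_closest_word function:
        if distance < new_distance:
            return i
The function does not find the closest word. It returns the first word in the list that is farther from word_0 than words[0] is, and it returns None if no such word exists.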
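To see the early return in action, here is a minimal sketch. The name list is a made-up sample (the real list comes from backend.get_pokemon_stats(), which is not shown in the question), and buggy_find_closest_word is a hypothetical name for a condensed copy of the question's loop.

```python
def levenshtein_distance(s1, s2):
    """Same dynamic-programming algorithm as in the question."""
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = list(range(len(s1) + 1))
    for i2, c2 in enumerate(s2):
        new_distances = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                new_distances.append(distances[i1])
            else:
                new_distances.append(1 + min(distances[i1],
                                             distances[i1 + 1],
                                             new_distances[-1]))
        distances = new_distances
    return distances[-1]


def buggy_find_closest_word(word_0, words):
    """Condensed copy of the question's logic, kept buggy on purpose."""
    closest_word = words[0]
    for i in words:
        distance = levenshtein_distance(word_0, closest_word)
        new_distance = levenshtein_distance(word_0, i)
        # Returns as soon as a word is FARTHER away than words[0].
        if distance < new_distance:
            return i
    # Falls through and implicitly returns None if nothing is farther.


# Hypothetical sample data, not the real names list.
sample_names = ['pikachu', 'charmander', 'raichu']
print(buggy_find_closest_word('pikaru', sample_names))  # charmander
print(buggy_find_closest_word('pikaru', ['pikachu']))   # None
```

Even though 'pikachu' is first in this sample list and is the best match, the loop skips it and returns 'charmander', the first entry that is worse.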
Instead, try looping over words and keeping track of which word is the best one you have seen so far. Something like:
best_distance = levenshtein_distance(word_0, words[0])
best_word = words[0]
for w in words:
    d = levenshtein_distance(word_0, w)
    if d < best_distance:
        best_distance = d
        best_word = w
return best_word
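Put together as a complete function, the approach above might look like the following sketch. The sample name list is hypothetical and stands in for the names returned by backend.get_pokemon_stats(). find_closest_word_min is an illustrative alternative, not part of the original answer, that uses the built-in min() with a key function.

```python
def levenshtein_distance(s1, s2):
    """Same dynamic-programming algorithm as in the question."""
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = list(range(len(s1) + 1))
    for i2, c2 in enumerate(s2):
        new_distances = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                new_distances.append(distances[i1])
            else:
                new_distances.append(1 + min(distances[i1],
                                             distances[i1 + 1],
                                             new_distances[-1]))
        distances = new_distances
    return distances[-1]


def find_closest_word(word_0, words):
    """Return the word in words with the smallest distance to word_0."""
    best_word = words[0]
    best_distance = levenshtein_distance(word_0, best_word)
    for w in words[1:]:
        d = levenshtein_distance(word_0, w)
        if d < best_distance:  # strictly better, so ties keep the earlier word
            best_distance = d
            best_word = w
    return best_word


def find_closest_word_min(word_0, words):
    """Illustrative alternative: let min() do the bookkeeping."""
    return min(words, key=lambda w: levenshtein_distance(word_0, w))


# Hypothetical sample data, not the real names list.
sample_names = ['charmander', 'bulbasaur', 'pikachu', 'raichu', 'squirtle']
print(find_closest_word('pikaru', sample_names))      # pikachu
print(find_closest_word_min('pikaru', sample_names))  # pikachu
```

Both versions raise an error on an empty list (IndexError for the loop, ValueError for min()), so a caller that may pass an empty list would need to handle that case separately.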