python - 通过 Trie 搜索问题
问题描述
我编写了一个实现 Trie 数据结构的代码,它接受字符串列表和字符串的计数。
lst = [['james',9],['chloe',20],['chlara',30]]
字符串是名称,后面的整数值是计数。构建特里树后,它会提示用户输入前缀,
userinput = ch
这样,代码将返回字符串 chlara,因为与具有前缀 ch 的 chloe 相比,它的计数更高。我已经设法构建了 Trie,但我在搜索功能方面遇到了困难。
class Node:
def __init__(self):
self.children = [None] * 26
self.end = False
self.frequency = 0
self.goindex = 0
self.index = 0
class Trie:
def __init__(self):
self.root = Node()
def ord_char(self,key):
ord_rep = ord(key) - ord('a')
return ord_rep
def add_word(self,lst):
word = lst[0] #word
freq = lst[1] #frequency of string
word_len = len(word)
current = self.root #start from root node
for i in range(word_len):
position = self.ord_char(word[i])
if current.children[position] is None:
current.children[position] = Node()
current = current.children[position]
if current.frequency > freq:
continue
else:
current.frequency = freq
current.index = position
current.end = True #end of string
def main():
trie = Trie()
for i in list2:
trie.add_word(i)
user = input("Enter a prefix: ")
print(trie.prefix_search(user))
if __name__ == "__main__":
main()
我收到了不完整的字符串“chla”,我很确定这是因为我的搜索功能效率低下且无法正常工作。
更新
我现在面临的问题是,如果我的前缀是“aberr”,我将返回“aberration”而不是“aberr”
解决方案
你永远不会正确地遍历你的 trie。您有两个嵌套for
循环,因此仅从您的前缀中再遍历两个节点(字符)。
我将假设您想要返回一个结果,即使有多个字符串具有匹配的后缀和匹配的计数。
使用while
循环,并继续遵循count
最大值,直到您到达一个没有更多子节点的节点,其值等于当前节点的计数。请验证您end
对该节点是否为 True,因为这表明您的单词添加代码中存在错误:
def prefix_search(self, prefix):
# traverse the prefix
current = self.root
for c in prefix:
current = current.children[self.ord_char(c)]
if current is None:
return None # None is a much better return value than -1
while current is not None:
for child in current.children:
if child is None:
continue
if child.count == current.count:
# found first child with same score, continue from here
prefix += chr(child.index + 97)
current = child
break
else:
# no children with equal score, this is the longest match
assert current.end, "Data structure inconsistent"
return prefix
演示:
>>> trie.prefix_search('ch')
'chlara'
>>> trie.prefix_search('j')
'james'
以及其中一些极端情况:
>>> trie.add_word(('chlarissa', 9)) # longer word, lower count!
>>> trie.prefix_search('ch')
'chlara'
>>> trie.add_word(('janet', 6)) # same length and score as james, won't be found
>>> trie.prefix_search('j')
'james'
以及数据结构是否存在错误;这故意设置了错误的计数:
>>> trie.root.children[9].children[0].count = 7 # 'ja', 7, so higher, but not an end node
>>> trie.prefix_search('j')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 59, in prefix_search
AssertionError: Data structure inconsistent
推荐阅读
- ios - iOS ARKit - Calculate the degree of rotation of device in all three direction
- angular6 - How can we pass dynamic variable from any component.ts to style.less in angular 6?
- reactjs - 以这种方式使用 useEffect 有什么问题吗?
- javascript - 在数据表上显示列的总和(总计)并导出到 excel
- bash - 如何在字符串中的特定模式之后提取特定文本
- highcharts - Highstock - 软最小值和最大值不工作
- android - Android apk Build 输出文件夹中充斥着大量带有时间戳的 apk
- html - NativeScript WebView 未加载 html 文件
- android - 如何从 TextInputLayout 中删除白框
- python-3.x - 如何在 Spacy 中优先考虑基于规则的匹配而不是经过训练的 NER 模型?