首页 > 解决方案 > 如何实现一个高效的 WordFinder?

问题描述

我在一次采访中被问到以下问题。

class WordFinder:
    def init(words):
    def find(characters):

给定上面的类(伪代码,适合您选择的语言),其中words是一组用于初始化类的单词,并且characters是字符列表,返回words可以由 . 的任何子集形成的所有单词characters。如果需要,您可以进行一些预处理。

例子:

词= {“木头”,“词”,“词”}

字符= ['o','w','o','s','d','a','r']

输出:{ "wood", "word", "words" }

我在Python中实现如下:

from typing import Set, Sequence
import collections

class WordFinder:
    def __init__(self, words: Set[str]):
        self._words = { word: collections.Counter(word) for word in words }

    def find(self, characters: Sequence[str]) -> Sequence[str]:
        x = collections.Counter(characters)
        return [word for word, c in self._words.items() if not (c - x)]

words = {"wood", "word", "words"}
wf = WordFinder(words)
actual = wf.find(['o', 'w', 'o', 's', 'd', 'a', 'r'])
assert set(actual) == words

这可行,但该find方法每次都会遍历所有单词。有没有更好的办法?可以进行一些预处理的声明似乎是我没有接受的暗示。

Python 或 Java 实现是可以接受的。

免责声明:我发现以前也有人问过类似的问题,但没有答案被接受,而且没有一个答案比我做的更好。

标签: pythonjavaalgorithmdata-structures

解决方案


在java中尝试实现。这避免了扫描整个单词表,因此其复杂性仅限于字符长度。

import java.util.*;
import java.util.stream.Stream;

public class WordFinder {
class TrieNode {
    boolean isEnd;
    TrieNode[] children;
    String word;

    TrieNode() {
        isEnd = false;
        children = new TrieNode[26];
    }
}

TrieNode root;

public void init(String[] words) {
    root = new TrieNode();
    for (String w : words) {
        insert(w);
    }
}

private void insert(String w) {
    TrieNode curr = root;

    for (int i = 0; i < w.length(); ++i) {
        char c = w.charAt(i);

        if (curr.children[c - 'a'] == null)
            curr.children[c - 'a'] = new TrieNode();

        curr = curr.children[c - 'a'];
    }

    curr.isEnd = true;
    curr.word=w;
}


public String[] find(char[] chars) {

    Map<Character, Integer> mp = new HashMap<>();
    for (int i = 0; i < chars.length; ++i) {
        char c = chars[i];
        mp.put(c, mp.getOrDefault(c, 0) + 1);
    }
    Set<String> set  = new HashSet<>();
    dfs(root, mp, 0,set);
    return set.toArray(new String[0]);
}

private void dfs(TrieNode curr, Map<Character, Integer> mp, int l,Set<String> set) {
    //base
    if (curr.isEnd) {
        set.add(curr.word);
    }

    //logic
    for (int i = 0; i < 26; ++i) {
        char c = (char) (i + 'a');

        if (curr.children[i] != null) {
            if (mp.containsKey(c) && mp.get(c) > 0) {
                mp.put(c, mp.get(c) - 1);
                dfs(curr.children[i], mp, l + 1,set);
                mp.put(c, mp.get(c) + 1);
            }
        }
    }
}
public static void main(String[] args){
    WordFinder wordFinder=new WordFinder();
    String[] words = { "wood", "word", "words" };
    char[] chars={ 'o', 'w', 'o', 's', 'd', 'a', 'r' };
    wordFinder.init(words);
    String[] res = wordFinder.find(chars);
    System.out.println(Arrays.toString(res));

}

}

输出:[单词,木头,单词]


推荐阅读