Finding whether an English word exists in a string of random characters

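Problem description

I am trying to calculate the probability that an English word appears in a string of variable length, say 10 characters. I have code that prints random characters of a variable length, but I don't know how to check whether an English word is present.

I don't need to check for a specific word. I need to check whether any English word appears anywhere in this variable-length string.

My question has two parts: how do I do this for a 10-character string, and, which would also be helpful, how do I do it for a string of any length?

The code for the random characters is: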

from random import randint

def infmonktyp():
  out = ""
  count = 0
  length = int(input("How many characters do you want to print? "))
  for i in range(1, length+1):
    num = randint(1,26)
    # switcher maps each number 1-26 to a letter; "0" marks a missing key
    out += switcher.get(num, "0")
  print(out)

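switcher is a dictionary that maps the numbers 1-26 to the letters A-Z.

If my input is 10, the string might look like "BFGEHDUEND", and the output should be the string "BFGEHDUEND" and True, because the string contains an English word ("END").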

Tags: python, python-3.x, string

Solution


I think I can offer a solution that works not only for English but also for other languages (provided NLTK supports them).

We will use NLTK to get a set of all English words (documented in Section 4.1 of the linked NLTK reference), which we assign to english.
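The word list has to be available locally before `nltk.corpus.words.words()` can be called. Here is a minimal sketch of loading it, assuming the `words` corpus has been fetched once with `nltk.download("words")`. The helper name `load_english_words` and the tiny fallback set are hypothetical and exist only so the snippet still runs when NLTK or the corpus is missing.

```python
def load_english_words(fallback=("end", "he", "hen", "den")):
    """Return a set of lowercase English words.

    Uses the NLTK `words` corpus when it is installed and downloaded;
    otherwise falls back to a tiny illustrative set (an assumption made
    for this sketch, not a real dictionary).
    """
    try:
        import nltk
        # Raises LookupError if the corpus has not been downloaded yet;
        # run nltk.download("words") once to fetch it.
        return set(w.lower() for w in nltk.corpus.words.words())
    except (ImportError, LookupError):
        return set(fallback)


english = load_english_words()
print(len(english), "words loaded")
```

Storing the words in a set rather than a list keeps each later membership check at roughly constant time.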

Then we loop over the variable out and slice it at every possible position, with a minimum length of 2 letters, appending each slice to a new list called all_variants.
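The slicing step can be looked at in isolation. This is a small sketch, not part of the original answer; the generator name `substrings` and the `min_len` parameter are made up for illustration.

```python
def substrings(text, min_len=2):
    """Yield every contiguous slice of `text` with at least `min_len` characters."""
    for start in range(len(text) - min_len + 1):
        for stop in range(start + min_len, len(text) + 1):
            yield text[start:stop]


print(list(substrings("UEND")))
# ['UE', 'UEN', 'UEND', 'EN', 'END', 'ND']
```

A string of n characters has n(n-1)/2 slices of length 2 or more, so a 10-character string produces only 45 candidates to look up.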

Finally, we iterate over the "words" in all_variants, check whether each one is in our variable english, and print an appropriate response.
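If only the True/False answer from the question is needed, the lookup can stop at the first match. The sketch below assumes a set of lowercase words; `contains_word` and `sample_words` are hypothetical names, and the two-word set stands in for the full dictionary.

```python
def contains_word(text, words, min_len=2):
    """Return True if any slice of `text` (length >= min_len) is in `words`."""
    lowered = text.lower()  # the word set is lowercase, so normalise the input
    for start in range(len(lowered) - min_len + 1):
        for stop in range(start + min_len, len(lowered) + 1):
            if lowered[start:stop] in words:
                return True  # stop at the first hit
    return False


sample_words = {"end", "hen"}  # hypothetical stand-in for the full word set
text = "BFGEHDUEND"
print(text, contains_word(text, sample_words))
# BFGEHDUEND True
```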

# imports
import random
import string

import nltk

# the word list must be downloaded once, e.g. with nltk.download("words")

# the alphabet as a list of lowercase letters
alph = list(string.ascii_lowercase)
# build the dictionary that maps 1-26 to a-z
switcher = {}
for i in range(1, 27):
    switcher[i] = alph[i-1]
# use NLTK to get a set of all English words, lowercased
english = set(w.lower() for w in nltk.corpus.words.words())


def infmonktyp(english_dict=english, letter_dictionary=switcher):
    out = ""
    length = int(input("How many characters do you want to print? "))
    if length < 2:
        raise ValueError("Length must be greater than 1")
    for i in range(1, length+1):
        num = random.randint(1, 26)
        out += letter_dictionary.get(num, "0")
    # the random string has been created
    print(out)
    all_variants = []
    # collect every slice of the string that is at least 2 letters long
    for i in range(len(out)-1):
        for j in range(i+2, len(out)+1):
            all_variants.append(out[i:j])
    # count how many words we find; I'm guessing that is what the count
    # variable on the second line of your function is for
    words_found = 0
    # check every slice: print it if it is an English word, otherwise keep going
    for word in all_variants:
        if word in english_dict:
            print(word, 'found in', out)
            words_found += 1
    # if no slice matched, say so
    if words_found == 0:
        print("Couldn't find a word")
    # True if at least one English word was found, as the question asks
    return words_found > 0


# call the function
infmonktyp(english, switcher)
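Since the original goal was a probability, one way to get at it is to repeat the experiment many times and count the hits. The following is a rough Monte Carlo sketch under stated assumptions, not the answer's own method: `has_word`, `estimate_probability`, and `sample_words` are hypothetical names, and the small word set should be replaced with the real set of English words.

```python
import random
import string


def has_word(text, words, min_len=2):
    """Return True if any slice of `text` (length >= min_len) is in `words`."""
    for start in range(len(text) - min_len + 1):
        for stop in range(start + min_len, len(text) + 1):
            if text[start:stop] in words:
                return True
    return False


def estimate_probability(words, length=10, trials=10_000, seed=None):
    """Estimate the chance that a random lowercase string contains a word."""
    rng = random.Random(seed)  # a seed makes the run reproducible
    hits = 0
    for _ in range(trials):
        text = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        if has_word(text, words):
            hits += 1
    return hits / trials


sample_words = {"end", "he", "cat"}  # hypothetical stand-in word set
print(estimate_probability(sample_words, length=10, trials=2_000, seed=0))
```

The estimate gets tighter as `trials` grows, and the same function works for any string length by changing `length`.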
