Match at least two words from the input against a statement

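Question

I'm struggling to write a regex that matches at least two words in order to match A to B. I just found a way to exclude any dictionary words from input A, so 'does' is not a problem in Case 2. In Case 1, A should match B on 'Wakanda' and 'exist', assuming words such as 'do', 'in', and 'the' have already been removed.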

CASE 1
A -> Do Wakanda exist in the world?
B -> Does Wakanda exist?
>> A should match B

exclude = ['do', 'in', 'the']
A = "Do Wakanda exist in the world?"
B = "Does Wakanda exist?"
# Lowercase, split on whitespace, and drop the excluded words
A = " ".join(w for w in A.lower().split() if w not in exclude)

CASE 1 (after removing the excluded words)
A -> wakanda exist world?
B -> Does Wakanda exist?
>> A should match B

CASE 2
A -> Does Atlantis exist in our world?
B -> Does Wakanda exist?
>> A should not match B

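Tags: regex

Answer

You can use set operations to check whether two sentences match. No regex is needed, but you do need some preprocessing: remove the '?', lowercase the sentences, and so on: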

A = "Do Wakanda exist in the world?"
B = "Does Wakanda exist?"

A2 = "Does Atlantis exist in our world?"
B2 = "Does Wakanda exist?"

exclude = ['do', 'in', 'the', 'does']

def a_match_b(a, b):
    a = set(a.replace('?', '').lower().split()) - set(exclude)
    b = set(b.replace('?', '').lower().split()) - set(exclude)
    return len(a.intersection(b)) > 1

print(a_match_b(A, B))
print(a_match_b(A2, B2))

The output is:

True
False
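
The threshold of two shared words is hard-coded in the function above. As a minimal sketch, the variant below makes it a parameter and exposes the overlapping words, which helps when checking why a pair did or did not match. The names shared_words, min_shared, and EXCLUDE are hypothetical and not part of the original answer.

```python
# Hypothetical variant of the set-based check; names are illustrative only.
EXCLUDE = {'do', 'in', 'the', 'does'}

def shared_words(a, b, exclude=EXCLUDE):
    """Return the non-excluded words that appear in both sentences."""
    words_a = set(a.replace('?', '').lower().split()) - exclude
    words_b = set(b.replace('?', '').lower().split()) - exclude
    return words_a & words_b

def a_match_b(a, b, min_shared=2):
    """True when the sentences share at least min_shared words."""
    return len(shared_words(a, b)) >= min_shared

print(sorted(shared_words("Do Wakanda exist in the world?", "Does Wakanda exist?")))
# ['exist', 'wakanda']
print(sorted(shared_words("Does Atlantis exist in our world?", "Does Wakanda exist?")))
# ['exist']
```

In Case 2 only 'exist' is shared, so the pair falls below the two-word threshold.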

Edit:

As @tobias_k pointed out, you can use a regex to find the words, so you could also use:

import re

A = "Do Wakanda exist in the world?"
B = "Does Wakanda exist?"

A2 = "Does Atlantis exist in our world?"
B2 = "Does Wakanda exist?"

exclude = ['do', 'in', 'the', 'does']

def a_match_b(a, b):
    # \w+ extracts the words and drops punctuation such as '?'
    words_a = re.findall(r'\w+', a.lower())
    words_b = re.findall(r'\w+', b.lower())
    a = set(words_a) - set(exclude)
    b = set(words_b) - set(exclude)
    return len(a.intersection(b)) > 1

print(a_match_b(A, B))
print(a_match_b(A2, B2))
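
One practical difference between the two versions is how they treat punctuation other than '?'. The sketch below compares the two tokenizations on a made-up sentence containing a comma; the sentence and the variable names are assumptions for illustration, not from the original post.

```python
import re

# Made-up input with a comma attached to a word.
sentence = "Wakanda, does it exist?"

# Whitespace splitting only removes the '?', so the comma stays on the word.
split_tokens = sentence.replace('?', '').lower().split()
print(split_tokens)
# ['wakanda,', 'does', 'it', 'exist']

# \w+ matches runs of letters, digits, and underscores, so punctuation is dropped.
regex_tokens = re.findall(r'\w+', sentence.lower())
print(regex_tokens)
# ['wakanda', 'does', 'it', 'exist']
```

With the first approach 'wakanda,' would not equal 'wakanda' in the set intersection, so the regex version is likely the safer choice for free-form input.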
