Finding whether a substring of a key phrase exists in a sentence

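Problem description

I am trying to check whether a key is present in a given sentence.

However, within the sentence the key may be jumbled, may have special characters in the middle, or may have other words between its parts. For now I am not concerned about case sensitivity.

For example, I want to check whether:

I started by comparing the key against the sentence character by character, but I could not solve the problem that way.

Based on the comments, I tried to come up with some code.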

def getMatch(text, key):
    # Collect every run of characters that text and key share, trying each
    # start position in text (j) against each start position in key (i).
    matchedwords = []
    for j in range(len(text)):
        for i in range(len(key)):
            match = ""
            keyindex = i
            textindex = j
            # Extend the match while both strings agree (case-sensitive).
            while (keyindex < len(key) and textindex < len(text)
                   and key[keyindex] == text[textindex]):
                match += text[textindex]
                keyindex += 1
                textindex += 1
            if match and match not in matchedwords:
                matchedwords.append(match)
    print(matchedwords)

text = "MOTOR Device 101 is high"
key = "MOTOR101"
getMatch(text, key)

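I get the output ['MOTOR', 'OTOR', 'O', 'TOR', 'OR', 'R', '101', '1', '01']. Please let me know if any changes are needed or if this can be improved. From here I am trying to check whether any combination of these fragments yields "MOTOR101".

Tags: python

Solution

You are searching for two separate items in a string. If you split them into a list, the following code will find them.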


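lk = [ 'Motor','101' ]
s1 = [ 'Device Motor_101 is very high',
       '101Motor Device is very high',
       'Motor device 101 is very is high' ]

for a in s1:
    # track each item separately so the order of the words does not matter
    has_first = has_second = False
    for b in a.split():
        # a single word may contain one item or both (e.g. 'Motor_101')
        if lk[0] in b:
            has_first = True
        if lk[1] in b:
            has_second = True
    if has_first and has_second:
        print( f'found in {a}' )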
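
The key in the question arrives as a single string ("MOTOR101") rather than as a ready-made list. Below is a minimal sketch of one way to split it and run the same kind of check. It assumes the key consists of alternating runs of letters and digits and that matching should ignore case. The helper names `split_key` and `contains_all_parts` are hypothetical and not part of the original answer.

```python
import re

def split_key(key):
    """Split a key such as 'MOTOR101' into runs of letters and runs of digits."""
    return re.findall(r'[A-Za-z]+|[0-9]+', key)

def contains_all_parts(sentence, key):
    """Return True if every part of the key occurs in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return all(part.lower() in lowered for part in split_key(key))

sentences = ['Device Motor_101 is very high',
             '101Motor Device is very high',
             'Motor device 101 is very is high',
             'Motor device is very high']

for s in sentences:
    print(s, '->', contains_all_parts(s, 'MOTOR101'))
```

Because each part is looked up independently, the order of the parts and any separators between them do not matter. A key that does not follow the letters-then-digits pattern would need a different splitting rule.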

This code will search for and find more than two items.


lk = [ 'Motor','101' ]
ln = len( lk )
s1 = [ 'Device Motor_101 is very high',
       '101Motor Device is very high',
       'Motor device 101 is very is high' ]

found = []
for a in s1:
    for b in a.split():
        # count how many of the items occur in this word
        result = -1
        for gp in range( ln ):
            if lk[ gp ] in b:
                result += 1
        # note: one matching item in one word is enough to report the sentence
        if result > -1 and found.count( a ) == 0:
            found.append( a )
            print( f'found in {a}' )
print( f'{found}' )

OK, I found that repeated words cause false positives, so... here is my third attempt.


lk = [ 'sky', '101', 'Motor', 'very' ]
ln = len( lk )
s1 = [ 'Device Motor_101 is very high in sky',
       '101Motor Device is sky sky very',
       'Motor device is very is high very very' ]

found = []
for a in s1:
    # record which items were seen, so repeated words are only counted once
    matched = set()
    for b in a.split():
        for gp in range( ln ):
            if lk[ gp ] in b:
                matched.add( lk[ gp ] )
    # report the sentence only when every item was seen at least once
    if len( matched ) == ln:
        found.append( a )

print( f'{found}' )
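
For the question's original goal (checking whether fragments found in the sentence can be combined into the full key), here is a rough sketch that avoids splitting the key by hand. It asks whether the key can be written as a concatenation of substrings of the sentence. The function name `key_from_fragments` and the `min_len` parameter are assumptions made for illustration. `min_len` sets a minimum fragment length so that isolated single characters do not count as evidence, and the comparison is case-insensitive.

```python
def key_from_fragments(text, key, min_len=2):
    """Return True if key can be built by concatenating substrings of text,
    each at least min_len characters long (case-insensitive)."""
    text, key = text.lower(), key.lower()
    n = len(key)
    # reachable[i] is True when key[:i] can be built from fragments of text
    reachable = [False] * (n + 1)
    reachable[0] = True
    for end in range(1, n + 1):
        # try every fragment key[start:end] that is long enough
        for start in range(0, end - min_len + 1):
            if reachable[start] and key[start:end] in text:
                reachable[end] = True
                break
    return reachable[n]

print(key_from_fragments("MOTOR Device 101 is high", "MOTOR101"))
print(key_from_fragments("Motor device is very high", "MOTOR101"))
```

A larger `min_len` makes the check stricter. With very short fragments allowed, almost any sentence containing the right letters would match, so the threshold is a judgment call that depends on the data.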
