python - Locating specific words in Python without duplicates
Problem description
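I have a problem. I am trying to find device names in a string. All the device names I want to look for are stored in a list. One requirement is very important:
- A single command can contain multiple devices!
The problem I am running into is this:
I have two devices (Fan and Fan Light). When I give the command "Turn on Fan Light", both devices are found, but I only want Fan Light to be found. I tried checking all the found devices and keeping only the one with the longest name, like this: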
# Create 2 dummy devices
device1 = {
    "name": "fan"
}
device2 = {
    "name": "fan light"
}
# Add devices to list
devices = []
devices.append(device1)
devices.append(device2)
# Given command
command = "Turn on fan light"
foundDevices = []
# Search devices in sentence
for device in devices:
    # Splits a device name if it has multiple words
    deviceSplit = device["name"].split()
    numOfSubNames = len(deviceSplit)
    # Checks for every sub-name if it is found in the string
    i = 0
    for subName in deviceSplit:
        if subName in command:
            i += 1
    # Checks if all names were located in string
    if i == numOfSubNames:
        foundDevices.append(device["name"])
# Checks if multiple devices have been found
if len(foundDevices) >= 2:
    largestNameLength = 0
    # Checks which device has the largest name
    for device in foundDevices:
        if (len(device) > largestNameLength):
            largestName = device
            largestNameLength = len(device)
    # Clears list and only adds the longest one
    foundDevices.clear()
    foundDevices.append(largestName)
print(foundDevices)
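But this breaks down when I say, for example, "Turn on fan light and fan", because that command really does contain multiple devices. How can I scan for devices the way I want?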
Solution
A regular-expression search, with a pattern built from the different device names, is a quick way to do what you want.
import re

def find_with_regex(command, pattern):
    return list(set(re.findall(pattern, command, re.IGNORECASE)))
I would also suggest building a reverse dictionary of the form device: name; it may help to quickly look up the code name of a given device.
devices = [{'name': 'fan light'}, {'name': 'fan'}]
# build a quick-reference dict with device>name structure
transformed = {dev: name for x in devices for name, dev in x.items()}
# this also weeds out duplicated devices: a repeated device name
# simply overwrites the earlier key (no error is raised)
# print(transformed)
# {'fan light': 'name', 'fan': 'name'}
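If the goal of the reverse dictionary is to get from a matched name back to the device itself, a variant that maps each name to its full record may be more useful. The following is only a sketch: the helper `index_devices` and the `id` field are hypothetical and do not appear in the original question or answer. Because a plain dict silently overwrites repeated keys, the duplicate check is done explicitly here.

```python
def index_devices(devices):
    """Map each lowercased device name to its full device record."""
    by_name = {}
    for device in devices:
        key = device["name"].lower()
        # A dict would silently overwrite a repeated key, so check by hand.
        if key in by_name:
            raise ValueError(f"duplicate device name: {key!r}")
        by_name[key] = device
    return by_name


# The "id" field is a made-up example attribute, not part of the original data.
devices = [
    {"name": "fan light", "id": "dev-1"},
    {"name": "fan", "id": "dev-2"},
]
by_name = index_devices(devices)
print(by_name["fan light"]["id"])  # dev-1
```

The keys of `by_name` can then be used to build the regex pattern, and each match can be looked up directly to retrieve its record.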
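Special thanks to buddemat for pointing out that the device names have to be arranged in a specific order for this solution to work; the reversed(sorted(... on the pattern-building line of the next code block fixes that.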
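The order matters because Python's `re` module tries the alternatives of `a|b` from left to right and stops at the first one that matches; it does not look for the longest match. A small sketch of the effect, using a two-name list assumed for illustration and sorting by length (an alternative to `reversed(sorted(...))` that states the intent more directly):

```python
import re

names = ["fan", "fan light"]
command = "Turn on fan light"

# Shorter name first: "fan" matches, so "fan light" is never tried there.
short_first = "|".join(names)
short_first_result = re.findall(short_first, command, re.IGNORECASE)
print(short_first_result)  # ['fan']

# Longer name first: "fan light" gets the first chance to match.
long_first = "|".join(sorted(names, key=len, reverse=True))
long_first_result = re.findall(long_first, command, re.IGNORECASE)
print(long_first_result)  # ['fan light']
```

`reversed(sorted(...))` gives the same ordering for these names because a string that is a prefix of another sorts before it, so reversing puts the longer name first.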
Testing the function
test_cases = [
    'Turn on fan light',
    'Turn on fan light and fan',
    'Turn on fan and fan light',
    'Turn on fan and fan',
]
pattern = '|'.join(reversed(sorted(transformed)))
for command in test_cases:
    matches = find_with_regex(command, pattern)
    print(matches)
Output (the matches pass through a set, so the order within each list may vary between runs)
['fan light']
['fan', 'fan light']
['fan', 'fan light']
['fan']
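Two limitations of a bare alternation are worth noting: a name such as "fan" would also match inside a longer word like "fantastic", and a device name containing regex metacharacters would change the pattern. The sketch below is one possible hardening, not part of the original answer; `build_pattern` and `find_devices` are hypothetical names. It escapes each name, adds word boundaries, and returns the matches lowercased in the order they first appear rather than in set order.

```python
import re


def build_pattern(names):
    """Compile an alternation of device names, longest first."""
    ordered = sorted(set(names), key=len, reverse=True)
    # re.escape keeps characters such as "+" or "." in a name literal.
    alternatives = "|".join(re.escape(name) for name in ordered)
    # \b on both sides prevents matching "fan" inside "fantastic".
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def find_devices(command, names):
    """Return the distinct device names in command, in order of appearance."""
    found = []
    for match in build_pattern(names).findall(command):
        name = match.lower()
        if name not in found:
            found.append(name)
    return found


names = ["fan", "fan light"]
print(find_devices("Turn on fan and fan light", names))  # ['fan', 'fan light']
print(find_devices("Turn on Fan Light and fan", names))  # ['fan light', 'fan']
print(find_devices("That is fantastic", names))          # []
```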