python - How to match Python list items with a regular expression
Question
import re

def popular_words(text, words):
    """(str, array) -> dictionary
    Return a dictionary whose keys are the search words and whose
    values are the number of times each word occurs in the given text.
    """
    word_dictionary = {}
    for word in words:
        list = re.findall(word, text, re.IGNORECASE)
        word_dictionary.update({word : len(list) })
    return word_dictionary

popular_words('''
When I was One
I had just begun
When I was Two
I was nearly new
''', ['i', 'was', 'three', 'near'])
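How can I stop "near" from matching inside "nearly" in the text? I tried using \bword\b to define word boundaries, but got this error:
"unexpected character after line continuation character"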
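Answer
You can definitely use string formatting together with \b. The error you got is probably because you were not using a raw string. That SyntaxError typically appears when a backslash ends up outside a string literal, and inside a normal (non-raw) string '\b' is a backspace character rather than a word boundary. Whenever a pattern contains backslashes, use a raw string with re; it makes life easier: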
import re

def popular_words(text, words):
    """(str, array) -> dictionary
    Return a dictionary whose keys are the search words and whose
    values are the number of times each word occurs in the given text.
    """
    word_dictionary = {}
    for word in words:
        # \b on both sides restricts the match to whole words
        list = re.findall(r'\b{0}\b'.format(word), text, re.IGNORECASE)
        word_dictionary.update({word : len(list) })
    return word_dictionary

print(popular_words('''
When I was One
I had just begun
When I was Two
I was nearly new
''', ['i', 'was', 'three', 'near']))
Output (the key order may differ depending on the Python version):
{'i': 4, 'near': 0, 'was': 3, 'three': 0}
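As a possible refinement, not part of the original answer: if a search word could contain regex metacharacters such as "." or "+", wrapping it in re.escape keeps it literal. The sketch below assumes that behavior is wanted; the names counts and pattern are illustrative only.

```python
import re

def popular_words(text, words):
    """Count whole-word, case-insensitive occurrences of each word in text."""
    counts = {}
    for word in words:
        # Assumption: search words are literal text, so escape any
        # regex metacharacters before adding the \b word boundaries.
        pattern = r'\b{0}\b'.format(re.escape(word))
        counts[word] = len(re.findall(pattern, text, re.IGNORECASE))
    return counts

text = '''
When I was One
I had just begun
When I was Two
I was nearly new
'''
result = popular_words(text, ['i', 'was', 'three', 'near'])
print(result)
```

With the sample text this gives the same counts as above (4, 3, 0 and 0). The escaping only matters for words like "a.b", where an unescaped "." would also match "axb".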
Edit: for completeness, this is what you would have to write without a raw string. You have to escape each backslash by doubling it.
list = re.findall('\\b{0}\\b'.format(word), text, re.IGNORECASE)
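The difference between the three spellings can be checked directly. This is a small sketch with a made-up sample sentence; it compares a raw string, a doubled backslash, and a plain '\b'.

```python
import re

# In a normal string literal, '\b' is one character: a backspace (ASCII 8).
print(len('\b'), '\b' == '\x08')      # 1 True

# r'\b' and '\\b' are the same two characters: a backslash followed by "b".
print(len(r'\b'), r'\b' == '\\b')     # 2 True

text = 'I was nearly new, near enough'  # hypothetical sample text

raw = re.findall(r'\bnear\b', text)      # word boundaries, raw string
doubled = re.findall('\\bnear\\b', text) # same pattern, doubled backslashes
plain = re.findall('\bnear\b', text)     # backspace characters, not boundaries

print(raw, doubled, plain)
```

The raw and doubled forms both find only the standalone "near". The plain form looks for "near" surrounded by literal backspace characters, so it finds nothing.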