python - Updating a text file from a Python dictionary
Problem description
Hello community members,
Suppose I have a dictionary in Python:
dict = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}
and a list of text lines as follows:
text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']
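I want every phrase that belongs to the dictionary (say fresh air) to appear as #fresh_air# in all of its occurrences in the text file, and every single word of the dictionary (say milk) to appear as #milk#, i.e. with the special character added at the beginning and the end, in all of its occurrences in text_file.
My desired output should take the following form (a list of lists):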
[[is vitamin d in #milk# enough], [try to improve quality level by automatic intake of #fresh_air#], [turn on the tv or #entertainment_system# based on the individual preferences], [#blood_pressure# monitor], [I buy more #ice_cream#], [proper method to add frozen wild blueberries in #ice_cream# with #milk#]]
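Is there a standard way to achieve this in a time-efficient manner?
I am new to list and text processing in Python. I tried a list comprehension but did not get the expected result. Any help is much appreciated.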
Solution
Use a regular expression.
Example:
import re
data = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}
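# Build one alternation of all phrases; re.escape keeps any special characters literal
pattern = re.compile("(" + "|".join(map(re.escape, data)) + ")")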
text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']
result = [pattern.sub(r"#\1#", i) for i in text_file]
print(result)
Output:
['is vitamin d in #milk# enough',
'try to improve quality level by automatic intake of #fresh air#',
'turn on the tv or #entertainment system# based on that individual preferences',
'#blood pressure# monitor',
'I buy more #ice cream#',
'proper method to add frozen wild blueberries in #ice cream#']
Note that your dict variable is actually a set object, not a dictionary.
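To illustrate the distinction, here is a minimal sketch. The names phrases and mapping are made up for this example and do not come from the question. Braces holding bare values build a set, while braces holding key: value pairs build a dict. Naming a variable dict is also best avoided, because it shadows the built-in type.

```python
# Braces with bare values create a set (unordered, unique items).
phrases = {'fresh air', 'milk'}

# Braces with key: value pairs create a dict; here each phrase maps
# to its underscore-joined form.
mapping = {p: p.replace(' ', '_') for p in phrases}

print(type(phrases).__name__)   # set
print(type(mapping).__name__)   # dict
print(mapping['fresh air'])     # fresh_air
```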
Updated the snippet as requested in the comments.
Demo:
import re
data = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}
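# Map each phrase to its underscore-joined form, e.g. 'fresh air' -> 'fresh_air'
data = {i: i.replace(" ", "_") for i in data}
# \b anchors the match at word boundaries, so 'milk' does not match inside 'milkshake'
pattern = re.compile(r"\b("+"|".join(data)+r")\b")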
text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']
result = [pattern.sub(lambda x: "#{}#".format(data[x.group()]), i) for i in text_file]
print(result)
Output:
['is vitamin d in #milk# enough',
'try to improve quality level by automatic intake of #fresh_air#',
'turn on the tv or #entertainment_system# based on that individual preferences',
'#blood_pressure# monitor',
'I buy more #ice_cream#',
'proper method to add frozen wild blueberries in #ice_cream#']
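If the phrase set may contain regex metacharacters, or phrases that overlap (one contained in another), a slightly more defensive variant could look like the sketch below. The helper name tag_phrases and the extra phrase 'ice cream cone' are hypothetical, introduced only for illustration. The sketch assumes a non-empty phrase set whose entries start and end with word characters, so that \b behaves as expected.

```python
import re

def tag_phrases(lines, phrases):
    """Wrap each known phrase in '#', joining its words with '_'."""
    # Try longer phrases first so that a longer phrase wins over a
    # shorter one it contains ('ice cream cone' before 'ice cream').
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape keeps characters such as '.' or '+' literal.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, ordered)) + r")\b")

    def mark(match):
        # Replace spaces inside the matched phrase and add the markers.
        return "#" + match.group(1).replace(" ", "_") + "#"

    return [pattern.sub(mark, line) for line in lines]

# Hypothetical sample data for the sketch.
phrases = {'fresh air', 'ice cream', 'milk', 'ice cream cone'}
lines = ['I buy more ice cream', 'an ice cream cone with milk', 'milkshake']
print(tag_phrases(lines, phrases))
```

Because the alternation is compiled once and applied to each line in a single pass, the cost per line stays modest compared with calling str.replace once per phrase. To get the list-of-lists shape asked for in the question, each result could be wrapped in its own list, e.g. [[s] for s in tag_phrases(lines, phrases)].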