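python-3.x - Why can't I replace the newline separator?
Problem description
I'm developing a Python Telegram client that sends messages from the application to our API, and I want to exclude some words. In this case, some @logins and #tags should be removed:
Here is my code: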
for w in app.config['EXCLUDED_WORDS']:
    if w in data:
        data = data.replace(w, '')
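Pretty simple, right? But this is the result I get (lots of new lines):
I tried several different newline separators, for example #YoCrypto\n, #YoCrypto\r and #YoCrypto\r\n,
but it doesn't work. So here is my print(data.encode('utf-8'))
Output: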
#TAG\n#YoCrypto\xd0\xa0laced \xd0\xb0dditional signal for Bitmex. I will remember to include both exchanges on the same signal for btcusd now on. My apologies for inconvenience.\xef\xbb\xbf@grandcchat\n@grandcsign\n@grandcmargin
What am I doing wrong?
UPD 01.01.2020
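These are some of the excluded words: ['@grandcmargin\n', '@grandcsign\n', '@grandcchat\n', '#YoCrypto\n', 'По всем вопросам (For all questions, please contact): @NickolchenkoGCS']
A single break should be left at the start and at the end of the replaced region, so the expected output should look like this: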
#TAG\n\nPlaced additional signal for Bitmex. I will remember to include both exchanges on the same signal for btcusd now on. My apologies for inconvenience.\n\n[Picture from message]
Solution
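It may help to first see why the plain str.replace loop misses some entries. The sketch below uses a shortened stand-in for the message (same separators as the posted bytes, less text; the variable names are only illustrative). Because str.replace matches exact substrings, an entry that ends in '\n' is not found when the word is the last thing in the message or is directly followed by text, and the byte-order mark \ufeff in front of @grandcchat is never removed.

```python
# Shortened stand-in for the message in the question (an assumption for brevity).
data = '#TAG\n#YoCrypto\u0420laced \u0430dditional signal.\ufeff@grandcchat\n@grandcsign\n@grandcmargin'
excluded = ['@grandcmargin\n', '@grandcsign\n', '@grandcchat\n', '#YoCrypto\n']

for w in excluded:
    # str.replace removes exact substrings only, trailing '\n' included
    print(repr(w), 'found' if w in data else 'NOT found')
    data = data.replace(w, '')

# repr() makes leftover '\n' and '\ufeff' characters visible
print(repr(data))
```

Here '#YoCrypto\n' and '@grandcmargin\n' are reported as not found, so both words stay in the text, along with the invisible \ufeff.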
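One possible solution is to use the re module and replace each word, together with any byte-order mark (\ufeff) in front of it and any newline characters after it, with a single newline, and then collapse the leftover runs of newlines. For example: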
import re
data = b'''#TAG\n#YoCrypto\xd0\xa0laced \xd0\xb0dditional signal for Bitmex. I will remember to include both exchanges on the same signal for btcusd now on. My apologies for inconvenience.\xef\xbb\xbf@grandcchat\n@grandcsign\n@grandcmargin'''
words_to_remove = {'@grandcmargin', '@grandcsign', '@grandcchat', '#YoCrypto', 'По всем вопросам (For all questions, please contact): @NickolchenkoGCS'}
# decode the data (if not decoded already)
data = data.decode('utf-8')
# replace each word, plus any leading BOM and trailing newlines, with one newline:
data = re.sub('|'.join(r'(?:[\ufeff]*{}\n*)'.format(re.escape(w)) for w in words_to_remove), '\n', data)
data = re.sub(r'\n{3,}', r'\n\n', data)  # collapse runs of 3+ newlines into one blank line
print(data)
Prints:
#TAG

Рlaced аdditional signal for Bitmex. I will remember to include both exchanges on the same signal for btcusd now on. My apologies for inconvenience.
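The same idea can be wrapped in a small helper so it can replace the loop over app.config['EXCLUDED_WORDS']. This is only a sketch: the name strip_words, the longest-first ordering and the empty-list guard are additions that are not part of the answer above, and the sample string is a shortened stand-in for the real message.

```python
import re

def strip_words(text, words):
    """Replace each word (with leading BOMs and trailing newlines) by one newline."""
    if not words:
        return text  # an empty pattern would match at every position
    # Longest first, so a longer entry wins over a shorter one that is its prefix.
    ordered = sorted(words, key=len, reverse=True)
    pattern = '|'.join(r'(?:\ufeff*{}\n*)'.format(re.escape(w)) for w in ordered)
    text = re.sub(pattern, '\n', text)
    # Collapse runs of three or more newlines into a single blank line.
    return re.sub(r'\n{3,}', '\n\n', text)

sample = '#TAG\n#YoCrypto\u0420laced signal.\ufeff@grandcchat\n@grandcsign\n@grandcmargin'
cleaned = strip_words(sample, ['@grandcmargin', '@grandcsign', '@grandcchat', '#YoCrypto'])
print(repr(cleaned))
```

The entries are passed without a trailing '\n', since the pattern already consumes any newlines that follow each word.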