python - How to make a function in a class remove a word multiple times per line
Question
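The code below is supposed to censor the word frack, and possibly a whole list of bad words. The problem is currently in the clean_line method: if a line of text contains frack more than once, only the first occurrence is replaced, and it does not react to capital letters either.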
class Cleaner:
    def __init__(self, forbidden_word = "frack"):
        """ Set the forbidden word """
        self.word = forbidden_word

    def clean_line(self, line):
        """Clean up a single string, replacing the forbidden word by *beep!*"""
        found = line.find(self.word)
        if found != -1:
            return line[:found] + "*beep!*" + line[found+len(self.word):]
        return line

    def clean(self, text):
        for i in range(len(text)):
            text[i] = self.clean_line(text[i])

example_text = [
    "What the frack! I am not going",
    "to honour that question with a response.",
    "In fact, I think you should",
    "get the fracking frack out of here!",
    "Frack you!"
]
clean_text = Cleaner().clean(example_text)
for line in example_text: print(line)
Answer
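Assuming you just want to get rid of any words with frack in them, you could do something like the code below. If you also need to get rid of the trailing whitespace, you will need to change the regular expression slightly. If you want to learn more about regular expressions, I recommend having a look at regexone.com.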
# Using regular expressions makes string manipulation easier
import re

example_text = [
    "What the frack! I am not going",
    "to honour that question with a response.",
    "In fact, I think you should",
    "get the fracking frack out of here!",
    "Frack you!"
]
# The pattern below matches 'frack' plus any word characters that follow it,
# ignoring case (named 'pattern' to avoid shadowing the built-in filter())
pattern = re.compile(r'frack\w*', re.IGNORECASE)
# We then apply this pattern to each element in the example_text list
clean = [pattern.sub("", e) for e in example_text]
print(clean)
Output
['What the ! I am not going',
 'to honour that question with a response.',
 'In fact, I think you should',
 'get the   out of here!',
 ' you!']
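If you would rather keep the Cleaner class from the question and its *beep!* replacement, the same idea can be folded into it. This is only a sketch under a few assumptions: the forbidden_words and replacement parameters and the strip_pattern name are made up for illustration and are not in the original code, and clean returns a new list instead of modifying the input in place.

```python
import re


class Cleaner:
    """Replace every occurrence of the forbidden words, ignoring case."""

    def __init__(self, forbidden_words=("frack",), replacement="*beep!*"):
        # forbidden_words and replacement are hypothetical parameters.
        # Escape each word so it is matched literally, then also match the
        # rest of the word (e.g. "fracking"), as in the pattern above.
        alternatives = "|".join(re.escape(word) for word in forbidden_words)
        self.pattern = re.compile(rf"(?:{alternatives})\w*", re.IGNORECASE)
        self.replacement = replacement

    def clean_line(self, line):
        # sub() replaces all non-overlapping matches, not just the first one.
        return self.pattern.sub(self.replacement, line)

    def clean(self, text):
        # Build and return a new list of cleaned lines.
        return [self.clean_line(line) for line in text]


# Variant for plain removal: also swallow the whitespace after each word.
strip_pattern = re.compile(r"frack\w*\s*", re.IGNORECASE)

if __name__ == "__main__":
    print(Cleaner().clean_line("get the fracking frack out of here!"))
    print(strip_pattern.sub("", "get the fracking frack out of here!"))
```

The first print should give "get the *beep!* *beep!* out of here!" and the second "get the out of here!". Note that strip_pattern only removes whitespace that directly follows a match, so "What the frack! I am not going" still becomes "What the ! I am not going".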