Cleaning a list of strings by filtering characters

Question

Hi, I want to clean a text file containing transcripts.

I have copied and pasted a small portion of the text file.

['he looked in <the wellingtons> [//] the boots .\n',
 '<last week> [//] one night a boy and a dog was [*] staring at a jar,\n',
 'at this thing inside , wondering what they can do with it the next morning .\n',
 ' at the same time <the fr(og)> [//] the thing jumped out that jar .']

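Words that have '<' as a prefix or '>' as a suffix should be kept, but the two symbols themselves should be removed.

For example:

he looked in <the wellingtons> [//] the boots . should become he looked in the wellingtons [//] the boots .

I also need to:

Keep words that have '(' as a prefix or ')' as a suffix, and remove those two symbols as well, just like < and >. This time, however, three tokens must be kept as exceptions: (.), (..), and (...)

Here is my code so far, but it doesn't really work: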

# prefix = '<'
# suffix = '>'
# clean = [' '.join(y for y in x.split(' ') if not (y[0] == '<' and y[-1] == '>') or y in {'<','>'}) for x in text]
# return clean 
# the text here refers to a list of strings 

# for the brackets '(' and ')'
# clean = [' '.join(y for y in x.replace('(','').replace(')','') if not (y[0] == '(' or y[-1] == ')') or y in {'(.)', '(..)', '(...)'}) for x in text]

I would really appreciate help with removing '(' and ')' while leaving '(.)', '(..)', and '(...)' untouched.

Thank you.

Tags: python, python-3.x, list, data-cleaning

Answer


Try this:

lst = ['he looked in <the wellingtons> [//] the boots .\n', '<last week> [//] one night a boy and a dog was [*] staring at a jar,\n', 'at this thing inside , wondering what they can do with it the next morning .\n', 'at the same time <the fr(og)> [//] the thing jumped out that jar .']

for i in lst:
    print(i.replace("<", "").replace(">", ""))
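
Note that the loop above only strips < and >; it leaves the parentheses alone. For the second part of the question, one possible approach is a single regular expression that matches either one of the three exception tokens or a lone bracket character, and keeps only the former. This is a rough sketch, not a tested solution for the full file: the names KEEP_OR_DROP and clean_line are made up for illustration, the second sample string has an extra (..) added to show the exception case, and it assumes that only the exact tokens (.), (..) and (...) need to survive.

```python
import re

# Hypothetical names (not from the original post): KEEP_OR_DROP, clean_line.
# Assumption: only the exact tokens (.), (..) and (...) are to be preserved.
KEEP_OR_DROP = re.compile(r"\(\.{1,3}\)|[<>()]")

def clean_line(line):
    """Remove <, >, ( and ) from a line, keeping (.), (..) and (...) intact."""
    # The alternation tries the exception tokens first, so any match longer
    # than one character is an exception token and is returned unchanged;
    # a single bracket character is replaced with the empty string.
    return KEEP_OR_DROP.sub(
        lambda m: m.group(0) if len(m.group(0)) > 1 else "", line
    )

lst = [
    'he looked in <the wellingtons> [//] the boots .\n',
    # "(..)" was added to this sample line purely for illustration.
    'at the same time <the fr(og)> (..) [//] the thing jumped out that jar .',
]

cleaned = [clean_line(s) for s in lst]
for s in cleaned:
    print(s.rstrip("\n"))
# he looked in the wellingtons [//] the boots .
# at the same time the frog (..) [//] the thing jumped out that jar .
```

Doing everything in one pass avoids the problem in the attempt from the question, where calling replace('(', '') first destroys the exception tokens before they can be checked.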
