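python - Identifying whether list items are in a string
Question
I'm trying to write a nested loop that goes through a list of stopwords and a list of strings and determines whether each stopword occurs in each list item. Ideally, I'd like to add the words found in each string to a new column and remove all of them from the string.
Does anyone have a tip? Are my loops in the wrong order?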
def remove_stops(text, customStops):
    """
    Removes custom stopwords.

    Parameters
    ----------
    text : the variable storing strings from which
        stopwords should be removed. This can be a string
        or a pandas DataFrame.
    customStops : the list of stopwords which should be removed.

    Returns
    -------
    Cleansed lists.
    """
    for item in text:
        print("Text:", item)
        for word in customStops:
            print("Custom Stops: ", word)
            if word in item:
                print("Word: ", word)
                # Add word to list of words in item
                # Remove word from item
Answer
Here is one way to do it:
def remove_stops(text, customStops):
    # Map each original string to the stopwords found in it.
    # Note: identical strings share a single entry.
    found = {k: [] for k in text}
    for i, item in enumerate(text):
        for word in customStops:
            # replace() leaves the string unchanged if word is absent.
            # This edits the caller's list in place.
            text[i] = text[i].replace(word, '')
            # item is still the original string, so this check is unaffected.
            if word in item:
                found[item].append(word)
    return text, found
text = ['Today is my lucky day!',
        'Tomorrow is rainy',
        'Please help!',
        'I want to fly']
customStops = ['help', 'fly']
clean, found = remove_stops(text, customStops)
print(clean)
print(found)
Output:
['Today is my lucky day!',
'Tomorrow is rainy',
'Please !',
'I want to ']
{'Today is my lucky day!': [],
'Tomorrow is rainy': [],
'Please help!': ['help'],
'I want to fly': ['fly']}
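Note that str.replace and the in operator match substrings, so a stopword such as 'fly' would also be removed from 'butterfly'. If whole-word matching is what you want, a variant along the following lines might work. This is only a sketch: the name remove_stops_whole_words is made up for illustration, it assumes the stopwords are plain words (letters, digits, underscores) so that \b word boundaries apply, and matching is case-sensitive. It returns new lists instead of editing the input, and keeps one list of found words per string so duplicate strings do not collide.

```python
import re

def remove_stops_whole_words(texts, custom_stops):
    """Return (cleaned, found) without modifying texts.

    cleaned[i] is texts[i] with whole-word stopwords removed;
    found[i] lists the stopwords matched in texts[i], in order of appearance.
    """
    if not custom_stops:
        # An empty alternation would match everywhere, so short-circuit.
        return list(texts), [[] for _ in texts]
    # Try longer stopwords first; re.escape guards against regex metacharacters.
    ordered = sorted(custom_stops, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    cleaned, found = [], []
    for item in texts:
        found.append(pattern.findall(item))
        cleaned.append(pattern.sub("", item))
    return cleaned, found

texts = ['Please help!', 'I want to fly', 'A butterfly']
cleaned, found = remove_stops_whole_words(texts, ['help', 'fly'])
print(cleaned)  # ['Please !', 'I want to ', 'A butterfly']
print(found)    # [['help'], ['fly'], []]
```

Because both results are lists aligned with the input, each could be assigned directly to a column if the strings come from a pandas DataFrame.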