How to make a bag of words from a text file in Python using the split method

Question

I am trying to learn TF-IDF, but I cannot extract the words from a file.

Code:

docA = open("/home/user/Desktop/da/doca","r")
print(docA.read())
bowA = docA.split(" ")

Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-32-06e07f9dd975> in <module>
----> 1 bowA = docA.split(" ")

AttributeError: '_io.TextIOWrapper' object has no attribute 'split'
Can anyone help me solve this?

Tags: python, tf-idf

Answer


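open() returns a file object, not a string, so it has no split() method. You have to read the contents first. I assume you meant: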

docA = open("/home/user/Desktop/da/doca","r")
# print(docA.read())
bowA = docA.read().split(" ") # or just split() will do
docA.close()
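
The comment about split(" ") versus split() is easier to see on a small string. The sketch below uses a made-up sample string (not the contents of the original file) to show that split(" ") splits on single spaces only, while split() with no argument splits on any run of whitespace, including newlines:

```python
# Hypothetical sample text; note the double space and the newline.
text = "one  two\nthree"

by_space = text.split(" ")   # splits on each single space only
by_whitespace = text.split() # splits on any run of whitespace

print(by_space)       # ['one', '', 'two\nthree']
print(by_whitespace)  # ['one', 'two', 'three']
```

For a file with more than one line, split() is usually the one you want, since split(" ") leaves newlines attached to words and produces empty strings where spaces repeat.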

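Calling read() reads the entire file and leaves the read cursor at the end, so calling read() again returns an empty string. That is why the print(docA.read()) line is commented out above. If you also want to print the contents, assign them to a variable, print it, and then use it as needed: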

docA = open("/home/user/Desktop/da/doca","r")
data = docA.read()
print(data)
bowA = data.split()
docA.close()
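
The read-cursor behaviour can be checked directly. This is a minimal sketch that writes a made-up sample string to a temporary file (the file name and text are assumptions for illustration, not the original doca file), reads it twice, and then rewinds with seek(0):

```python
import os
import tempfile

# Hypothetical sample text standing in for the contents of the original file.
sample_text = "the quick brown fox jumps over the lazy dog"

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample_text)
    path = tmp.name

with open(path, "r") as doc:
    first = doc.read()   # reads the whole file; the cursor is now at the end
    second = doc.read()  # nothing left to read, so this is an empty string
    doc.seek(0)          # move the cursor back to the start of the file
    third = doc.read()   # the full contents again

os.remove(path)  # clean up the temporary file

print(repr(second))    # ''
print(first == third)  # True
```

Rewinding with seek(0) works, but storing the result of a single read() in a variable is simpler when the file fits in memory.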

Or, more simply, use a with block so the file is closed automatically:

with open("/home/user/Desktop/da/doca","r") as docA:
    data = docA.read()
print(data)
bowA = data.split()
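
Since the goal is a bag of words for TF-IDF, one possible next step is to count how often each word occurs. The sketch below is only an illustration under assumptions: the data string is made up and stands in for the result of docA.read(), and collections.Counter is one convenient way to hold the counts, not something the original answer prescribes:

```python
from collections import Counter

# Hypothetical text standing in for data = docA.read()
data = "the cat sat on the mat\nthe dog sat"

bowA = data.split()          # list of words, split on any whitespace
word_counts = Counter(bowA)  # maps each word to its number of occurrences

print(bowA)
print(word_counts["the"])  # 3
print(word_counts["sat"])  # 2
```

Dividing each count by len(bowA) would give a simple term frequency for each word.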
