python - TypeError: a bytes-like object is required, not 'str' when processing data from pd.read_csv
Question
I am trying out the code from this site: https://datanice.wordpress.com/2015/09/09/sentiment-analysis-for-youtube-channels-with-nltk/
The code that raises the error is:
import nltk
from nltk.probability import *
from nltk.corpus import stopwords
import pandas as pd
all = pd.read_csv("comments.csv")
stop_eng = stopwords.words('english')
customstopwords =[]
tokens = []
sentences = []
tokenizedSentences =[]
for txt in all.text:
    sentences.append(txt.lower())
    tokenized = [t.lower().encode('utf-8').strip(":,.!?") for t in txt.split()]
    tokens.extend(tokenized)
    tokenizedSentences.append(tokenized)
hashtags = [w for w in tokens if w.startswith('#')]
ghashtags = [w for w in tokens if w.startswith('+')]
mentions = [w for w in tokens if w.startswith('@')]
links = [w for w in tokens if w.startswith('http') or w.startswith('www')]
filtered_tokens = [w for w in tokens if not w in stop_eng and not w in customstopwords and w.isalpha() and not len(w)<3 and not w in hashtags and not w in ghashtags and not w in links and not w in mentions]
fd = FreqDist(filtered_tokens)
This gives me the following error:
tokenized = [t.lower().encode('utf-8').strip(":,.!?") for t in txt.split()]
TypeError: a bytes-like object is required, not 'str'
I create the CSV with the following code:
commentDataCsv = pd.DataFrame.from_dict(callFunction).to_csv("comments4.csv", encoding='utf-8')
I have replaced every pd.read_json("comments.csv") call with read_csv.
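Solution
In Python 3 the default string type is unicode (str). encode converts it to a bytestring (bytes). To call strip on a bytestring, the characters to strip must also be given as bytes: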
In [378]: u'one'.encode('utf-8')
Out[378]: b'one'
In [379]: 'one'.encode('utf-8').strip(':')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-379-98728e474af8> in <module>
----> 1 'one'.encode('utf-8').strip(':')
TypeError: a bytes-like object is required, not 'str'
In [381]: 'one:'.encode('utf-8').strip(b':')
Out[381]: b'one'
If you skip the encode step, you can strip with ordinary unicode characters:
In [382]: 'one:'.strip(':')
Out[382]: 'one'
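I suggest going this route. Otherwise the rest of your code would need b'...' literals as well: for example, w.startswith('#') also raises a TypeError when w is a bytes object.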
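As a minimal sketch of that suggestion, the tokenizing loop from the question can stay entirely in str by dropping the encode call. The tokenize helper, the PUNCT constant, and the sample comments list are hypothetical names introduced for illustration; in the original code the strings would come from all.text.

```python
# Hypothetical helper (not in the original post): tokenize one comment
# while keeping every token as str, so no encode() call is needed.
PUNCT = ":,.!?"

def tokenize(text):
    """Lowercase each whitespace-separated token and strip PUNCT from both ends."""
    return [t.lower().strip(PUNCT) for t in text.split()]

# Stand-in for the values of all.text from the question (assumed sample data).
comments = ["Great video!", "Check www.example.com, #python rocks."]

tokens = []
tokenizedSentences = []
for txt in comments:
    tokenized = tokenize(txt)
    tokens.extend(tokenized)
    tokenizedSentences.append(tokenized)

# Because the tokens are str, plain string arguments work here as well.
hashtags = [w for w in tokens if w.startswith('#')]
links = [w for w in tokens if w.startswith('http') or w.startswith('www')]
print(tokens)
```

With str tokens, the later startswith filters and the comparison against the NLTK stop word list (which holds str values) should work without any b prefixes.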