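python - Most common words in Python, error at runtime (TypeError: unhashable type: 'list')
Problem description
I wrote the code below, but it raises an error when I run it (TypeError: unhashable type: 'list'). Can you help me? I want the most common words among my tokens.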
! pip install wget
import wget
url = 'https://raw.githubusercontent.com/dirkhovy/NLPclass/master/data/moby_dick.txt'
wget.download(url, 'moby_dick.txt')
documents = [line.strip() for line in open('moby_dick.txt', encoding='utf8').readlines()]
import spacy
nlp = spacy.load('en')
tokens = [[token.text for token in nlp(sentence)] for sentence in documents[:200]]
from collections import Counter
# your code here
# Pass the split_it list to instance of Counter class.
Counter = Counter(tokens)
# most_common() produces k frequently encountered
# input values and their respective counts.
most_occur = Counter.most_common(10)
print(most_occur)
Error:
TypeError Traceback (most recent call last) in ()
      4 # Pass the split_it list to instance of Counter class.
      5
----> 6 Counter = Counter(tokens)
      7
      8 # most_common() produces k frequently encountered
1 frames
/usr/lib/python3.6/collections/__init__.py in update(*args, **kwds)
    620 super(Counter, self).update(iterable) # fast path when counter is empty
    621 else:
--> 622 _count_elements(self, iterable)
    623 if kwds:
    624 self.update(kwds)
Solution
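! pip install wget
import wget
url = 'https://raw.githubusercontent.com/dirkhovy/NLPclass/master/data/moby_dick.txt'
wget.download(url, 'moby_dick.txt')
# Read the file and strip surrounding whitespace from each line.
with open('moby_dick.txt', encoding='utf8') as f:
    documents = [line.strip() for line in f]
import spacy
# 'en' is the model shortcut used in the question (spaCy 2.x);
# newer spaCy versions expect a full model name such as 'en_core_web_sm'.
nlp = spacy.load('en')
# Flatten: build one list of token strings instead of one list per sentence.
tokens = [token.text for sentence in documents[:200] for token in nlp(sentence)]
from collections import Counter
# Use a separate name so the Counter class itself is not overwritten.
word_counts = Counter(tokens)
most_occur = word_counts.most_common(10)
print(most_occur)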
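Change the list comprehension. The nested comprehension in the question builds a list of lists (one list of tokens per sentence). Counter stores each item as a dictionary key, so every item must be hashable, and a list is not, which is why it raises TypeError: unhashable type: 'list'. A single comprehension with two for clauses produces one flat list of token strings, which Counter can count.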
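The difference between the nested and the flat comprehension can be reproduced without spaCy or the Moby Dick file. This is only a minimal sketch: the two sample sentences are made up, and str.split() is used as a stand-in for the spaCy tokenizer, so the tokens will not match what nlp() would return.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample data; str.split() stands in for the spaCy tokenizer.
sentences = ["the whale and the sea", "the ship and the whale"]

# Nested comprehension: one list of tokens per sentence (a list of lists).
nested = [sentence.split() for sentence in sentences]

try:
    Counter(nested)  # each element is a list, and lists are not hashable
except TypeError as exc:
    error_message = str(exc)

# Flat comprehension: a single list of strings, which are hashable.
flat = [token for sentence in sentences for token in sentence.split()]
word_counts = Counter(flat)

# Alternative: flatten an existing list of lists instead of rebuilding it.
word_counts_from_nested = Counter(chain.from_iterable(nested))

print(error_message)
print(word_counts.most_common(1))
```

If the per-sentence token lists are still needed elsewhere, chain.from_iterable lets you keep the nested structure and flatten it only at the point where it is counted.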