python - How to create a DataFrame from a vocabulary dictionary using non-standard delimiters?
Problem description
I tried to count word frequencies using a dictionary called vocabulary:
vocabulary = {}
for word in lemmatizer_results:
    if word in vocabulary:
        vocabulary[word] += 1
    else:
        vocabulary[word] = 1
After this, I tried to convert the results to a DataFrame via:
df = pd.DataFrame.from_dict(vocabulary, orient='index', columns=['word', 'frequency'])
I think it would have worked if the dictionary were structured like this:
vocabulary = {'word1': [3],
              'word2': [34]}
but I have a structure like this:
vocabulary = {'three': 1622,
              'elephant': 66,
              'power': 1070,
              'story': 667,
              'b': 65,
              'paterson': 1}
Can you help me create a DataFrame from this data? Thank you!
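Solution
You are close. With orient='index', the dictionary keys become the DataFrame index and the values become the data. Because each value is a single number, only one column name ('frequency') is needed. You can then name the index axis 'word' and reset it so that it becomes a regular column.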
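import pandas as pd

# Keys become the index, values fill the single 'frequency' column;
# rename_axis names the index 'word', reset_index turns it into a column.
df = pd.DataFrame.from_dict(vocabulary, orient='index', columns=['frequency'])\
       .rename_axis('word').reset_index()
print(df)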
       word  frequency
0     three       1622
1  elephant         66
2     power       1070
3     story        667
4         b         65
5  paterson          1
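As an alternative to the answer above, the two-column frame can also be built from the dictionary's (key, value) pairs, which lets both column names be passed directly. The sketch below also swaps the manual counting loop for collections.Counter, which is a substitution on my part and not something the question used. The short lemmatizer_results list here is a made-up stand-in for the asker's real data.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the asker's lemmatizer_results list.
lemmatizer_results = ["three", "elephant", "three", "power", "story", "three"]

# Counter performs the same counting as the manual if/else loop.
vocabulary = Counter(lemmatizer_results)

# Each (word, count) pair becomes one row, so both column names fit.
df = pd.DataFrame(list(vocabulary.items()), columns=["word", "frequency"])

# Optional: put the most frequent words first.
df = df.sort_values("frequency", ascending=False).reset_index(drop=True)
print(df)
```

The same pd.DataFrame(list(vocabulary.items()), ...) call should work on the plain dictionary from the question as well, since Counter is a dict subclass.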