How to create DF from vocabulary using not standard delimiters?

Problem description

I tried to count word frequencies with a dictionary:

vocabulary = {}

for word in lemmatizer_results:
  if word in vocabulary:
    vocabulary[word] += 1
  else:
    vocabulary[word] = 1

After this, I tried to convert the results to a DataFrame via:

df = pd.DataFrame.from_dict(vocabulary, orient='index', columns=['word', 'frequency'])

It would have worked if the structure of the dictionary was like:

vocabulary = {'word1': [3], 
              'word2': [34]}

but I have a structure like this:

vocabulary = {'three': 1622,
 'elephant': 66,
 'power': 1070,
 'story': 667,
 'b': 65,
 'paterson': 1,}

Can you help me create a DataFrame from this data? Thank you!

Tags: python, pandas, dictionary, dataframe

Solution


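You are close. With orient='index', the dictionary keys become the DataFrame index and the values become the data, so only one column name ('frequency') is needed. You can then name the index 'word' and reset it to turn it into a regular column.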

df = pd.DataFrame.from_dict(vocabulary, orient='index', columns=['frequency'])\
                 .rename_axis('word').reset_index()

print(df)

       word  frequency
0     three       1622
1  elephant         66
2     power       1070
3     story        667
4         b         65
5  paterson          1
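
As an alternative to renaming and resetting the index, the dictionary's (key, value) pairs can be passed as rows, which lets both column names be given directly. This is a minimal sketch, not part of the original answer; the names df_items and df_sorted are chosen here for illustration, and the sorting step is optional.

```python
import pandas as pd

vocabulary = {'three': 1622, 'elephant': 66, 'power': 1070,
              'story': 667, 'b': 65, 'paterson': 1}

# Each (word, count) pair becomes one row, so two column names fit.
df_items = pd.DataFrame(list(vocabulary.items()), columns=['word', 'frequency'])

# Optional: list the most frequent words first and renumber the rows.
df_sorted = df_items.sort_values('frequency', ascending=False).reset_index(drop=True)

print(df_sorted)
```

Both approaches give the same two-column frame; this one avoids the intermediate index at the cost of building a list of pairs first.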
