How to iterate over a dictionary and build a 2D array?

Problem description

I have a dictionary like this:

dic_parsed_sentences = {'religion': {'david': 1, 'joslin': 1, 'apolog': 5, 'jim': 1, 'meritt': 2}, 
 'sport': {'sari': 1, 'basebal': 1, 'kolang': 5, 'footbal': 1, 'baba': 2},
 'education': {'madrese': 1, 'kelas': 1, 'yahyah': 5, 'dars': 1},
 'computer': {'net': 1, 'internet': 1},
 'windows': {'copy': 1, 'right': 1}}

I want to loop over it according to the lengths of the nested dictionaries.

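For example,
it has two items of length 5, one item of length 4, and two items of length 2. I want to process items of the same length together (similar to group by in pandas).
So the output of the first iteration would look like this (note that only the items of length 5 appear here):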

[[david, joslin, apolog, jim, meritt],
 [sari, basebal, kolang, footbal, baba]]

The next iteration would produce the next group of items that share a length:

[[madrese, kelas, yahyah, dars]]

And the last iteration:

[[net, internet],
 [copy, right]]

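Why are there only three iterations here? Because dic_parsed_sentences contains nested dictionaries of only three distinct lengths. I tried something like the following, but I don't know how to iterate over the items of the same length: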

for i in dic_parsed_sentences.groupby(dic_parsed_sentences.same_length_items): # this line is pseudocode; I don't know how to write it (I mean: iterate over the nested dicts that have the same length)
    for index_file in dic_parsed_sentences:
        temp_sentence = dic_parsed_sentences[index_file]
        keys_words = list(temp_sentence.keys())
        for index_word in range(len(keys_words)):
            arr_sent_wids[index_sentence, index_word] = keys_words[index_word]
    index = index + 1
    index_sentence = index_sentence + 1

Update:

for length, dics in itertools.groupby(dic_parsed_sentences, len):
    for index_file in dics:
        temp_sentence = dics[index_file]
        keys_words = list(temp_sentence.keys())
        for index_word in range(len(keys_words)):
                test_sent_wids[index_sentence, index_word] = lookup_word2id(keys_words[index_word])
        index = index + 1
        index_sentence = index_sentence + 1

Tags: python, arrays, dictionary, multidimensional-array

Solution


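You can use itertools.groupby after sorting the nested dictionaries by length. The sort is needed because groupby only groups consecutive items. The attempt in the update does not work because iterating over dic_parsed_sentences yields its keys, so len measures the key strings rather than the nested dictionaries.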

import itertools

# groupby only groups consecutive items, so sort by length first
items = sorted(dic_parsed_sentences.values(), key=len, reverse=True)
for length, dics in itertools.groupby(items, len):
    # dics iterates over all the nested dictionaries with this length
    for temp_sentence in dics:
        keys_words = list(temp_sentence.keys())
        for index_word, word in enumerate(keys_words):
            # test_sent_wids, lookup_word2id, index and index_sentence come from the question's code
            test_sent_wids[index_sentence, index_word] = lookup_word2id(word)
        index = index + 1
        index_sentence = index_sentence + 1
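
The loop above relies on names from the question (test_sent_wids, lookup_word2id, index_sentence) that are not shown here. As a self-contained illustration of the same sort-then-groupby idea, here is a minimal sketch that builds the plain 2D lists of keys from the question's expected output. The helper name rows_by_length is an assumption made for this example and is not part of the original answer; it also assumes Python 3.7+, where dictionaries keep insertion order.

```python
import itertools

dic_parsed_sentences = {
    'religion': {'david': 1, 'joslin': 1, 'apolog': 5, 'jim': 1, 'meritt': 2},
    'sport': {'sari': 1, 'basebal': 1, 'kolang': 5, 'footbal': 1, 'baba': 2},
    'education': {'madrese': 1, 'kelas': 1, 'yahyah': 5, 'dars': 1},
    'computer': {'net': 1, 'internet': 1},
    'windows': {'copy': 1, 'right': 1},
}


def rows_by_length(nested):
    """Yield (length, rows): one 2D list of keys per distinct nested-dict length.

    Hypothetical helper written for this example.
    """
    # groupby only merges consecutive items, so sort by length first.
    # sorted() is stable, so dicts of equal length keep their original order.
    items = sorted(nested.values(), key=len, reverse=True)
    for length, group in itertools.groupby(items, key=len):
        # list(d) returns the keys of d in insertion order.
        yield length, [list(d) for d in group]


batches = list(rows_by_length(dic_parsed_sentences))
for length, rows in batches:
    print(length, rows)
```

Every row within a batch has the same length, so each batch is rectangular and could be passed to numpy.array if an actual array is needed. Mapping each word to an id (as lookup_word2id does in the question) would be applied to the keys before that step.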
