Converting a list of files into a tree-like dictionary

Problem description

Suppose I have a list like this:

list_all_files = [['folder1', 'subfolder1', 'file1'], 
                  ['folder1', 'subfolder1', 'file2'],
                  ['folder1', 'subfolder1', 'file3'],
                  ['folder1', 'subfolder1', 'file4'],
                  ['folder1', 'subfolder2', 'file1'],
                  ['folder1', 'subfolder2', 'file2'],
                  ['folder2', 'subfolder1', 'file1'],
                  ['folder2', 'subfolder1', 'file2'],
                  ['folder3', 'file1'],
                  ['folder3', 'file2'],
                  ['folder4', 'subfolder1', 'file1'],
                  ['folder4', 'subfolder1', 'file2'],
                  ['folder2', 'subfolder2', 'file1'],
                  ['folder2', 'subfolder2', 'file2'],
                  ['folder2', 'subfolder2', 'file3'],
                  ['folder2', 'subfolder2', 'file4']]

"list_all_files" is only an example; the list may contain zero or n folders and/or subfolders and/or files. How can I convert it into a dictionary like the following?

dict_all_files =

{    'folder1': {'subfolder1': {'file1', 'file2', 'file3', 'file4'},
                 'subfolder2': {'file1', 'file2'}},
     'folder2': {'subfolder1': {'file1', 'file2'},
                 'subfolder2': {'file1', 'file2', 'file3', 'file4'}},
     'folder3': {'file1', 'file2'},
     'folder4': {'subfolder1': {'file1', 'file2'}}    }

I tried iterating over the list and using dict.update(), like this:

dict_all_files = {}
for member in list_all_files:
    if member[0] == 'folder1':
        dict_all_files.update({'folder1': ''})
        for element in member:
            if member[1] == 'subfolder1':
                dict_all_files.update({'folder1': member[1]})

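But then I overwrite the folders, and I would have to write an if statement by hand for every folder and subfolder, which is not practical. So there is little point in patching my code, because the approach itself is flawed. Maybe I am thinking about this the wrong way from the start? It would be great if someone could provide an answer, or at least a hint. I have not found any existing question that covers this or a similar problem.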

Tags: python

Solution


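You can use dict.setdefault to clean up your code. It returns the value stored under a key and, if the key is missing, first inserts the default you pass in, so no per-folder if statements are needed.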
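To make the mechanism concrete, here is a minimal sketch of how dict.setdefault behaves on a missing key and on an existing key. The names tree, inner, same and the 'marker' key are illustrative only and do not come from the question.

```python
# Minimal demonstration of dict.setdefault (all names are illustrative).
tree = {}

# Key is missing: the default {} is stored under 'folder1' and returned.
inner = tree.setdefault('folder1', {})
inner['marker'] = 1

# Key is present: the existing value is returned and the new default is ignored,
# so nothing that was stored earlier gets overwritten.
same = tree.setdefault('folder1', {})

print(same is inner)  # True
print(tree)           # {'folder1': {'marker': 1}}
```

This is the property that avoids the overwriting seen with dict.update(): repeated calls for the same folder name keep returning the same nested container.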

import pprint
list_all_files = [['folder1', 'subfolder1', 'file1'], 
                  ['folder1', 'subfolder1', 'file2'],
                  ['folder1', 'subfolder1', 'file3'],
                  ['folder1', 'subfolder1', 'file4'],
                  ['folder1', 'subfolder2', 'file1'],
                  ['folder1', 'subfolder2', 'file2'],
                  ['folder2', 'subfolder1', 'file1'],
                  ['folder2', 'subfolder1', 'file2'],
                  ['folder3', 'file1'],
                  ['folder3', 'file2'],
                  ['folder4', 'subfolder1', 'file1'],
                  ['folder4', 'subfolder1', 'file2'],
                  ['folder2', 'subfolder2', 'file1'],
                  ['folder2', 'subfolder2', 'file2'],
                  ['folder2', 'subfolder2', 'file3'],
                  ['folder2', 'subfolder2', 'file4']]

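result = {}
for path in list_all_files:
    head = result
    # Walk down (creating as needed) every folder except the last one.
    for name in path[:-2]:
        head = head.setdefault(name, {})
    # The last folder maps to a set of file names; add this path's file to it.
    head.setdefault(path[-2], set()).add(path[-1])

pprint.pprint(result)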

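Output (as printed by Python 2; Python 3 shows each set as a {'file1', ...} literal, and the order of elements within a set may vary)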

{'folder1': {'subfolder1': set(['file1', 'file2', 'file3', 'file4']),
             'subfolder2': set(['file1', 'file2'])},
 'folder2': {'subfolder1': set(['file1', 'file2']),
             'subfolder2': set(['file1', 'file2', 'file3', 'file4'])},
 'folder3': set(['file1', 'file2']),
 'folder4': {'subfolder1': set(['file1', 'file2'])}}
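The loop above assumes every path has at least one folder followed by a file name, and that no folder holds both files and subfolders. A one-element path raises IndexError at path[-2], and a folder with mixed content raises AttributeError. The following is only a sketch of one way to wrap the same setdefault logic in a function that reports those cases explicitly; the function name build_tree and the choice to raise ValueError are assumptions, not part of the original answer.

```python
def build_tree(paths):
    """Nest folders as dicts; the last folder of each path maps to a set of files.

    Hypothetical helper: the name and the error handling are illustrative.
    """
    tree = {}
    for path in paths:
        if len(path) < 2:
            # Assumption: a path needs at least one folder and one file name.
            raise ValueError(f"path too short: {path!r}")
        # Split into leading folders, the folder that holds the file, and the file.
        *folders, parent, filename = path
        node = tree
        for name in folders:
            node = node.setdefault(name, {})
            if not isinstance(node, dict):
                # This name was already used as a leaf folder holding a set of files.
                raise ValueError(f"{name!r} already holds files: {path!r}")
        files = node.setdefault(parent, set())
        if not isinstance(files, set):
            # This name was already used as a folder holding subfolders.
            raise ValueError(f"{parent!r} already holds subfolders: {path!r}")
        files.add(filename)
    return tree


sample = [['folder1', 'subfolder1', 'file1'],
          ['folder1', 'subfolder1', 'file2'],
          ['folder3', 'file1']]
tree = build_tree(sample)
print(tree == {'folder1': {'subfolder1': {'file1', 'file2'}},
               'folder3': {'file1'}})  # True
print(build_tree([]))                  # {}
```

An empty input list simply yields an empty dictionary. If top-level files or mixed folders need to be supported, the target structure from the question would have to change, since a set cannot also hold subfolders.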
