Creating a nested dictionary from a pandas DataFrame

Problem description

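I have a DataFrame that represents a hierarchy of meters. A meter has an ID and can have any number of children; each child can have children of its own, and so on without limit.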

The DataFrame has one meter per row, and each level of children is shown in its own column. It looks like this:

[Image: hierarchy table]
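The table itself is not reproduced in this copy. Judging from the target output and the code in the question, it probably has one column per nesting level, with each meter ID placed in the column for its depth and NaN everywhere else. A minimal sketch of what the first few rows might look like, assuming that layout (the column names `level_0` to `level_2` are hypothetical):

```python
import pandas as pd

nan = float("nan")  # empty cells are assumed to be NaN

# Hypothetical reconstruction of the first rows of the hierarchy table.
# Each row holds exactly one meter; its column index is its nesting depth.
df = pd.DataFrame(
    [
        ("a", nan, nan),  # top-level meter
        (nan, "b", nan),  # child of a
        (nan, "c", nan),  # child of a
        (nan, nan, "d"),  # child of c
        (nan, "e", nan),  # child of a
        ("f", nan, nan),  # next top-level meter
    ],
    columns=["level_0", "level_1", "level_2"],
)
```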

The goal is to convert it into a nested dictionary in the following format:

{
    "meters": [
        {
            "meter_id": "a",
            "meter_children": [
                {
                    "meter_id": "b",
                    "meter_children": []
                },
                {
                    "meter_id": "c",
                    "meter_children": [
                        {
                            "meter_id": "d",
                            "meter_children": []
                        }
                    ]
                },
                {
                    "meter_id": "e",
                    "meter_children": []
                }
            ]
        },
        {
            "meter_id": "f",
            "meter_children": []
        },
        {
            "meter_id": "g",
            "meter_children": []
        },
        {
            "meter_id": "h",
            "meter_children": []
        },
        {
            "meter_id": "i",
            "meter_children": []
        },
        {
            "meter_id": "j",
            "meter_children": []
        },
        {
            "meter_id": "k",
            "meter_children": []
        },
        {
            "meter_id": "l",
            "meter_children": [
                {
                    "meter_id": "m",
                    "meter_children": []
                },
                {
                    "meter_id": "n",
                    "meter_children": []
                },
                {
                    "meter_id": "o",
                    "meter_children": []
                }
            ]
        },
        {
            "meter_id": "p",
            "meter_children": []
        },
        {
            "meter_id": "q",
            "meter_children": []
        },
        {
            "meter_id": "r",
            "meter_children": []
        },
        {
            "meter_id": "s",
            "meter_children": []
        },
        {
            "meter_id": "t",
            "meter_children": []
        },
        {
            "meter_id": "u",
            "meter_children": []
        }
    ]
}

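I have managed to achieve this with the horrible code you can see below (sorry). I would like to know whether there is a tool that does this for you, or whether there is a cleaner, more readable way to do it.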

Note that this only handles up to 4 levels of nesting, but it can easily be extended further.

results = {}
list_0 = []

for row in df.values:
    
    counter = 0
    
    for entry in row:
        
        if entry==entry:  # NaN != NaN, so this skips empty (NaN) cells
            
            entry=str(entry)
        
            if counter==0:
                
                list_0.append({
                    "meter_id":entry,
                    "meter_children":[]
                })
                meter_0 = entry
                
                list_1 = []
                
            if counter==1:
                            
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        list_1.append({
                            "meter_id":entry,
                            "meter_children":[]
                        })
                        item["meter_children"]=list_1
    
                        meter_1=entry
                        
            
                list_2=[]
                
            if counter==2:
                
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        for item in list_1:
                            
                            if meter_1 in item.values():
                                
                                list_2.append({
                                    "meter_id":entry,
                                    "meter_children":[]
                                })
                                item["meter_children"]=list_2
                                
                                meter_2=entry  # was meter_3; the counter==3 branch reads meter_2
                                 
                list_3=[]
                                    
            if counter==3:
                
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        for item in list_1:
                            
                            if meter_1 in item.values():
                                
                                for item in list_2:
                                    
                                    if meter_2 in item.values():
                                        
                                        list_3.append({
                                            "meter_id":entry,
                                            "meter_children":[]
                                        })
                                        item["meter_children"]=list_3

                                        meter_3=entry  # was meter_4; keeps the naming consistent with the levels above
                                        
                list_4=[]
                
        counter+=1
                
results["meters"] = list_0

Tags: python, json, pandas, dictionary

Solution


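You can certainly improve your code to make it more efficient, but as far as I know your problem is too specific for an off-the-shelf solution, sorry...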

To improve your code and generalize it to an arbitrary (unknown) number of levels, I see two solutions:

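  • Write a recursive function that, at level n, does what you want with level n+1
  • Write a while loop that builds your dictionary row by row, using the contents of df.iterrows()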
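Both ideas can be sketched roughly as follows. This is only an illustration, not a definitive implementation. It assumes that each row contains exactly one non-null cell and that the column position of that cell is the meter's depth. The function names `row_cells`, `build_iterative` and `build_recursive`, the sample frame and its column names are all made up for the example. The loop version uses a `for` loop over `df.iterrows()` rather than a `while` loop.

```python
import pandas as pd


def row_cells(df):
    """Yield (depth, meter_id) for the first non-null cell of each row."""
    for _, row in df.iterrows():
        for depth, value in enumerate(row):
            if not pd.isna(value):
                yield depth, str(value)
                break  # assumption: exactly one meter per row


def build_iterative(df):
    """Loop version: keep one open children list per nesting level."""
    root = {"meters": []}
    open_lists = [root["meters"]]  # open_lists[d] receives meters of depth d
    for depth, meter_id in row_cells(df):
        node = {"meter_id": meter_id, "meter_children": []}
        del open_lists[depth + 1:]  # close levels deeper than this meter
        open_lists[depth].append(node)
        open_lists.append(node["meter_children"])
    return root


def build_recursive(cells, start=0, depth=0):
    """Recursive version: collect siblings at `depth`, recursing for depth + 1.

    Returns (nodes, index of the first cell that was not consumed).
    """
    nodes = []
    i = start
    while i < len(cells) and cells[i][0] == depth:
        children, next_i = build_recursive(cells, i + 1, depth + 1)
        nodes.append({"meter_id": cells[i][1], "meter_children": children})
        i = next_i
    return nodes, i


# Hypothetical sample data in the assumed layout (NaN marks an empty cell).
nan = float("nan")
df = pd.DataFrame(
    [("a", nan, nan), (nan, "b", nan), (nan, "c", nan),
     (nan, nan, "d"), (nan, "e", nan), ("f", nan, nan)],
    columns=["level_0", "level_1", "level_2"],
)

tree = build_iterative(df)
nodes, _ = build_recursive(list(row_cells(df)))
results = {"meters": nodes}  # same structure as `tree`
```

Neither function hard-codes the number of levels, so a deeper table needs no code changes. Both assume a well-formed table: if a row skips a level (for example, a depth-2 meter directly under a depth-0 meter), the loop version raises an IndexError and the recursive version stops early, so real data may need validation first.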
