Converting a text file to nested JSON in Python

Problem description

My data looks like the following. This is the input file:

storeId,id,itemId,description
123,1,101,item_1
123,1,102,item_2
123,1,103,item_3
123,2,201,item_4
123,2,202,item_5

I want to parse it with Python and write the equivalent JSON to a file in the following format:

[{
    "storeId": 123,
    "itemType": [{
        "id": 1,
        "items": [{
            "itemId": 101,
            "description": "item_1"
        }, {
            "itemId": 102,
            "description": "item_2"
        }, {
            "itemId": 103,
            "description": "item_3"
        }]
    }, {
        "id": 2,
        "items": [{
            "itemId": 201,
            "description": "item_4"
        }, {
            "itemId": 202,
            "description": "item_5"
        }]
    }]
}]

I am confused about how to achieve this. Can anyone help me? I am very new to Python.

Tags: python, json

Solution


You can use itertools.groupby:

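import csv
import itertools
import json

# Read the CSV, skip the header row, and convert all-digit fields to int.
with open('filename.csv', newline='') as f:
    data = [[int(b) if b.isdigit() else b for b in row] for row in csv.reader(f)][1:]
# Alternating (key name, child-list name) pairs, followed by the leaf field names.
headers = ['storeId', 'itemType', 'id', 'items', 'itemId', 'description']

def create_structure(d, headers=headers):
    # groupby only merges adjacent rows, so sort by the first column before grouping.
    c = [[a, list(b)] for a, b in itertools.groupby(sorted(d, key=lambda x: x[0]), key=lambda x: x[0])]
    rest = headers[2:]  # names for the next nesting level (or the leaf fields)
    return [{headers[0]: a, headers[1]: create_structure([i for _, *i in b], headers=rest) if len(rest) > 2 else [dict(zip(rest, i)) for _, *i in b]} for a, b in c]

print(json.dumps(create_structure(data), indent=4))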

Output:

[
    {
        "storeId": 123,
        "itemType": [
            {
                "id": 1,
                "items": [
                    {
                        "itemId": 101,
                        "description": "item_1"
                    },
                    {
                        "itemId": 102,
                        "description": "item_2"
                    },
                    {
                        "itemId": 103,
                        "description": "item_3"
                    }
                ]
            },
            {
                "id": 2,
                "items": [
                    {
                        "itemId": 201,
                        "description": "item_4"
                    },
                    {
                        "itemId": 202,
                        "description": "item_5"
                    }
                ]
            }
        ]
    }
]
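
If the recursive one-liner is hard to follow, the same grouping can probably be written more explicitly. The following is only a sketch, not the method from the answer above: it swaps itertools.groupby for plain dictionaries and csv.DictReader, and it assumes the column names from the question. The names rows_to_nested, save_json, and sample are made up for illustration, and the sample rows are inlined so the block runs on its own.

```python
import csv
import io
import json

def rows_to_nested(rows):
    """Group flat rows into store -> itemType -> items, keeping first-seen order."""
    stores = {}  # storeId -> {type id -> list of item dicts}
    for row in rows:
        types = stores.setdefault(int(row["storeId"]), {})
        items = types.setdefault(int(row["id"]), [])
        items.append({"itemId": int(row["itemId"]),
                      "description": row["description"]})
    return [
        {"storeId": store_id,
         "itemType": [{"id": type_id, "items": items}
                      for type_id, items in types.items()]}
        for store_id, types in stores.items()
    ]

def save_json(data, path):
    """Write the nested structure to a file (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(data, f, indent=4)

# Sample rows copied from the question. For a real file, pass
# open('filename.csv', newline='') to csv.DictReader instead of io.StringIO.
sample = """storeId,id,itemId,description
123,1,101,item_1
123,1,102,item_2
123,1,103,item_3
123,2,201,item_4
123,2,202,item_5
"""

nested = rows_to_nested(csv.DictReader(io.StringIO(sample)))
print(json.dumps(nested, indent=4))
```

Because the dictionaries preserve insertion order, no sorting is needed, and rows for the same store or type do not have to be adjacent in the input. Since the question asks for the result to be written to a file, calling save_json(nested, 'output.json') (a file name chosen here only as an example) would do that instead of printing.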
