首页 > 解决方案 > 使用列表生成 json 布局

问题描述

我正在尝试构建 JSON 布局。我正在从输入文件中读取所有这些记录。文件中可能有多个具有相同键(Id)的记录。

示例输入文件:

Id,LineNo,Amt,ReceivedDt,FromDt,ToDate,regionId
123545,1,1000.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,3,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123546,1,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
123546,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13

我的逻辑是以字典格式从文件中读取记录并继续将其附加到列表中,直到相同的键(Id)匹配。如果键停止匹配,则删除列表并附加新键,然后将记录与此新键进行比较。在这两者之间,需要存储结果,这样我就不会丢失以前处理过的记录。(这是我无法弄清楚的)。

代码 :

import json,csv

with open('Test.csv') as f:
    inputfile = csv.DictReader(f)
    output = []
    key =1
    for row in inputfile :

        if len(output)==0:
            output.append(row)

        elif len(output)>0:
            if row['Id']==key:
                output.append(row)

            else:
                del output[:]
                output.append(row)
        key=row['Id']
        data = json.dumps({"data":output}, indent=4)

print(data)

输出: 只有最后 2 行来了,因为第一组被删除了。
请建议如何存储这些行。

{
    "data": [
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA13",
            "Id": "123546",
            "LineNo": "1",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        },
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA13",
            "Id": "123546",
            "LineNo": "2",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        }
    ]
}

期望的输出:

{
    "data": [
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA12",
            "Id": "123545",
            "LineNo": "1",
            "Amt": "1000.00",
            "FromDt": "2019-02-01T00:00:00"
        },
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA12",
            "Id": "123545",
            "LineNo": "2",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        },
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA12",
            "Id": "123545",
            "LineNo": "3",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        }
    ]
},
{
    "data": [
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA13",
            "Id": "123546",
            "LineNo": "1",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        },
        {
            "ToDate": "2019-02-01T00:00:00",
            "ReceivedDt": "2019-02-01T00:00:00",
            "regionId": "WA13",
            "Id": "123546",
            "LineNo": "2",
            "Amt": "200.00",
            "FromDt": "2019-02-01T00:00:00"
        }
    ]
}

标签: pythonjson

解决方案


使用itertools.groupby

import csv
import json
import itertools
import operator

with open('Test.csv') as f:
    cf = csv.DictReader(f)
    output = [{'data': list(rows)} 
        for id_, rows in itertools.groupby(cf, key=operator.itemgetter('Id'))]
data = json.dumps(output, indent=4)
print(data)

推荐阅读