Python: Parsing a structured CSV file into a dict

Question

Any ideas on how to parse a CSV file with a multi-line record structure, such as:

H3|509596|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 1
D2|1653128|1|900|390|
T1|1653128|999|1000.000|
H3|509597|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 2
D2|1653128|1|900|390|
D2|1653128|2|600|430|
T1|1653128|999|2164.000|

I would like to read the contents and parse them into a list of dicts, like this:

List of Dict: 
 [ (
     Header : (509596, 'OUT', 1653128, '06/11/2018')
     Items  : [ (1653128, 1, 390, 'MXT586', 'EA', 'EA', ....,
                  (1, 900, 390) ) ]
     Trailer: (1653128, 999, 1000)
    ), ...
 ]

Tags: python, csv

Answer


Python's csv library can read this file if you specify | as the delimiter. Take care to remove any empty trailing entries, because some lines end with a |.
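As a small illustration of the trailing-delimiter issue, the sketch below reads one header line from the sample data through csv.reader. The in-memory io.StringIO source is only an assumption for the demo; the real data would come from a file.

```python
import csv
import io

# One header line from the sample data; note the trailing '|'
sample = "H3|509596|OUT|1653128|06/11/2018|\n"

row = next(csv.reader(io.StringIO(sample), delimiter='|'))
print(row)      # ['H3', '509596', 'OUT', '1653128', '06/11/2018', '']

# Dropping empty fields removes the artefact left by the trailing delimiter
cleaned = [v for v in row if v]
print(cleaned)  # ['H3', '509596', 'OUT', '1653128', '06/11/2018']
```

Note that filtering on truthiness also drops empty fields in the middle of a line, which is fine for this sample but would shift positions if a real field could be blank.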

import csv

def get_int(v):
    # Attempt to convert the value into an integer
    try:
        return int(v)
    except ValueError:
        return v    # Not an integer: return the original string

def filter_na(row):
    # Drop the record tag (row[0]) and any empty fields, converting integers
    return tuple(get_int(v) for v in row[1:] if v)

data = []

with open('input.csv', newline='') as f_input:
    csv_input = csv.reader(f_input, delimiter='|')
    block = {}

    for row in csv_input:
        if not row:
            continue    # Skip blank lines
        if row[0] == 'H3':
            # Each header starts a new dict, so blocks do not share state
            block = {'Header': filter_na(row), 'Items': []}
        elif row[0].startswith('D'):
            # Assumes every block begins with an H3 line
            block['Items'].append(filter_na(row))
        elif row[0] == 'T1':
            block['Trailer'] = filter_na(row)
            data.append(block)

print(data)

This gives you a list of dicts, one per H3 to T1 block (output reformatted here for readability). Values that are not whole numbers, such as '55.600', stay as strings:

[
    {
        'Header': (509596, 'OUT', 1653128, '06/11/2018'),
        'Items': [(1653128, 1, 390, 'MXT586', 'EA', 'EA', '55.600', '219.99', 'Product 1'), (1653128, 1, 900, 390)],
        'Trailer': (1653128, 999, '1000.000')
    },
    {
        'Header': (509597, 'OUT', 1653128, '06/11/2018'),
        'Items': [(1653128, 1, 390, 'MXT586', 'EA', 'EA', '55.600', '219.99', 'Product 2'), (1653128, 1, 900, 390), (1653128, 2, 600, 430)],
        'Trailer': (1653128, 999, '2164.000')
    }
]
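The question's desired output shows the trailer total as a number rather than a string. As a sketch of one possible variation (not part of the original answer), the hypothetical helpers convert and parse_blocks below try int and then float for each field, and build a fresh dict for every H3 line so that blocks never share state. The sample is read from an in-memory string purely for demonstration.

```python
import csv
import io

def convert(v):
    # Try int first, then float; fall back to the original string
    for cast in (int, float):
        try:
            return cast(v)
        except ValueError:
            pass
    return v

def parse_blocks(lines):
    """Group H3 / D* / T1 records into one dict per block (hypothetical helper)."""
    blocks = []
    block = None
    for row in csv.reader(lines, delimiter='|'):
        if not row:
            continue                      # skip blank lines
        fields = tuple(convert(v) for v in row[1:] if v)
        if row[0] == 'H3':
            block = {'Header': fields, 'Items': []}
        elif row[0].startswith('D') and block is not None:
            block['Items'].append(fields)
        elif row[0] == 'T1' and block is not None:
            block['Trailer'] = fields
            blocks.append(block)
            block = None                  # wait for the next header
    return blocks

sample = """\
H3|509596|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 1
D2|1653128|1|900|390|
T1|1653128|999|1000.000|
H3|509597|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 2
D2|1653128|1|900|390|
D2|1653128|2|600|430|
T1|1653128|999|2164.000|
"""

blocks = parse_blocks(io.StringIO(sample))
print(len(blocks))             # 2
print(blocks[0]['Trailer'])    # (1653128, 999, 1000.0)
```

Converting with float is a design choice that suits totals and weights; if exact decimal values matter (prices, for example), decimal.Decimal would be a safer cast.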
