How do I read a CSV after the metadata?

Problem description

I have a CSV file like this:

#Description
#Param1: value
#Param2: value
...
#ParamN: value

Time (s),Header1,Header2
243.41745,3,1
243.417455,3,5
243.41746,7,6
...

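I need to read it with Python, without using Pandas. How can I read the CSV data itself while ignoring the initial lines up to the blank line? I am using the code below, which reads the metadata successfully.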

import csv
import re

def read(file_path: str):
    '''Read the data of the Digilent WaveForms Logic Analyzer Acquisition
    (model Discovery2).

    Parameter: File path.
    '''
    meta = {}
    RE_CONFIG = re.compile(r'^#(?P<name>[^:]+)(: *(?P<value>.+)\s*$)*')
    with open(file_path, 'r') as fh:
        # Read the metadata and description at the beginning of the file.
        for line in fh.readlines():
            line = line.strip()
            if not line:
                break
            config = RE_CONFIG.match(line)
            if config:
                if not config.group('value'):
                    meta.update({'Description': config.group('name')})
                else:
                    meta.update({config.group('name'): config.group('value')})
        # Read the data itself.
        data = csv.DictReader(fh, delimiter=',')
    return data, meta

Tags: python, csv

Solution


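This seems to work. In the part that reads the metadata I had to change for line in fh.readlines(): to for line in fh: so that the data lines are not consumed along with the metadata. fh.readlines() reads the whole file into a list up front, which leaves nothing for the csv reader; iterating over fh directly stops at the blank line and leaves the rest of the file unread. After that, create a DictReader and use it to get data.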
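
The effect of that one-line change can be checked without a file on disk. The sketch below is only an illustration, not part of the original answer: SAMPLE and rows_after_blank_line are hypothetical names, and io.StringIO stands in for the real file.

```python
import csv
import io

# Shortened stand-in for the file in the question (assumed content).
SAMPLE = (
    "#Description\n"
    "#Param1: value\n"
    "\n"
    "Time (s),Header1,Header2\n"
    "243.41745,3,1\n"
)

def rows_after_blank_line(use_readlines: bool) -> list:
    """Skip lines up to the first blank one, then parse the rest as CSV."""
    fh = io.StringIO(SAMPLE)
    # readlines() consumes the whole stream at once; plain iteration
    # consumes only the lines the loop actually visits.
    lines = fh.readlines() if use_readlines else fh
    for line in lines:
        if not line.strip():
            break
    return list(csv.DictReader(fh))

print(rows_after_blank_line(use_readlines=True))   # []
print(rows_after_blank_line(use_readlines=False))
# [{'Time (s)': '243.41745', 'Header1': '3', 'Header2': '1'}]
```

With readlines() the stream is already at its end when the DictReader is created, so no rows come back; with direct iteration the reader picks up at the header line.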

import csv
from pprint import pprint
import re

def read(file_path: str):
    '''Read the data of the Digilent WaveForms Logic Analyzer Acquisition
    (model Discovery2).

    Parameter: File path.
    '''
    meta = {}
    RE_CONFIG = re.compile(r'^#(?P<name>[^:]+)(: *(?P<value>.+)\s*$)*')
    with open(file_path, 'r') as fh:
        # Read the metadata and description at the beginning of the file.
        for line in fh:  # CHANGED: iterate lazily instead of fh.readlines()
            line = line.strip()
            if not line:
                break
            config = RE_CONFIG.match(line)
            if config:
                if not config.group('value'):
                    meta.update({'Description': config.group('name')})
                else:
                    meta.update({config.group('name'): config.group('value')})

        # Read the data itself.
        reader = csv.DictReader(fh, delimiter=',')
        data = list(reader)

    return data, meta

res = read('mixed.csv')
pprint(res)
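
One thing to keep in mind about the result: csv.DictReader returns every field as a string, so the rows contain '243.41745' rather than 243.41745. If numbers are needed, a conversion pass along the following lines may help. This is a sketch only: to_numbers is a hypothetical helper, not part of the answer above, and it assumes every column is numeric.

```python
def to_numbers(rows):
    """Return a copy of rows with each value converted to float.

    Assumes every field is numeric; float() raises ValueError otherwise.
    """
    return [{key: float(value) for key, value in row.items()} for row in rows]

# Rows shaped like the ones csv.DictReader yields for the sample file.
rows = [
    {'Time (s)': '243.41745', 'Header1': '3', 'Header2': '1'},
    {'Time (s)': '243.417455', 'Header1': '3', 'Header2': '5'},
]
print(to_numbers(rows)[0])
# {'Time (s)': 243.41745, 'Header1': 3.0, 'Header2': 1.0}
```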
