Split a CSV or XLSX by date with a maximum file size

Problem description

To split a file into pieces of an approximate size (based on a maximum number of rows), I use the following:

import csv
import os


def split(filehandler, delimiter=',', row_limit=70000,
          output_name_template='LARGE EXCEL/OUTPUT/output_%s.csv', output_path='.', keep_headers=True):
    """Split a CSV into numbered pieces of at most row_limit data rows each."""
    reader = csv.reader(filehandler, delimiter=delimiter)
    # Read the header row once so it can be repeated in every piece.
    headers = next(reader, None) if keep_headers else None
    current_piece = 0
    current_out_file = None
    try:
        for i, row in enumerate(reader):
            # Start a new output file every row_limit data rows.
            if i % row_limit == 0:
                if current_out_file is not None:
                    current_out_file.close()
                current_piece += 1
                current_out_path = os.path.join(
                    output_path,
                    output_name_template % current_piece
                )
                current_out_file = open(current_out_path, 'w', newline='')
                current_out_writer = csv.writer(current_out_file, delimiter=delimiter)
                if headers is not None:
                    current_out_writer.writerow(headers)
            current_out_writer.writerow(row)
    finally:
        # Close the last piece even if reading or writing fails.
        if current_out_file is not None:
            current_out_file.close()


with open('LARGE EXCEL/FILE.csv', 'r', newline='') as source:
    split(source)

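I want to change the above so that every date is unique to one file (all rows for a given day should appear in exactly one file). I am also open to a completely different solution.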

Tags: python, split

Solution
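
No answer was recorded for this question, so the following is only a minimal sketch of one possible approach, not a confirmed solution. It groups the rows by date and then packs whole dates into pieces greedily, so `row_limit` becomes a soft limit: a date is never split, and a single date with more rows than the limit yields one oversized piece. The names `group_rows_by_date`, `pack_dates`, `split_by_date`, and the `date_column` parameter are hypothetical. The sketch assumes the date column holds one plain date string per row (timestamps would need to be truncated to the day first), that the whole file fits in memory, and that the input is CSV rather than XLSX.

```python
import csv
import os


def group_rows_by_date(rows, date_index):
    """Group rows by the value in the date column, keeping first-seen order."""
    groups = {}
    for row in rows:
        groups.setdefault(row[date_index], []).append(row)
    return groups


def pack_dates(groups, row_limit):
    """Greedily pack whole date groups into pieces of about row_limit rows.

    A date is never split across pieces, so one date with more than
    row_limit rows produces a single oversized piece.
    """
    pieces, current, current_rows = [], [], 0
    for date, rows in groups.items():
        # Close the current piece if adding this date would exceed the limit.
        if current and current_rows + len(rows) > row_limit:
            pieces.append(current)
            current, current_rows = [], 0
        current.append(date)
        current_rows += len(rows)
    if current:
        pieces.append(current)
    return pieces


def split_by_date(in_path, out_dir, date_column, row_limit=70000,
                  name_template='output_%s.csv', delimiter=','):
    """Write one CSV per piece and return the list of paths written."""
    with open(in_path, 'r', newline='') as source:
        reader = csv.reader(source, delimiter=delimiter)
        headers = next(reader)
        date_index = headers.index(date_column)
        # Assumption: the whole file fits in memory.
        groups = group_rows_by_date(reader, date_index)

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for number, dates in enumerate(pack_dates(groups, row_limit), start=1):
        path = os.path.join(out_dir, name_template % number)
        with open(path, 'w', newline='') as out:
            writer = csv.writer(out, delimiter=delimiter)
            writer.writerow(headers)
            for date in dates:
                writer.writerows(groups[date])
        paths.append(path)
    return paths
```

A call might look like `split_by_date('LARGE EXCEL/FILE.csv', 'LARGE EXCEL/OUTPUT', date_column='Date')`, where `'Date'` is a placeholder for whatever the date column is actually called. Because rows are grouped before writing, the days do not need to be contiguous in the input; the trade-off is that rows within a piece are ordered by date group rather than by their original position.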

