How to export data into multiple CSV files based on date using Python

Question

I wrote the following code to export data from a PostgreSQL database into a CSV file. However, I want to create multiple files, one per date.

import psycopg2
import csv

conn_string = "host='' port='5432' user='' password='' dbname=''"

conn = psycopg2.connect(conn_string)

cur=conn.cursor()

query="select * from sample where date between '' and ''"

cur.execute(query)

title=[i[0] for i in cur.description]

result=cur.fetchall()

csvfile=open('filename.csv','w')

if result:
    c = csv.writer(csvfile)
    c.writerow(title)
    c.writerows(result)


cur.close()
conn.close()

The files should be split and named as follows:

01jan.csv
02jan.csv 
etc.

Tags: python, postgresql, psycopg2

Solution


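You can iterate over the query results and open a new file whenever the row's date changes. The results must be ordered by date. Otherwise a date that reappears later reopens its file in 'w' mode, which truncates it and loses the rows written earlier.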

import psycopg2
import psycopg2.extras
import csv
import datetime

# conn_string = ...
conn = psycopg2.connect(conn_string)

# we need results in dict
cur = conn.cursor(cursor_factory = psycopg2.extras.DictCursor)

# order by date - important!
query = "select * from sample where date between '2018-01-01' and '2018-01-10' order by date"
cur.execute(query)
title = [i[0] for i in cur.description]

date = None
writer = None
csvfile = None

for row in cur:
    if date != row['date']:
        # when date changes we should close current file (if opened)
        # and open a new one with name based on date
        if csvfile:
            csvfile.close()
        date = row['date']
        # lower() gives names like 01jan.csv, as requested in the question
        filename = date.strftime("%d%b").lower() + '.csv'
        csvfile = open(filename, 'w', newline='')
        writer = csv.writer(csvfile)
        writer.writerow(title)
    writer.writerow(row)

# close the last file, which the loop above leaves open
if csvfile:
    csvfile.close()

cur.close()
conn.close()
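
The date-change bookkeeping above can also be expressed with itertools.groupby, which groups consecutive rows that share a date. The following is only a minimal sketch, not part of the original answer: write_rows_by_date, out_dir and date_index are hypothetical names, and the rows are assumed to be already sorted by date (for example by the ORDER BY in the query).

```python
import csv
import datetime
import itertools
import os


def write_rows_by_date(rows, title, out_dir, date_index=0):
    """Write one CSV file per distinct date and return the file names.

    rows       -- iterable of row sequences, already sorted by date
                  (for example a psycopg2 cursor)
    title      -- list of column names for the header row
    date_index -- position of the date column (an assumption of this sketch)
    """
    written = []
    # groupby only merges *consecutive* equal keys, hence the sorting requirement
    for date, group in itertools.groupby(rows, key=lambda r: r[date_index]):
        filename = date.strftime("%d%b").lower() + ".csv"
        # the with-block closes every file, including the last one
        with open(os.path.join(out_dir, filename), "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(title)
            writer.writerows(group)
        written.append(filename)
    return written
```

With the cursor from the code above, this could be called as write_rows_by_date(cur, title, '.', date_index=title.index('date')).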

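The solution above is acceptable for fairly small datasets. If a single day holds a large amount of data, it is better to use copy_expert(), which lets the server stream each day's rows directly into the file: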

cur = conn.cursor()

# example loop for the first ten days of Jan 2018
# (range's upper bound is exclusive, so 11 is needed to include day 10)
for day in range(1, 11):
    date = datetime.date(2018, 1, day)
    filename = date.strftime("%d%b").lower() + '.csv'
    command = 'copy (select * from sample where date = %s) to stdout with csv header'
    sql = cur.mogrify(command, [date])
    with open(filename, 'w') as file:
        cur.copy_expert(sql, file)
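
The loop above builds dates from day numbers, so it only works inside a single month. As a hedged sketch of one way to walk an arbitrary date range, the dates can be stepped with datetime.timedelta instead. The helper names iter_days and daily_filename are hypothetical and are not part of psycopg2 or the original answer.

```python
import datetime


def iter_days(start, end):
    """Yield every date from start to end, both inclusive."""
    day = start
    while day <= end:
        yield day
        day += datetime.timedelta(days=1)


def daily_filename(date):
    """Build a name such as 01jan.csv from a date."""
    return date.strftime("%d%b").lower() + ".csv"
```

Each yielded date can then be passed to cur.mogrify(command, [date]) and copy_expert() exactly as in the loop above, with daily_filename(date) as the target file name.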
