psycopg2 - Error when bulk loading from CSV into PostGres
Problem description
import pandas
import pygrametl
import psycopg2
from pygrametl.tables import SlowlyChangingDimension, CachedDimension, BulkDimension
from pygrametl.datasources import CSVSource
## Connection to PostGres
connection = psycopg2.connect(host="localhost", database="postgres", user="postgres",
                              password="tekihcan")
connect = pygrametl.ConnectionWrapper(connection)
def pgcopybulkloader(name, atts, fieldsep, rowsep, nullval, filehandle):
    # Here we use driver-specific code to get fast bulk loading.
    # You can change this method if you use another driver or you can
    # use the FactTable or BatchFactTable classes (which don't require
    # use of driver-specific code) instead of the BulkFactTable class.
    global connection
    curs = connect.cursor()
    try:
        curs.copy_from(file=filehandle, table=name, sep=fieldsep,
                       columns=atts, null='null')
    except(Exception, psycopg2.Database) as error:
        print("Error %s" % error)
date_dim = BulkDimension(name='date_dim', key='d_date_sk', attributes=[
    'd_date_id (B)'
    ,'d_date'
    ,'d_month_seq'
    ,'d_week_seq'
    ,'d_quarter_seq'
    ,'d_year'
    ,'d_dow'
    ,'d_moy'
    ,'d_dom'
    ,'d_qoy'
    ,'d_fy_year'
    ,'d_fy_quarter_seq'
    ,'d_fy_week_seq'
    ,'d_day_name'
    ,'d_quarter_name'
    ,'d_holiday'
    ,'d_weekend'
    ,'d_following_holiday'
    ,'d_first_dom'
    ,'d_last_dom'
    ,'d_same_day_ly'
    ,'d_same_day_lq'
    ,'d_current_day'
    ,'d_current_week'
    ,'d_current_month'
    ,'d_current_quarter'
    ,'d_current_year'
    ], lookupatts=['d_date_id (B)'],
    bulkloader=pgcopybulkloader)
date_dim_source = CSVSource(open('C:/Users/HP\Documents/v2.13.0rc1/data/date_dim.csv',
                                 'r', 16384), delimiter='|')
def main():
    for row in date_dim_source:
        date_dim.insert(row)
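The code fails with an error.
As far as I understand, the error is caused by the target table being empty. The CSV source also has no header row. Does that affect the code? The documentation I used to develop the code is at https://chrthomsen.github.io/pygrametl/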
Solution
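On the header question specifically: pygrametl's CSVSource is, as far as I know, a reference to Python's csv.DictReader, which treats the first line of the file as the column names when no fieldnames argument is given. The sketch below shows that behavior in isolation. The two sample rows and the shortened three-column list are made up for illustration and are not taken from the real date_dim.csv.

```python
import csv
import io

# Hypothetical two-row sample in a pipe-delimited layout with no header row.
sample = "1|AAAAAAAABAAAAAAA|1900-01-02\n2|AAAAAAAACAAAAAAA|1900-01-03\n"

# Without fieldnames, DictReader consumes the first data row as the header,
# so that row is lost and the dict keys are data values.
no_names = list(csv.DictReader(io.StringIO(sample), delimiter="|"))

# With explicit fieldnames, every line comes back as a data row.
columns = ["d_date_sk", "d_date_id", "d_date"]  # assumed, shortened column list
with_names = list(
    csv.DictReader(io.StringIO(sample), fieldnames=columns, delimiter="|")
)

print(len(no_names), sorted(no_names[0].keys()))
print(len(with_names), with_names[0]["d_date_sk"])
```

If CSVSource does forward its arguments to csv.DictReader, passing fieldnames with the full column list in file order would keep the first row as data and give each row the keys that the dimension's attributes expect. Whether the missing header is what triggers the reported error cannot be told from the question alone.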