pandas fails to write to a Postgres db, throwing "KeyError: ("SELECT name FROM sqlite_master ..."

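Problem description

I built a package that lets users write data to either a sqlite or a Postgres db. I created one module for connecting to the database and a separate module that provides the write functionality. In the latter module, the write is a simple pandas function call: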

indata.to_sql('pay_' + table, con, if_exists='append', index=False)

Writing to a sqlite db (using a 'sqlite3' connection) succeeds, but when writing to a Postgres db I get the following error:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1778, in execute
    ps = cache['ps'][key]
KeyError: ("SELECT name FROM sqlite_master WHERE type='table' AND name=?;", ((705, 0, <function Connection.__init__.<locals>.text_out at 0x7fc3205fb510>),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
    self._c.execute(self, operation, args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
    self.handle_messages(cursor)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
    raise self.error
pg8000.core.ProgrammingError: {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1610, in execute
    raise_with_traceback(ex)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/compat/__init__.py", line 46, in raise_with_traceback
    raise exc.with_traceback(traceback)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
    self._c.execute(self, operation, args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
    self.handle_messages(cursor)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
    raise self.error
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}

I traced the error to the following file:

/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py

What seems to be happening is that the ".to_sql" function is configured to try to write to a database named "sqlite_master" at this point in the "sql.py" file:

    def has_table(self, name, schema=None):
        # TODO(wesm): unused?
        # escape = _get_valid_sqlite_name
        # esc_name = escape(name)

        wld = "?"
        query = (
            "SELECT name FROM sqlite_master " "WHERE type='table' AND name={wld};"
        ).format(wld=wld)

        return len(self.execute(query, [name]).fetchall()) > 0

Looking at the error more closely, you can see that the connection to the db is made correctly, but pandas is looking for a sqlite db.

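I know the database name is the one I used when I first started working with sqlite half a year ago, so I suspect a configuration value was set somewhere. So:

  1. Is my reasoning correct?
  2. If so, how do I change the configuration?
  3. If not, what might be happening?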

Tags: pandas, sqlite, pycharm, anaconda, pg8000

Solution


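According to the pandas.DataFrame.to_sql documentation:

con : sqlalchemy.engine.Engine or sqlite3.Connection

Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects.

This means that only SQLite allows a raw connection for the to_sql method. Every other RDBMS, including Postgres, must use a SQLAlchemy connection for this method to create structures and append data. When given any other raw connection, pandas falls back to its SQLite code path, which is why the traceback shows a query against sqlite_master, SQLite's internal catalog table, rather than a leftover configuration value. Note that read_sql does not require SQLAlchemy, since it makes no persistent changes.

Therefore, this raw DB-API connection will not work: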
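To make the legacy path concrete, here is a minimal sketch that runs the same catalog query quoted from sql.py against a stdlib sqlite3 connection. The standalone has_table function and the pay_demo table are stand-ins invented for illustration; they are not pandas' actual API. The query succeeds here only because sqlite_master exists in every SQLite database, while Postgres has no such relation and raises the 42P01 error shown in the traceback.

```python
import sqlite3

# In-memory SQLite database with one table (hypothetical name, for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pay_demo (id INTEGER, amount REAL)")


def has_table(con, name):
    """Standalone stand-in for the has_table method quoted in the question."""
    # sqlite_master is SQLite's built-in catalog table; "?" is sqlite3's placeholder.
    query = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"
    return len(con.execute(query, [name]).fetchall()) > 0


exists = has_table(con, "pay_demo")    # table was created above
missing = has_table(con, "pay_other")  # table was never created
con.close()
```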

import psycopg2
con = psycopg2.connect(host="localhost", port=5432, dbname="mydb", user="myuser", password="mypwd")

indata.to_sql('pay_' + table, con, if_exists='append', index=False)
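Since the failure only surfaces deep inside pandas, a package that accepts connections from users could reject unsupported ones up front. The following is only a sketch of such a guard: check_to_sql_connection and FakeDbApiConnection are hypothetical names, not part of pandas or SQLAlchemy. Depending on the pandas version, a database URL string may also be accepted by to_sql; this guard assumes the package passes connection objects only and would reject a string.

```python
import sqlite3

from sqlalchemy.engine import Connection, Engine


def check_to_sql_connection(con):
    """Hypothetical guard: allow only the connection types named in the to_sql docs."""
    if isinstance(con, (Engine, Connection, sqlite3.Connection)):
        return con
    raise TypeError(
        "to_sql needs a SQLAlchemy Engine/Connection or a sqlite3.Connection, "
        f"got {type(con).__name__}"
    )


class FakeDbApiConnection:
    """Stand-in for a raw pg8000 or psycopg2 connection (illustration only)."""


# A sqlite3 connection passes through unchanged.
sqlite_con = check_to_sql_connection(sqlite3.connect(":memory:"))

# A raw DB-API connection from another driver is rejected with a clear message.
try:
    check_to_sql_connection(FakeDbApiConnection())
    rejected = False
except TypeError:
    rejected = True
```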

However, this SQLAlchemy connection will work:

from sqlalchemy import create_engine    

engine = create_engine('postgresql+psycopg2://myuser:mypwd@localhost:5432/mydb')

indata.to_sql('pay_' + table, engine, if_exists='append', index=False)

Better still, use SQLAlchemy for both databases. Here it is for SQLite:

engine = create_engine("sqlite:///path/to/mydb.db")
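For a package that supports both backends, the connection module could hand out SQLAlchemy engines in every case, so the write module never sees a raw connection. This is one possible sketch, not the asker's actual code: make_engine, write_table, the column names, and the sample data are all assumptions made for illustration. The demo runs against an in-memory SQLite database; the Postgres branch needs psycopg2 installed and a reachable server, and a password containing special characters would have to be URL-escaped first.

```python
import pandas as pd
from sqlalchemy import create_engine


def make_engine(backend, database, user=None, password=None,
                host="localhost", port=5432):
    """Hypothetical helper: return a SQLAlchemy engine for 'sqlite' or 'postgres'."""
    if backend == "sqlite":
        # database is a file path; ":memory:" gives a throwaway in-memory db.
        return create_engine(f"sqlite:///{database}")
    if backend == "postgres":
        return create_engine(
            f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"
        )
    raise ValueError(f"unsupported backend: {backend!r}")


def write_table(indata, table, engine):
    """Append a DataFrame to the 'pay_'-prefixed table, as in the question."""
    indata.to_sql("pay_" + table, engine, if_exists="append", index=False)


# Demo with SQLite; the same write_table call would take a Postgres engine.
engine = make_engine("sqlite", ":memory:")
indata = pd.DataFrame({"id": [1, 2], "amount": [9.5, 20.0]})
write_table(indata, "demo", engine)
result = pd.read_sql("SELECT * FROM pay_demo", engine)
```

With this layout, switching a user from SQLite to Postgres only changes the arguments passed to make_engine; the to_sql call stays the same.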
