How can we insert everything in a Pandas DataFrame into a table in SQL Server?

Question

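I have used SQL Server and Python for several years, and I have used Insert Into and df.iterrows, but I have never tried to push the entire contents of a DataFrame into a SQL Server table. I am now working with some larger datasets, and I would like an efficient way to move everything in a DataFrame into a table in SQL Server.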

I am testing this code.

# first I loop through a few files and append everything to a list
# this works fine

# convert the list to a data frame
df_append = DataFrame(df_append)
df_append.shape
type(df_append)

# log into DB
import pyodbc
driver= '{SQL Server Native Client 11.0}'

conn_str = (
    r'DRIVER={SQL Server};'
    r'SERVER=LAPTOP-CEDUMII6;'
    r'DATABASE=TestDB;'
    r'Trusted_Connection=yes;'
)
cnxn = pyodbc.connect(conn_str)

cursor = cnxn.cursor()
cursor.execute('SELECT * FROM FFIEC_CDR_Call_Schedule_RIBII')

for row in cursor:
    print('row = %r' % (row,))

# can log into the DB just fine...
# now I am trying to move the contents of the data frame to the table...

# Here is attempt #1...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')

# Error:
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')
Traceback (most recent call last):

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1681, in execute
    cur.execute(*args, **kwargs)

ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "<ipython-input-87-2d90babfc8a7>", line 1, in <module>
    df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2615, in to_sql
    method=method,

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 598, in to_sql
    method=method,

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1827, in to_sql
    table.create()

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 721, in create
    if self.exists():

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 708, in exists
    return self.pd_sql.has_table(self.name, self.schema)

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1838, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1693, in execute
    raise ex from exc

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")

# Here is attempt #2...same error...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', schema='dbo', con = cnxn)

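I researched this before posting, and it looks like it should be doable. Something in my code must be off, but I can't tell what. If anyone sees the mistake here, please let me know.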

Tags: python, python-3.x, dataframe, pyodbc

Answer


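pandas to_sql is definitely what you are looking for. Its documentation says the con parameter can be a

sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection

and that "Legacy support is provided for sqlite3.Connection objects." So to_sql looks at what you pass as con, and if it is not a SQLAlchemy connectable (Engine or Connection), to_sql assumes it is a sqlite3.Connection. You passed a pyodbc.Connection, which to_sql treated as a sqlite3.Connection, so it ran its SQLite table-existence query against SQL Server. The resulting error was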

[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'.
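
To see why that query only makes sense on SQLite, here is a minimal sketch that runs the exact existence check from the traceback against an in-memory SQLite database. The table name demo is hypothetical and chosen only for illustration. sqlite_master is SQLite's own catalog table, which SQL Server does not have, hence the error.

```python
import sqlite3

# The table-existence query copied from the DatabaseError in the question.
query = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"

# In-memory SQLite database with one hypothetical table named "demo".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER)")

# On SQLite the query works: it returns one row for an existing table
# and no rows for a table that does not exist.
found = conn.execute(query, ["demo"]).fetchall()
missing = conn.execute(query, ["no_such_table"]).fetchall()

print(len(found) > 0, len(missing) > 0)  # prints: True False
```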

The fix is to create a SQLAlchemy Engine object for your SQL Server database and pass that Engine object to to_sql instead of the pyodbc connection.

P.S. For SQL Server, remember to use fast_executemany=True, e.g.,

engine = create_engine(connection_uri, fast_executemany=True)
df.to_sql(table_name, engine, …)
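
The snippet above leaves connection_uri undefined. One possible way to build it, sketched here as an assumption and not necessarily what the answer's author intended, is to reuse the ODBC connection string from the question and pass it through SQLAlchemy's odbc_connect URL form. The helper names build_connection_uri and write_frame are hypothetical, and the driver name is copied from the question, so it may need to match whichever ODBC driver is actually installed.

```python
from urllib.parse import quote_plus


def build_connection_uri(odbc_conn_str):
    """Wrap a raw ODBC connection string in a SQLAlchemy mssql+pyodbc URL."""
    # The ODBC string must be URL-encoded before it goes into the query part.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_conn_str)


def write_frame(df, table_name, connection_uri):
    """Write a DataFrame to SQL Server through a SQLAlchemy Engine (hypothetical helper)."""
    # Imported here so the URI helper above runs without SQLAlchemy installed.
    from sqlalchemy import create_engine

    engine = create_engine(connection_uri, fast_executemany=True)
    # index=False and if_exists="replace" mirror attempt #1 in the question;
    # "replace" drops and recreates the table, "append" keeps existing rows.
    df.to_sql(table_name, engine, index=False, if_exists="replace")


# Same connection settings as in the question.
conn_str = (
    r"DRIVER={SQL Server};"
    r"SERVER=LAPTOP-CEDUMII6;"
    r"DATABASE=TestDB;"
    r"Trusted_Connection=yes;"
)
connection_uri = build_connection_uri(conn_str)
print(connection_uri)
```

With this in place, the question's call would become write_frame(df_append, 'FFIEC_CDR_Call_Schedule_RIBII', connection_uri). Running it requires SQLAlchemy, pyodbc, and a reachable SQL Server instance.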
