python - How to use pandas.to_sql but only add row if row doesn't exist yet
Problem description
I have some experience with Python but am very new to SQL. I am trying to use pandas.to_sql to add table data to my database, but I want it to check whether the data already exists before appending.
These are my two DataFrames:
>>> df0.to_markdown()
| | Col1 | Col2 |
|---:|-------:|-------:|
| 0 | 0 | 00 |
| 1 | 1 | 11 |
>>> df1.to_markdown()
| | Col1 | Col2 |
|---:|-------:|-------:|
| 0 | 0 | 00 |
| 1 | 1 | 11 |
| 2 | 2 | 22 |
Here I write both DataFrames to the database with pandas to_sql:
>>> df0.to_sql(con=con, name='test_db', if_exists='append', index=False)
>>> df1.to_sql(con=con, name='test_db', if_exists='append', index=False)
Here I check my data inside the database file
>>> df_out = pd.read_sql("""SELECT * FROM test_db""", con)
>>> df_out.to_markdown()
| | Col1 | Col2 |
|---:|-------:|-------:|
| 0 | 0 | 0 |
| 1 | 1 | 11 |
| 2 | 0 | 0 | # Duplicate
| 3 | 1 | 11 | # Duplicate
| 4 | 2 | 22 |
But I want my database to look like this, without the duplicate rows being added:
| | Col1 | Col2 |
|---:|-------:|-------:|
| 0 | 0 | 0 |
| 1 | 1 | 11 |
| 2 | 2 | 22 |
Is there an option I can set, or a line of code I can add, to make this happen?
Thank you!
Edit: There is SQL code that pulls only unique data when reading, but what I want is to not add the duplicate data to the database in the first place.
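Solution
Don't use to_sql; a plain INSERT query works. The ON CONFLICT ON CONSTRAINT ... DO NOTHING clause is PostgreSQL syntax and relies on the table having a primary key or unique constraint (here named test_db_pkey), so rows that already exist are skipped.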
from sqlalchemy import text
query = text(f"""INSERT INTO test_db VALUES {','.join(str(row) for row in df0.to_records(index=False))} ON CONFLICT ON CONSTRAINT test_db_pkey DO NOTHING""")
with self.engine.begin() as conn:  # begin() commits when the block exits without error
    conn.execute(query)
Run the same query for each DataFrame, replacing df0 with df1.
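As a rough illustration of the same idea, here is a minimal sketch that runs without a PostgreSQL server. It swaps in SQLite's `INSERT OR IGNORE`, which plays a similar role to `ON CONFLICT ... DO NOTHING`, and uses bound parameters instead of formatting values into the SQL string. The helper name `insert_new_rows` and the composite primary key on both columns are assumptions made for this example; they are not part of the original question or answer.

```python
import sqlite3


def insert_new_rows(con, table, rows):
    """Insert rows, skipping any that would violate a PRIMARY KEY/UNIQUE constraint.

    Hypothetical helper for illustration. `table` is placed directly into the
    SQL text, so it must come from trusted code, never from user input.
    Returns the number of rows actually inserted.
    """
    rows = list(rows)
    if not rows:
        return 0
    placeholders = ", ".join("?" for _ in rows[0])
    sql = f"INSERT OR IGNORE INTO {table} VALUES ({placeholders})"
    before = con.total_changes
    with con:  # commits on success, rolls back on error
        con.executemany(sql, rows)
    return con.total_changes - before


con = sqlite3.connect(":memory:")
# Assumption: a row counts as a duplicate when both columns match.
con.execute(
    "CREATE TABLE test_db (Col1 INTEGER, Col2 INTEGER, PRIMARY KEY (Col1, Col2))"
)

rows0 = [(0, 0), (1, 11)]            # same values as df0
rows1 = [(0, 0), (1, 11), (2, 22)]   # same values as df1

added0 = insert_new_rows(con, "test_db", rows0)
added1 = insert_new_rows(con, "test_db", rows1)
stored = con.execute("SELECT Col1, Col2 FROM test_db ORDER BY Col1").fetchall()
print(added0, added1, stored)
```

With real DataFrames, the rows could be produced with `df0.itertuples(index=False, name=None)`, which yields plain tuples. Note that the constraint is what makes the duplicate check work: a table created by to_sql has no primary key, so it would have to be created (or altered) with one first.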