python - How to update PostgreSQL with a CSV from a modified pandas DataFrame?
Problem description
The following code works:
import pandas as pd
import csv
import psycopg2
df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_ID','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df.to_csv(r'C:\Users\Harshal\Desktop\tar.csv',index=False)
conn = psycopg2.connect(host='1.11.11.111',
dbname='postgres',
user='postgres',
password='myPassword',
port='1234')
cur = conn.cursor()
f = open(r'C:\Users\Harshal\Desktop\tar.csv', 'r')
cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f)
conn.commit()
conn.close()
f.close()
But rather than saving the modified DataFrame to disk, I would like to upload it directly into the PostgreSQL table. I tried
cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", df)
but it raises an error. Note: the
cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f)
call itself cannot be avoided, because I only save the csv under certain conditions. My table structure:
create table public.geotargets_india(
Criteria_ID integer not null,
Name character varying(50) COLLATE pg_catalog."default" NOT NULL,
Canonical_Name character varying(100) COLLATE pg_catalog."default" NOT NULL,
Parent_ID NUMERIC(10,2),
Country_Code character varying(10) COLLATE pg_catalog."default" NOT NULL,
Target_Type character varying(50) COLLATE pg_catalog."default" NOT NULL,
Status character varying(50) COLLATE pg_catalog."default" NOT NULL
)
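Passing the DataFrame itself to copy_expert fails because psycopg2 expects a file-like object to read from, not a DataFrame. A minimal sketch of one way to skip the intermediate file: serialize the DataFrame into an in-memory io.StringIO buffer and hand that to COPY. The helper names (df_to_csv_buffer, copy_df_to_table) are illustrative, not part of any library:

```python
import io

import pandas as pd


def df_to_csv_buffer(df: pd.DataFrame) -> io.StringIO:
    """Serialize a DataFrame as CSV into an in-memory buffer (no file on disk)."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    buf.seek(0)  # rewind so COPY reads from the first byte
    return buf


def copy_df_to_table(df: pd.DataFrame, conn, table: str) -> None:
    """Stream the DataFrame into an existing table via COPY ... FROM STDIN."""
    with conn.cursor() as cur:
        cur.copy_expert(
            f"""copy {table} from stdin with (format csv, header, delimiter ',', quote '"')""",
            df_to_csv_buffer(df),
        )
    conn.commit()


# Usage with the connection from the question:
# conn = psycopg2.connect(host='1.11.11.111', dbname='postgres',
#                         user='postgres', password='myPassword', port='1234')
# copy_df_to_table(df, conn, 'geotargets_india')
```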
EDIT: I tried
import pandas as pd
import csv
import psycopg2
from sqlalchemy import create_engine
df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://postgres:myPassword@1.11.11.111:1234/postgres')
df.to_sql(
    'geotargets_india',
    con=engine,
    schema=None,
    if_exists='append',
    index=False
)
but it raised an error: UndefinedColumn: column "Criteria_Id" of relation "geotargets_india" does not exist LINE 1: INSERT INTO geotargets_india ("Criteria_Id", "Name", "Canoni...
EDIT 2: The above code works if I drop the table and recreate it with the following script:
CREATE TABLE public.geotargets_india
(
"Criteria_Id" bigint,
"Name" text COLLATE pg_catalog."default",
"Canonical_Name" text COLLATE pg_catalog."default",
"Parent_ID" double precision,
"Country_Code" text COLLATE pg_catalog."default",
"Target_Type" text COLLATE pg_catalog."default",
"Status" text COLLATE pg_catalog."default"
)
Why doesn't it work with the predefined table schema?
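The likely cause is PostgreSQL's identifier case folding: in the first CREATE TABLE the column names are unquoted, so PostgreSQL folds them to lowercase (criteria_id), while to_sql emits double-quoted, case-sensitive names ("Criteria_Id") that match no such column. One way to keep the predefined schema is to lowercase the DataFrame's columns before calling to_sql; a sketch (not the only possible fix):

```python
import pandas as pd

df = pd.DataFrame({'Criteria_Id': [1001], 'Country_Code': ['IN']})

# Unquoted identifiers in CREATE TABLE are folded to lowercase by PostgreSQL,
# but to_sql double-quotes column names verbatim, so "Criteria_Id" never
# matches the folded column criteria_id. Lowercasing the columns aligns them.
df.columns = [c.lower() for c in df.columns]

# df.to_sql('geotargets_india', con=engine, if_exists='append', index=False)
```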
Solution
I tried your code and corrected a few lines; the following works for me:
import pandas as pd
from sqlalchemy import create_engine
df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv', delimiter=',')
print(df)
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://collaborateur1:nG@e3P@tapp581lv:2345/base_project')
df.to_sql('geotargets_india',con = engine,schema=None,if_exists='append',index=False)
I added the delimiter ',' and corrected the 'Canonical Name' values (replacing the commas inside them).
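One side note on the connection string above: the password nG@e3P contains an '@', which a hand-assembled user:password@host URL can mis-parse. A small sketch, using only the standard library, that percent-encodes the password before building the DSN:

```python
from urllib.parse import quote_plus

password = 'nG@e3P'  # the '@' would otherwise look like the user/host separator
dsn = (
    'postgresql+psycopg2://collaborateur1:'
    + quote_plus(password)
    + '@tapp581lv:2345/base_project'
)
print(dsn)  # the password's '@' is now encoded as %40
# engine = create_engine(dsn)
```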