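python - Importing a single .csv into MySQL using Python
Problem description
When I run this code, I get: Error while connecting to MySQL Not all parameters were used in the SQL statement
I have also tried ingesting the data with another technique (shown further down).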
import mysql.connector as msql
from mysql.connector import Error
import pandas as pd
empdata = pd.read_csv('path_to_file', index_col=False, delimiter = ',')
empdata.head()
try:
    conn = msql.connect(host='localhost', user='test345',
                        password='test123')
    if conn.is_connected():
        cursor = conn.cursor()
        cursor.execute("CREATE DATABASE timetheft")
        print("Database is created")
except Error as e:
    print("Error while connecting to MySQL", e)
try:
    conn = msql.connect(host='localhost', database='timetheft', user='test345', password='test123')
    if conn.is_connected():
        cursor = conn.cursor()
        cursor.execute("select database();")
        record = cursor.fetchone()
        print("You're connected to database: ", record)
        cursor.execute('DROP TABLE IF EXISTS company;')
        print('Creating table....')
        create_company_table = """
        CREATE TABLE company ( ID VARCHAR(40) PRIMARY KEY,
        Company_Name VARCHAR(40),
        Country VARCHAR(40),
        City VARCHAR(40),
        Email VARCHAR(40),
        Industry VARCHAR(30),
        Employees VARCHAR(30)
        );
        """
        cursor.execute(create_company_table)
        print("Table is created....")
        for i,row in empdata.iterrows():
            sql = "INSERT INTO timetheft.company VALUES (%S, %S, %S, %S, %S,%S,%S,%S)"
            cursor.execute(sql, tuple(row))
            print("Record inserted")
            # the connection is not auto committed by default, so we must commit to save our changes
            conn.commit()
except Error as e:
    print("Error while connecting to MySQL", e)
The second technique I tried:
LOAD DATA LOCAL INFILE 'path_to_file'
INTO TABLE company
FIELDS TERMINATED BY ';'
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
This worked better, but produced many errors. Only 20% of the rows were ingested.
Finally, here is an excerpt of the .csv (the data is consistent across all 1K rows):
"ID";"Company_Name";"Country";"City";"Email";"Industry";"Employees"
217520699;"Enim Corp.";"Germany";"Bamberg";"posuere@diamvel.edu";"Internet";"51-100"
352428999;"Lacus Vestibulum Consulting";"Germany";"Villingen-Schwenningen";"egestas@lacusEtiambibendum.org";"Food Production";"100-500"
371718299;"Dictum Ultricies Ltd";"Germany";"Anklam";"convallis.erat@sempercursus.co.uk";"Primary/Secondary Education";"100-500"
676789799;"A Consulting";"Germany";"Andernach";"massa@etrisusQuisque.ca";"Government Relations";"100-500"
718526699;"Odio LLP";"Germany";"Eisenhüttenstadt";"Quisque.varius@euismod.org";"E-Learning";"11-50"
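Solution
I fixed these issues to get the code working:
- Make the number of placeholders in the INSERT statement equal to the number of columns (seven, not eight).
- Placeholders should be lowercase: '%s', not '%S'.
- The cell delimiter appears to be a semicolon, not a comma.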
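The three fixes above can be checked without a database. The following is a minimal sketch, assuming a hypothetical helper `build_insert` (not part of mysql.connector) and a two-line sample copied from the excerpt in the question. It derives the placeholder count from the header and shows what happens when the same text is split on commas instead of semicolons.

```python
import csv
import io

# Header plus one data row, copied from the excerpt in the question.
SAMPLE = '''"ID";"Company_Name";"Country";"City";"Email";"Industry";"Employees"
217520699;"Enim Corp.";"Germany";"Bamberg";"posuere@diamvel.edu";"Internet";"51-100"
'''


def build_insert(table, n_columns):
    """Hypothetical helper: one lowercase %s placeholder per column.

    The table name is interpolated directly, so it must come from
    trusted code, never from user input.
    """
    placeholders = ", ".join(["%s"] * n_columns)
    return f"INSERT INTO {table} VALUES ({placeholders})"


# Split on semicolons, as the file actually requires.
reader = csv.reader(io.StringIO(SAMPLE), delimiter=";")
header = next(reader)
rows = list(reader)
sql = build_insert("company", len(header))

# Splitting the same text on commas leaves each line as a single field.
comma_header = next(csv.reader(io.StringIO(SAMPLE), delimiter=","))

print(sql)
print(len(header), len(comma_header))
```

With the semicolon delimiter each row has seven fields, which matches the seven columns of the company table and the seven placeholders in the generated statement.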
Pandas is overkill for simply reading a csv of about 1000 rows (and iterrows does not seem to behave as you expect). I used the csv module from the standard library instead.
import csv
...
sql = "INSERT INTO company VALUES (%s, %s, %s, %s, %s, %s, %s)"
with open("67359903.csv", "r", newline="") as f:
    reader = csv.reader(f, delimiter=";")
    # Skip the header row.
    next(reader)
    # For large files it may be more efficient to commit
    # rows in batches.
    cursor.executemany(sql, reader)
    conn.commit()
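The comment about committing in batches could be implemented roughly as follows. This is only a sketch: `insert_in_batches` is a hypothetical helper, the batch size of 500 is an arbitrary choice, and `cursor` and `conn` are assumed to be DB-API objects such as those returned by mysql.connector.

```python
from itertools import islice


def insert_in_batches(cursor, conn, sql, rows, batch_size=500):
    """Insert rows in chunks of batch_size, committing after each chunk.

    Returns the total number of rows passed to executemany.
    """
    rows = iter(rows)
    total = 0
    while True:
        # Pull at most batch_size rows from the iterator.
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        cursor.executemany(sql, batch)
        conn.commit()
        total += len(batch)
    return total
```

With the csv example above, the call would be `insert_in_batches(cursor, conn, sql, reader)` after the header row has been skipped with `next(reader)`. For a file of about 1000 rows a single `executemany` followed by one commit is likely sufficient.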
If using the csv module is inconvenient, the dataframe's itertuples method can be used to iterate over the data:
empdata = pd.read_csv('67359903.csv', index_col=False, delimiter=';')
for tuple_ in empdata.itertuples(index=False):
    cursor.execute(sql, tuple_)
conn.commit()
Alternatively, the dataframe can be dumped directly into the database:
import sqlalchemy as sa
engine = sa.create_engine('mysql+mysqlconnector:///test')
empdata.to_sql('company', engine, index=False, if_exists='replace')