How to copy CSV rows with extra delimiters to another file when loading into MySQL with Python

Problem description

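I'm working on a tricky piece of code. The client wants data loaded into MySQL, but without the rows that have extra delimiters or carriage-return problems. I was able to do the loading part with the following code: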

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import string
import csv
import mysql.connector
import re


# In[2]:


mydb=mysql.connector.connect(host="localhost",user="root",password="password",autocommit=True)
mycursor = mydb.cursor()


# In[3]:


sql_str=''
sql_str1=''
placeholder=''
field_name=[]
line=1
i_holder=0
file_name=input("\nFile name with extension:")
delimiter_=input('\nDelimiter used:')
textquali=input('\nText Qualifier used Press Enter if no qualifier:')


# In[4]:


mycursor.execute("drop table if exists  rd.pbc_gl_sigline_src")
#r"C:\Users\rcsid\Documents\Office Programs\Working prog\MOCK_DATA.csv"
#re.sub('[^a-zA-Z0-9]\n\.', '_', row)
reader = csv.DictReader(open(file_name,encoding='utf-8',errors='ignore'), delimiter=delimiter_,quotechar=textquali)
rowHeaders = reader.fieldnames
#print(rowHeaders)
for i in rowHeaders:
    field_name.append(re.sub('[^A-Za-z0-9]+', '_', i))
#print(field_name)
#print(f'''create table rd.data_with_header ( {" varchar(100), ".join(field_name)} varchar(100))''')
sql_str=f'''create table rd.pbc_gl_sigline_src ( {" varchar(250), ".join(field_name)} varchar(100))'''
mycursor.execute(sql_str)
placeholders=", ".join(['%s']*len(field_name))
sql_str1=f"""insert into rd.pbc_gl_sigline_src values ({placeholders})"""
for row in reader:
    if any(val is None for val in row.values()):
        print(row)
    line=line+1
    #print(row)
    print(line)
    mycursor.execute(sql_str1,tuple(row.values()))
print("Loaded to pbc_gl_sigline_src")

#if any(val in (None, "") for val in row.itervalues()):

# In[ ]:

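In the code above, Python loads the data and prints the rows with carriage-return problems (rows whose content spills onto the next line). The problem is that I can't print the rows with extra delimiters. The following error is raised: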

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\conversion.py", line 179, in to_mysql
    return getattr(self, "_{0}_to_mysql".format(type_name))(value)
AttributeError: 'MySQLConverter' object has no attribute '_list_to_mysql'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 425, in _process_params
    res = [to_mysql(i) for i in res]
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 425, in <listcomp>
    res = [to_mysql(i) for i in res]
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\conversion.py", line 182, in to_mysql
    "MySQL type".format(type_name))
TypeError: Python 'list' cannot be converted to a MySQL type

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "SingleLineLoader_WithHeader_v3_dictwrite_notfinal.py", line 61, in <module>
    mycursor.execute(sql_str1,tuple(row.values()))
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 547, in execute
    psub = _ParamSubstitutor(self._process_params(params))
  File "C:\ProgramData\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 430, in _process_params
    "Failed processing format-parameters; %s" % err)
mysql.connector.errors.ProgrammingError: Failed processing format-parameters; Python 'list' cannot be converted to a MySQL type
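
For context, here is a short sketch of why the traceback ends in "Python 'list' cannot be converted to a MySQL type". csv.DictReader stores any fields beyond the header as a list under the key None (its default restkey), and fills missing fields with None (its default restval). The three-column sample data below is made up for illustration.

```python
import csv
import io

# Hypothetical sample: data line 1 is clean, data line 2 has one extra
# delimiter, data line 3 is missing a field (e.g. cut short by a line break).
sample = "id,name,city\n1,Ann,Paris\n2,Bob,Rome,extra\n3,Cid\n"

rows = list(csv.DictReader(io.StringIO(sample), delimiter=","))

for row in rows:
    print(row)

# Extra fields are collected into a list under the key None (default restkey).
extra_fields = rows[1][None]

# Missing fields are filled with None (default restval).
missing_city = rows[2]["city"]
```

This would explain why the `val is None` check in the loop catches short rows but not rows with extra delimiters: the latter reach `execute` with a list among the values.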

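Without extra delimiters, the code works fine and prints the rows with carriage-return problems. My main task is to load the normal rows and copy the problematic rows to a new file where the client can review them. The client just wants to run the code on data they will receive from another client. So there are two problems here: first, excluding the problematic rows from the load, and second, copying them to another file. Is there a try/except approach or any other workaround? I'd appreciate your help.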

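Alternatively, if the problematic rows are not excluded during the load and get loaded anyway, but are also copied to a new file, that would work too. The client can then exclude those rows from the database.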

Tags: python, mysql, csv

Solution
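
One possible workaround, sketched under some assumptions rather than offered as a tested fix for the original script: read the file with csv.reader, compare each row's field count with the header's, keep matching rows for the insert, and write everything else to a separate file. The function name split_rows, the source_line column and the reject-file path are hypothetical. The sketch also assumes a one-character text qualifier and that the good rows fit in memory.

```python
import csv


def split_rows(source_path, reject_path, delimiter=",", quotechar='"'):
    """Return (header, good_rows) and write malformed rows to reject_path.

    A row counts as malformed when its field count differs from the header's.
    """
    good_rows = []
    with open(source_path, encoding="utf-8", errors="ignore", newline="") as src, \
         open(reject_path, "w", encoding="utf-8", newline="") as rej:
        reader = csv.reader(src, delimiter=delimiter, quotechar=quotechar)
        writer = csv.writer(rej, delimiter=delimiter, quotechar=quotechar)

        header = next(reader)
        # The first column of the reject file records where the row ended
        # in the source file, so the client can look it up.
        writer.writerow(["source_line"] + header)

        for row in reader:
            if not row:
                continue  # skip completely blank lines
            if len(row) == len(header):
                good_rows.append(tuple(row))
            else:
                # Too many fields (extra delimiter) or too few (broken line).
                writer.writerow([reader.line_num] + row)
    return header, good_rows
```

The returned rows could then be passed to the existing insert statement, for example `mycursor.executemany(sql_str1, good_rows)`. A row broken by a stray line break typically shows up as short fragments and lands in the reject file, although a fragment that happens to have exactly the header's field count would pass this check.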

