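sql - Why does importing a CSV into SQL arbitrarily reorder the CSV data?
Problem description
When I try to import a CSV file into my SQL Server table, the row data is reordered automatically. Why?
The schema of my SQL table is: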
CREATE TABLE ZCPTble(
    EmployeeID INT,
    EmployeeName NVARCHAR(255),
    ProjectID NVARCHAR(255),
    WBSElement NVARCHAR(255),
    AllocationPercent FLOAT,
    ProjSBUName NVARCHAR(255),
    ProjectPractice NVARCHAR(255),
    ProjectProfile NVARCHAR(255),
    ProjectManagerID INT,
    ProjectManager NVARCHAR(255),
    DeliveryManagerID INT,
    DeliveryManager NVARCHAR(255),
    SupervisorID INT,
    Supervisor NVARCHAR(255),
    EmployeeStatus NVARCHAR(255),
    BusinessLead NVARCHAR(255),
    BusinessLeadName NVARCHAR(255)
)
import glob
import os
import csv
import pyodbc
import time
from datetime import datetime
import pandas as pd
import datetime as dt
import urllib.parse
from sqlalchemy import create_engine
def trim(dataset):
    # Strip leading and trailing whitespace from every string cell
    strip_if_str = lambda x: x.strip() if isinstance(x, str) else x
    return dataset.applymap(strip_if_str)
conn1 = pyodbc.connect(r'Driver={SQL Server};'
                       r'Server=XXX\SQLSERVER2017;'
                       r'Database=ABC;'
                       r'Trusted_Connection=yes;')
cur1 = conn1.cursor()
start = time.time()
# Names of all columns
list_of_column_names = {
    'EmployeeID',
    'EmployeeName',
    'ProjectID',
    'WBSElement',
    'AllocationPercent',
    'ProjSBUName',
    'ProjectPractice',
    'ProjectProfile',
    'ProjectManagerID',
    'ProjectManager',
    'DeliveryManagerID',
    'DeliveryManager',
    'SupervisorID',
    'Supervisor',
    'EmployeeStatus',
    'BusinessLead',
    'BusinessLeadName'
}
print(len(list_of_column_names))
df = pd.DataFrame()
# Import the newest CSV file
list_of_files = glob.glob(
    'C://Users/ABX/SDD/ASY/Sample/*')  # * matches every file; use *.csv to match only CSV files
latest_file = max(list_of_files, key=os.path.getctime)
for files in os.listdir("C://Users/ABX/SDD/ASY/Sample/"):
    print(files)
    with open(latest_file, "r") as file:
        df = pd.read_csv(file)
    # Drop the 'Unnamed: N' columns that pandas creates for empty headers
    df = df[df.columns.drop(list(df.filter(regex='Unnamed: ')))]
    # Rename all columns
    df.columns = list_of_column_names
    # Convert the numeric columns to a numeric type
    try:
        for col in ['EmployeeID', 'AllocationPercent', 'SupervisorID', 'ProjectManagerID', 'DeliveryManagerID']:
            df[col] = df[col].apply(pd.to_numeric)
    except KeyError:
        continue
    # Write to SQL Server
    params = urllib.parse.quote_plus(
        r'DRIVER={SQL Server};SERVER=XXX1\SQLSERVER2017;DATABASE=ABC;Trusted_Connection=yes')
    conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
    engine = create_engine(conn_str)
    df.to_sql(name='ProjectAllocationDetails', con=engine, if_exists='append', index=False)
    print("The data was inserted successfully.")
    end = time.time()
    # Release the memory held by the DataFrame
    del df
    print(end - start)
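The data ends up in arbitrary columns. For example, the Employee ID column of the SQL table contains employee names, and the Date column receives names instead. Why does this happen?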
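Solution
Update: the problem is solved. It was a small mistake in the code. The variable list_of_column_names should be a list object, and the code used the wrong brackets. Curly braces create a set, which has no defined order, so the names were assigned to the DataFrame columns in an arbitrary order. Square brackets create a list, which keeps the names in the order they were written.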
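The difference between the two bracket styles can be sketched with plain Python, without pandas or a database. This is a minimal illustration rather than the asker's code: the column list is shortened, the sample row is made up, and `label_row` is a hypothetical helper that refuses unordered containers so the mistake fails loudly instead of mislabelling data.

```python
# Column names in the order the CSV columns appear (shortened for illustration).
names_as_list = ["EmployeeID", "EmployeeName", "ProjectID", "WBSElement"]

# Curly braces build a set: same members, but no defined order.
names_as_set = {"EmployeeID", "EmployeeName", "ProjectID", "WBSElement"}


def label_row(values, column_names):
    """Pair each value with a column name by position.

    Hypothetical helper: it only accepts ordered sequences, so passing a
    set raises an error instead of silently mislabelling the data.
    """
    if not isinstance(column_names, (list, tuple)):
        raise TypeError("column_names must be an ordered list or tuple")
    if len(values) != len(column_names):
        raise ValueError("number of values and column names differ")
    return dict(zip(column_names, values))


row = [101, "Alice", "P-7", "WBS-1"]  # made-up sample row
labelled = label_row(row, names_as_list)
print(labelled)

# Iterating the set may yield the names in any order, and for strings
# that order can differ from one interpreter run to the next.
print(list(names_as_set))
```

In the question's script the equivalent change is to write list_of_column_names with square brackets, so that df.columns receives the names in the same order as the CSV columns.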