首页 > 解决方案 > 无法解决零星的拆分记录问题

问题描述

来自我的 CRM 的原始数据偶尔会在 csv 文件中将记录拆分为 2 行。

来自 CRM 的原始数据

来自 CRM 的原始数据

该代码以所需的格式准备数据以上传到我的会计系统中。

遗憾的是,我无法在导入时或导入后找到解决此问题所需的代码或命令。

任何帮助将不胜感激。

# importing csv module 
import csv

# csv file name
filename = "Sales Document Data-0.csv"

# initializing the titles and rows list
fields = []
rows = []

# reading csv file
with open(filename, 'r') as csvfile:
# creating a csv reader object
csvreader = csv.reader(csvfile)
fields= next(csvreader)
header= list(fields)

# inserting missing header fields
header.insert(1,'EmailAddress')
header.insert(2,'POAddressLine1')
header.insert(3,'POAddressLine2')
header.insert(4,'POAddressLine3')
header.insert(5,'POAddressLine4')
header.insert(6,'POCity')
header.insert(7,'PORegion')
header.insert(8,'POPostalCode')
header.insert(9,'POCounty')
header.insert(15,'InventoryItemCode')
header.insert(23,'TrackingName1')
header.insert(24,'TrackingOption1')
header.insert(25,'TrackingName2')
header.insert(26,'TrackingOption2')
header.insert(28,'InvoicesBrandingTheme')

# remove headers related to description building
header.pop(-1)
header.pop(-1)
header.pop(-1)
#print(header)

# create empty csv file
with open("InvUpload.csv", "w", newline='') as InvUpload:
    #pass
    writer = csv.writer(InvUpload, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(header)

    # extracting field names through first row
#fields = next(csvreader)
#print(fields)
# extracting each data row one by one
for row in csvreader:
    rows= list(row)
    rows.insert(1,'')
    rows.insert(2,'')
    rows.insert(3,'')
    rows.insert(4,'')
    rows.insert(5,'')
    rows.insert(6,'')
    rows.insert(7,'')
    rows.insert(8,'')
    rows.insert(9,'')
    rows.insert(15,'')
    rows.insert(23,'')
    rows.insert(24,'')
    rows.insert(25,'')
    rows.insert(26,'')
    rows.insert(28,'BPMT')
    # change description, if 'Standard daily rate'

    ts = rows[30]

    try:
        if rows.index('Standard daily rate'):
            rows[16] = (rows[31]+': '+ts.removesuffix('.pdf'))
        else:
            rows.split('\n')
    except:
        pass
    try:
        if rows.index('Base pay rate'):
            rows[16] = (rows[31]+': '+ts.removesuffix('.pdf'))
        else:
            rows.split('\n')
    except:
        pass
    try:
        if rows.index('Hours not applied'):
            rows[16] = (rows[31]+': '+ts.removesuffix('.pdf'))
        else:
            rows.split('\n')
    except:
        pass
    try:
        if rows.index('Unnamed'):
            rows[16] = (rows[31]+': '+ts.removesuffix('.pdf'))
        else:
            rows.split('\n')
    except:
        pass

    # remove data related to description building
    rows.pop(-1)
    rows.pop(-1)
    rows.pop(-1)
    # chnage TaxType to Xero format
    try:
        if rows.index('20.0%'):
            rows[21] = ('20% (VAT on Income)')
        else:
            rows[21] = ('No VAT')
    except:
        pass
    #print(rows)
    # append to previously created csv file
    if rows[14] == '0.00':
        pass
    else:
        with open("InvUpload.csv", "a", newline='') as InvUpload:
            writer = csv.writer(InvUpload, delimiter=',',
                        quotechar='|')
            writer.writerow(rows)
    # message to show file creation
print('File InvUpload.csv has been created')

标签: pythonpython-3.xpython-2.7

解决方案


尝试使用 pandas 来处理和构建一个 csv,这将使您在未来减少头痛。但是在这种情况下,您应该逐行阅读,如果您没有找到补充引号,您应该附加下一行。由于您正在打开要阅读的文件,因此很容易检查是否缺少报价。

抱歉,如果我没有为您提供太多帮助,但我不了解数据文件(原始数据)与您的代码之间的相关性。你能把一些代码拆分成函数吗?将更容易可视化您的代码


推荐阅读