Python script to convert CSV to XML

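Problem description

Please help me correct this Python script so that it produces the required output.

I wrote the code below to convert CSV to XML. The input file has columns 1 through 278. The output file needs tags A1 through A278.

Code: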

#!/usr/bin/python
import sys
import os
import csv
if len(sys.argv) != 2:
    os._exit(1)
path=sys.argv[1] # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.endswith('.csv') or f.endswith('.CSV')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    csvData = csv.reader(open(csvFile))
    xmlData = open(xmlFile, 'w')
    xmlData.write('<?xml version="1.0"?>' + "\n")
    # there must be only one top-level tag
    xmlData.write('<TariffRecords>' + "\n")
    rowNum = 0
    for row in csvData:
        if rowNum == 0:
            tags = Tariff
            # replace spaces w/ underscores in tag names
            for i in range(len(tags)):
                tags[i] = tags[i].replace(' ', '_')
        else:
            xmlData.write('<Tariff>' + "\n")
            for i in range(len(tags)):
                xmlData.write('    ' + '<' + tags[i] + '>' \
                              + row[i] + '</' + tags[i] + '>' + "\n")
            xmlData.write('</Tariff>' + "\n")
        rowNum +=1
    xmlData.write('</TariffRecords>' + "\n")
    xmlData.close()

The script gives the following error:

Traceback (most recent call last):
  File "ctox.py", line 20, in ?
    tags = Tariff
NameError: name 'Tariff' is not defined

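Sample input file. (This is a sample of the records in the actual input file, which will contain 278 columns.) If the input file has two or three records, they all need to be appended to the same XML file.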

name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,

Sample output file. The two tags TariffRecords and Tariff will be hard-coded at the beginning and end of the XML file.

<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6></A6>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6></A6>
</Tariff>
</TariffRecords>

Tags: python

Solution


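First, you need to replace

tags = Tariff

with

tags = row

Tariff is not a defined name, which is what raises the NameError. On the first pass through the loop, row holds the header row.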

Second, change the write line so that it writes A1, A2, and so on instead of the header names.

Full code:
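A minimal sketch of that second change in isolation, assuming a hypothetical helper named row_to_tariff (not part of the original script). The tag for each field is built from its 1-based position rather than from the header text, and the value is passed through escape so that characters such as & or < do not break the XML:

```python
from xml.sax.saxutils import escape

def row_to_tariff(row):
    """Render one CSV row as a <Tariff> element with positional tags A1, A2, ..."""
    lines = ['<Tariff>']
    for position, value in enumerate(row, start=1):
        tag = 'A%d' % position  # A1, A2, ... regardless of the header names
        lines.append('    <%s>%s</%s>' % (tag, escape(value), tag))
    lines.append('</Tariff>')
    return '\n'.join(lines)

print(row_to_tariff(['Prepaid Plan Voucher', 'test_All calls 2p/s',
                     'TT07PMPV0188', 'Ta Te', 'Gu', '']))
```

Because the tag depends only on the position, the same loop covers all 278 columns without listing them.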

import sys
import os
import csv
from xml.sax.saxutils import escape

if len(sys.argv) != 2:
    sys.exit(1)
path = sys.argv[1]  # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.lower().endswith('.csv')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    # "with" closes both files even if a row fails to convert
    with open(csvFile) as csvInput, open(xmlFile, 'w') as xmlData:
        csvData = csv.reader(csvInput)
        xmlData.write('<?xml version="1.0"?>\n')
        # there must be only one top-level tag
        xmlData.write('<TariffRecords>\n')
        for rowNum, row in enumerate(csvData):
            if rowNum == 0:
                # the header row only tells us how many columns to expect
                numColumns = len(row)
                continue
            xmlData.write('<Tariff>\n')
            for i in range(numColumns):
                # tags are A1, A2, ...; escape &, < and > in the values
                xmlData.write('    <A%d>%s</A%d>\n' % (i + 1, escape(row[i]), i + 1))
            xmlData.write('</Tariff>\n')
        xmlData.write('</TariffRecords>\n')

Output:

<?xml version="1.0"?>
<TariffRecords>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 2p/s</A2>
    <A3>TT07PMPV0188</A3>
    <A4>Ta Te</A4>
    <A5>Gu</A5>
    <A6></A6>
</Tariff>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 3p/s</A2>
    <A3>TT07PMPV0189</A3>
    <A4>Ta Te</A4>
    <A5>HR</A5>
    <A6></A6>
</Tariff>
</TariffRecords>
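As an alternative to writing the tags by hand, the same structure can be built with the standard library's xml.etree.ElementTree, which escapes special characters automatically. This is only a sketch under a few assumptions: csv_to_tariff_xml is a hypothetical helper name, it works on CSV text already read into a string (Python 3), and the result is returned on a single line without indentation or an XML declaration.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_tariff_xml(csv_text):
    """Hypothetical helper: convert CSV text (with a header row) to a TariffRecords XML string."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row; tags are positional, not header-based
    root = ET.Element('TariffRecords')
    for row in reader:
        tariff = ET.SubElement(root, 'Tariff')
        for position, value in enumerate(row, start=1):
            ET.SubElement(tariff, 'A%d' % position).text = value
    # short_empty_elements=False keeps empty fields as <A6></A6> rather than <A6 />
    return ET.tostring(root, encoding='unicode', short_empty_elements=False)

sample = (
    "name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list\n"
    "Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,\n"
    "Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,\n"
)
print(csv_to_tariff_xml(sample))
```

Every data row becomes one Tariff element inside the single TariffRecords root, so files with two or three records end up in one XML document.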
