python - How can I escape all double quotes in a specific .csv column using Python?
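Problem description
- Using Python 2.7.6
- Need a solution that does not use the Pandas library
I have a .csv file with a particular (text) column whose cells occasionally contain a double quote ("). When the file is converted to a shapefile in ArcMap, these lone double quotes cause a faulty conversion. They must be "escaped".
I need a script that edits the .csv so that it:
- Replaces all instances of " with "".
- Wraps every cell in double quotes.
My script: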
import csv
with open(Source_CSV, 'r') as file1, open('OUTPUT2.csv','w') as file2:
    reader = csv.reader(file1)
    # Write column headers without quotes
    headers = reader.next()
    str1 = ''.join(headers)
    writer = csv.writer(file2)
    writer.writerow(headers)
    # Write all other rows with quotes
    writer = csv.writer(file2, quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow(row)
This script successfully performs both of the tasks above for all columns.
For example, this original .csv:
Column 1, Column 2, Column 3, Column 4
Fred, Flintstone, 5'10", black hair
Wilma, Flintstone, five feet seven inches, red hair
Barney, Rubble, 5 feet 2" inches, blond hair
Betty, Rubble, 5 foot 7, black hair
becomes this:
Column 1, Column 2, Column 3, Column 4
"Fred"," Flintstone"," 5'10"""," black hair"
"Wilma"," Flintstone"," five feet seven inches"," red hair"
"Barney"," Rubble"," 5 feet 2"" inches"," blond hair"
"Betty"," Rubble"," 5 foot 7"," black hair"
But what if I only want to do this in column 3 (the one that actually contains the occasional double quote)?
In other words, how can I get this?
Column 1, Column 2, Column 3, Column 4
Fred, Flintstone," 5'10""", black hair
Wilma, Flintstone," five feet seven inches", red hair
Barney, Rubble," 5 feet 2"" inches", blond hair
Betty, Rubble," 5 foot 7", black hair
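Solution
Is it enough to quote only the fields that contain a double quote? If so, the default behavior of the csv module will work, although I added skipinitialspace=True when parsing the input file so that the spaces after the commas are not treated as significant.
Also, following the csv module documentation for Python 2, I opened the files in binary mode.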
import csv
with open('input.csv','rb') as file1, open('output.csv','wb') as file2:
    reader = csv.reader(file1,skipinitialspace=True)
    writer = csv.writer(file2)
    for row in reader:
        writer.writerow(row)
Input:
Column 1, Column 2, Column 3, Column 4
Fred, Flintstone, 5'10", black hair
Wilma, Flintstone, five feet seven inches, red hair
Barney, Rubble, 5 feet 2" inches, blond hair
Betty, Rubble, 5 foot 7, black hair
Output:
Column 1,Column 2,Column 3,Column 4
Fred,Flintstone,"5'10""",black hair
Wilma,Flintstone,five feet seven inches,red hair
Barney,Rubble,"5 feet 2"" inches",blond hair
Betty,Rubble,5 foot 7,black hair
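For readers on Python 3 rather than the 2.7.6 used in the question, the same approach can be sketched as follows. This is an assumption-labeled port, not part of the original answer: the function name requote_minimal is hypothetical, and the binary-mode open is replaced by text mode with newline='', which is what the Python 3 csv documentation recommends.

```python
import csv


def requote_minimal(src_path, dst_path):
    """Rewrite a CSV so that only fields needing quotes are quoted.

    Hypothetical helper for illustration; assumes Python 3.
    """
    # Python 3: text mode with newline='' takes the place of 'rb'/'wb'.
    with open(src_path, newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        # skipinitialspace=True drops the blank that follows each comma.
        reader = csv.reader(src, skipinitialspace=True)
        # The default quoting mode is csv.QUOTE_MINIMAL: a field is quoted
        # only if it contains the delimiter, the quote character, or a
        # line-terminator character; embedded quotes are doubled.
        writer = csv.writer(dst)
        writer.writerows(reader)
```

Calling requote_minimal('input.csv', 'output.csv') on the sample input should quote only the two cells in column 3 that contain a double quote.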
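If you need every row of column 3 quoted, you can do it manually. I set the csv module to quote nothing and set the quote character to a non-printable control character that should not appear in the input: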
import csv
with open('input.csv','rb') as file1, open('output.csv','wb') as file2:
    reader = csv.reader(file1,skipinitialspace=True)
    writer = csv.writer(file2,quoting=csv.QUOTE_NONE,quotechar='\x01')
    # Write column headers without quotes
    headers = reader.next()
    writer.writerow(headers)
    # Write 3rd column with quotes
    for row in reader:
        row[2] = '"' + row[2].replace('"','""') + '"'
        writer.writerow(row)
Output:
Column 1,Column 2,Column 3,Column 4
Fred,Flintstone,"5'10""",black hair
Wilma,Flintstone,"five feet seven inches",red hair
Barney,Rubble,"5 feet 2"" inches",blond hair
Betty,Rubble,"5 foot 7",black hair
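The manual-quoting idea can also be tried out in memory, which makes it easy to check against a few sample rows. The following is a minimal Python 3 sketch under stated assumptions: the helper name quote_column and its index parameter are hypothetical, and the writer uses lineterminator='\n' only to keep the result easy to compare.

```python
import csv
import io


def quote_column(text, index=2):
    """Return CSV text with every data cell of one column quoted.

    Hypothetical helper for illustration; assumes Python 3 and that
    every data row has at least index + 1 fields.
    """
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    # QUOTE_NONE stops the writer from adding quotes of its own; the
    # control character '\x01' stands in as a quote character that is
    # assumed never to occur in the data.
    writer = csv.writer(out, quoting=csv.QUOTE_NONE, quotechar='\x01',
                        lineterminator='\n')
    # The header row is written unchanged.
    writer.writerow(next(reader))
    for row in reader:
        # Double any embedded quotes, then wrap the cell in quotes.
        row[index] = '"' + row[index].replace('"', '""') + '"'
        writer.writerow(row)
    return out.getvalue()
```

One limitation of this design: because the writer runs with csv.QUOTE_NONE and no escapechar, it raises csv.Error if any field contains a comma, so it fits data like the sample, where commas appear only as separators.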