python - Remove characters and duplicates from a csv file and write to a new file
Problem description
I am reading a csv file that looks like this:
[152.60115606936415][152.60115606936415, 13181.818181818182][152.60115606936415, 13181.818181818182, 1375055.330634278][152.60115606936415, 13181.818181818182, 1375055.330634278, 89.06882591093118]
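What I want to do is remove the characters ([, ], commas and spaces), put each value on a new line, and write the result to my new txt file.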
import csv
to_file = open("t_put.txt", "w")
with open("t_put_val.20181026052328.csv", "r") as f:
    for row in csv.reader(f):
        value2 = " ".join(row)[1:-1]  # drop the first and last characters (the outer brackets)
        value = value2.replace(" ", "\n")  # replace spaces with newlines
        value3 = value.replace("][", " ")  # replace ][
        value4 = value3.replace(" ", "\n")
        print(value4)
        # st = str(s)
        to_file.write(value4)  # write to file
to_file.close()
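With this code I can remove the characters, but the duplicates still appear. I was thinking of using set(), but it does not work as intended. Another option would be to print only the last four numbers, but that might not work on a larger dataset.
Solution
By splitting on ']', you can group each list in the csv.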
# Open the csv file
with open("t_put_val.20181026052328.csv", "r") as f_h:
    rows = [row.lstrip('[').split(", ")
            # For each line in the file (there's just one)
            for line in f_h
            # Split the line on each closing ']', ignoring the trailing newline
            for row in line.strip().split(']')
            # Skip the empty piece left after the last ']' (and any blank line)
            if row
            ]
# Print out all unique values
unique_values = set(item for row in rows for item in row)
for value in unique_values:
    print(value)
# Output
with open("t_put.txt", 'w') as f_h:
    f_h.writelines('%s\n' % ', '.join(row) for row in rows)
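If the goal is one unique value per line in the order the values first appear, a set alone is not enough because it does not preserve order. The following is a minimal sketch, not part of the original answer: the helper name unique_values and the inline sample string are made up for illustration, and it assumes the file content has the bracketed shape shown in the question.

```python
def unique_values(text):
    """Return the distinct values found in `text`, in first-seen order."""
    values = []
    # Each bracketed list ends with ']', so splitting on it yields one chunk per list
    for chunk in text.strip().split(']'):
        chunk = chunk.strip().lstrip('[')
        if chunk:  # skip the empty piece after the final ']'
            values.extend(item.strip() for item in chunk.split(','))
    # dict.fromkeys keeps only the first occurrence of each key, preserving order
    return list(dict.fromkeys(values))


# Hypothetical sample mirroring the data in the question
sample = (
    "[152.60115606936415]"
    "[152.60115606936415, 13181.818181818182]"
    "[152.60115606936415, 13181.818181818182, 1375055.330634278]"
    "[152.60115606936415, 13181.818181818182, 1375055.330634278, 89.06882591093118]"
)
result = unique_values(sample)
print("\n".join(result))
```

To produce the output file, the same joined string could be written to t_put.txt instead of printed. The values are kept as strings so that they are written back exactly as they appeared in the input.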