scala - How to reformat scrambled data into the proper order in Scala
Problem description
I have my data in a text file, which looks as follows:
a,b,c,"d
ee"
1,2,3,"fo
ur"
p,o,t,"lu
ck"
o,n,e,"m
o
re"
I want to clean the data so that the final output is as follows:
a,b,c,"dee"
1,2,3,"four"
p,o,t,"luck"
o,n,e,"more"
This is what I tried, but I can't get the output I was expecting:
val clean = Source.fromFile("my/path/csv/file.csv")
.getLines
.drop(1)
.mkString
.split("\"")
.array
Can someone help me do this?
Solution
If your file is not too big:
import scala.io.Source
Source.fromFile("my/path/csv/file.csv")
.mkString // Read the whole file into a single String
.trim // Drop any trailing newline so that the last character is the closing "
.init // Remove the last " as we're going to split on \"\n and the last one would not be removed
.split("\"\n") // "a,b,c,\"d\nee\"\n1,2,3,\"fo becomes Array("a,b,c,\"d\nee", "1,2,3,\"fo")
.map(_.replace("\n", "") + "\"") // Remove the wrongly placed \n and put back the closing "
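If the file is too big to load into one String, a line-by-line variant may be worth considering. The following is only a minimal sketch, not part of the answer above: it assumes every record contains exactly one quoted field, so a record is complete once it holds an even number of double quotes. The names `JoinQuotedLines` and `joinRecords` are hypothetical and chosen for illustration.

```scala
object JoinQuotedLines {
  // Hypothetical helper: glues physical lines together until the number of
  // double quotes in the accumulated record is even (the quoted field is closed).
  def joinRecords(lines: Iterator[String]): Iterator[String] = new Iterator[String] {
    def hasNext: Boolean = lines.hasNext
    def next(): String = {
      val sb = new StringBuilder(lines.next())
      // An odd quote count means a quoted field is still open, so keep appending.
      while (sb.count(_ == '"') % 2 != 0 && lines.hasNext) sb.append(lines.next())
      sb.toString
    }
  }

  def main(args: Array[String]): Unit = {
    // Sample data from the question, inlined so the sketch runs without a file.
    val raw = "a,b,c,\"d\nee\"\n1,2,3,\"fo\nur\"\np,o,t,\"lu\nck\"\no,n,e,\"m\no\nre\"\n"
    joinRecords(raw.split("\n").iterator).foreach(println)
  }
}
```

With a real file, `scala.io.Source.fromFile(path).getLines()` could be passed to `joinRecords` instead of the inlined sample, so that only one record is held in memory at a time. Escaped quotes inside a field and Windows line endings are not handled in this sketch.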