Why does the following line produce a one-dimensional array instead of a two-dimensional array?

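Question

I am trying to read a CSV file in Julia 1.1 and produce a matrix of strings laid out the same way as the original data in the CSV file. In other words, if my CSV file is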

a,s,d,f,g,h
q,w,e,r,t,y

my matrix should look like

a s d f g h
q w e r t y

I do not know how many rows the file has, so I tried the following:

csv_file_lines = readlines("./" * filename)
data = hcat( map( x -> split(x, ","), csv_file_lines ) )

For one of my shorter sample files, the output is

Array{SubString{String},1}[["date", "watch_time_minutes", "views", "average_view_duration", "video_thumbnail_impressions", "video_thumbnail_impressions_ctr"]; ["2019-03-04", "83.2051", "28", "2.9716", "318", "6.2893"]; ["2019-03-05", "43.6223", "12", "3.6352", "79", "10.1266"]; ["2019-03-06", "5.5267", "2", "2.7633", "33", "6.0606"]; ["2019-03-07", "0", "0", "0", "0", "0"]; ["2019-03-08", "58.7133", "11", "5.3376", "86", "8.1395"]; ["2019-03-09", "0", "0", "0", "0", "0"]; ["2019-03-10", "20.205", "4", "5.0512", "14", "7.1429"]; ["2019-03-11", "10.7013", "4", "2.6753", "24", "4.1667"]; ["2019-03-12", "1.3", "1", "1.3", "5", "20"]; ["2019-03-13", "0", "0", "0", "0", "0"]; ["2019-03-14", "14.7383", "6", "2.4564", "65", "9.2308"]; ["2019-03-15", "20.75", "7", "2.9643", "25", "12"]; ["2019-03-16", "31.0083", "4", "7.7521", "0", ""]; ["2019-03-17", "6.8624", "2", "3.4312", "0", ""]; ["2019-03-18", "0", "0", "0", "0", "0"]; ["2019-03-19", "0", "0", "0", "0", "0"]; ["2019-03-20", "0", "0", "0", "0", "0"]; ["2019-03-21", "0", "0", "0", "0", "0"]]

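It is a one-dimensional array of arrays of SubString, rather than a two-dimensional array of Strings (or SubStrings, in this case). What am I doing wrong here? Changing hcat to vcat does not help.

Edit

I would prefer to do this without the CSV package or DataFrames, to reduce overhead.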

Tags: arrays, csv, matrix, julia

Answer

I think readdlm from the standard library DelimitedFiles is what you are looking for:

julia> using DelimitedFiles

julia> readdlm("file.csv", ',', String)
2×6 Array{String,2}:
 "a"  "s"  "d"  "f"  "g"  "h"
 "q"  "w"  "e"  "r"  "t"  "y"

This produces a 2x6 matrix of strings.

See ?readdlm for details.
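
As for why the original attempt did not give a matrix of strings: as far as I can tell, hcat was called with a single argument (the whole vector of split lines), so there was nothing to concatenate it with, and the result still holds one vector per line. Below is a minimal sketch of a fix that stays within Base. The two-line `lines` vector is a stand-in for `readlines(...)`, and the variable names are only illustrative.

```julia
# Stand-in for readlines("./" * filename); the two lines mirror the example CSV.
lines = ["a,s,d,f,g,h", "q,w,e,r,t,y"]

# Each line splits into a Vector{SubString{String}}, so rows is a vector of vectors.
rows = map(x -> split(x, ","), lines)

# With one argument, hcat only turns the outer vector into a single column;
# its elements are still whole vectors, not individual strings.
single = hcat(rows)
size(single)               # (2, 1)

# Splatting passes every row as a separate argument, so each row becomes a column.
# reduce(hcat, rows) builds the same matrix without a long argument list.
cols = hcat(rows...)
size(cols)                 # (6, 2)

# permutedims swaps the two dimensions: one matrix row per CSV line.
data = permutedims(cols)
size(data)                 # (2, 6)

# Optional: convert the SubString elements to String.
data = String.(data)
```

Note that hcat requires every line to split into the same number of fields, and that a plain split on "," does not handle quoted fields that contain commas.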

