Reading an Excel file in R: problems with column labels and date/hour format

Problem description

I have an Excel file like this:

[Screenshot of the Excel sheet]

I tried to read it as follows:

library(xlsx)
df <- read.xlsx("2021.xlsx", sheetIndex = 1)

However, the result is not what I want:

> dput(df)
structure(list(Twitter = structure(c(3L, 1L, 1L, 2L, 2L), .Label = c("Jack", 
"John", "User"), class = "factor"), NA. = structure(c(5L, 1L, 
3L, 4L, 2L), .Label = c("Hello world", "Hello!", "I'm a text", 
"I'm an example", "Tweet"), class = "factor"), NA..1 = structure(c(3L, 
1L, 1L, 2L, 2L), .Label = c("44293", "44294", "Date"), class = "factor"), 
NA..2 = structure(c(3L, 1L, 1L, 2L, 2L), .Label = c("0.490277777777778", 
"0.552083333333333", "Hour"), class = "factor"), NA..3 = structure(c(3L, 
1L, 1L, 2L, 2L), .Label = c("3", "4", "x"), class = "factor"), 
NA..4 = structure(c(3L, 2L, 2L, 1L, 1L), .Label = c("6", 
"7", "y"), class = "factor"), NA..5 = structure(c(3L, 2L, 
2L, 1L, 2L), .Label = c("no", "yes", "z"), class = "factor")), class = "data.frame", row.names = 
c(NA, -5L))

That is,

> df
  Twitter            NA. NA..1             NA..2 NA..3 NA..4 NA..5
1    User          Tweet  Date              Hour     x     y     z
2    Jack    Hello world 44293 0.490277777777778     3     7   yes
3    Jack     I'm a text 44293 0.490277777777778     3     7   yes
4    John I'm an example 44294 0.552083333333333     4     6    no
5    John         Hello! 44294 0.552083333333333     4     6   yes

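This is not the expected result. First, the dates and times are wrong: they appear as raw numbers such as 44293 and 0.490277777777778. Second, the column labels are strange (Twitter, NA., NA..1, etc.), and the correct labels end up in the first row of the data frame instead. I would like to get labels like the following: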

Twitter.User, Twitter.Tweet, Twitter.Date, Twitter.Hour, Twitter.x, Twitter.y, Twitter.z

Tags: r, excel, dataframe

Solution


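Try read.xlsx("2021.xlsx", sheetIndex = 1, startRow = 2). This skips the first spreadsheet row (the one holding only "Twitter"), so the second row (User, Tweet, Date, Hour, x, y, z) is used as the column names.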
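If the Date and Hour columns still come back as raw Excel numbers after that, and the Twitter. prefix is wanted, something like the following may help. It is only a sketch: the data frame below is a hand-built stand-in for what read.xlsx might return with startRow = 2 (an assumption, since the actual column types depend on how the cells are formatted), and it assumes the workbook uses Excel's default 1900 date system.

```r
# Hypothetical stand-in for:
#   df <- read.xlsx("2021.xlsx", sheetIndex = 1, startRow = 2)
# assuming Date and Hour arrive as plain numbers (not guaranteed).
df <- data.frame(
  User  = c("Jack", "Jack", "John", "John"),
  Tweet = c("Hello world", "I'm a text", "I'm an example", "Hello!"),
  Date  = c(44293, 44293, 44294, 44294),
  Hour  = c(0.490277777777778, 0.490277777777778,
            0.552083333333333, 0.552083333333333),
  x     = c(3, 3, 4, 4),
  y     = c(7, 7, 6, 6),
  z     = c("yes", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Excel's 1900 date system counts days from 1899-12-30.
df$Date <- as.Date(df$Date, origin = "1899-12-30")

# A time of day is stored as a fraction of 24 hours; convert it to
# whole minutes and format as HH:MM.
minutes <- as.integer(round(df$Hour * 24 * 60))
df$Hour <- sprintf("%02d:%02d", minutes %/% 60L, minutes %% 60L)

# Prefix every column name with the header from the first spreadsheet row.
names(df) <- paste0("Twitter.", names(df))
```

With these values, 44293 becomes 2021-04-07 and 0.490277777777778 becomes 11:46, and the columns are named Twitter.User, Twitter.Tweet, and so on.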

