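r - Create a data frame from a list of lists of data.frames
Problem description
I have a list of lists of data.frames that I want to convert into a single data.frame. The structure is as follows: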
l_of_lists <- list(
year1 = list(
one = data.frame(date = c("Jan-10", "Jan-22"), type = c("type 1", "type 2")),
two = data.frame(date = c("Feb-1", "Feb-28"), type = c("type 2", "type 3")),
three = data.frame(date = c("Mar-10", "Mar-15"), type = c("type 1", "type 4"))
),
year2 = list( # dates is used here on purpose, as the names don't perfectly match
one = data.frame(dates = c("Jan-22"), type = c("type 2"), another_col = c("entry 2")),
two = data.frame(date = c("Feb-10", "Feb-18"), type = c("type 2", "type 3"), another_col = c("entry 2", "entry 3")),
three = data.frame(date = c("Mar-10", "Mar-15"), type = c("type 1", "type 4"), another_col = c("entry 4", "entry 5"))
),
year3 = list( # this deliberately only contains two data frames
one = data.frame(date = c("Jan-10", "Jan-12"), type = c("type 1", "type 2")),
two = data.frame(date = c("Feb-8", "Jan-28"), type = c("type 2", "type 3"))
))
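The data frames have two characteristics that I tried to mimic above:
- Column names differ by 1-2 characters (e.g. date vs. dates)
- Some columns exist only in some of the data frames (e.g. another_col)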
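I now want to convert this into a data frame (I tried different calls to rbind and do.call, as described here, but without success) and would like to:
- match column names tolerantly (if column names differ by only 1-2 characters, I want them to be treated as the same column), and
- fill columns that are missing from a given data frame with NA.
I would like a data frame similar to the following: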
year level date     type     another_col
1    one   "Jan-10" "type 1" NA
1    one   "Jan-22" "type 2" NA
1    two   "Feb-1"  "type 2" NA
1    two   "Feb-28" "type 3" NA
1    three "Mar-10" "type 1" NA
1    three "Mar-15" "type 4" NA
2    one   "Jan-22" "type 2" "entry 2"
2    two   "Feb-1"  "type 2" "entry 2"
2    two   "Feb-28" "type 3" "entry 3"
2    three "Mar-10" "type 1" "entry 4"
2    three "Mar-15" "type 4" "entry 5"
3    one   "Jan-10" "type 1" NA
3    one   "Jan-12" "type 2" NA
3    two   "Feb-8"  "type 2" NA
3    two   "Feb-28" "type 3" NA
Can someone point out whether rbind is the right path here, and what I am missing?
Solution
You can do the following using purrr and dplyr:
l_of_lists <- list(
year1 = list(
one = data.frame(date = c("Jan-10", "Jan-22"), type = c("type 1", "type 2")),
two = data.frame(date = c("Feb-1", "Feb-28"), type = c("type 2", "type 3")),
three = data.frame(date = c("Mar-10", "Mar-15"), type = c("type 1", "type 4"))
),
year2 = list( # dates is used here on purpose, as the names don't perfectly match
one = data.frame(dates = c("Jan-22"), type = c("type 2"), another_col = c("entry 2")),
two = data.frame(date = c("Feb-10", "Feb-18"), type = c("type 2", "type 3"), another_col = c("entry 2", "entry 3")),
three = data.frame(date = c("Mar-10", "Mar-15"), type = c("type 1", "type 4"), another_col = c("entry 4", "entry 5"))
),
year3 = list( # this deliberately only contains two data frames
one = data.frame(date = c("Jan-10", "Jan-12"), type = c("type 1", "type 2")),
two = data.frame(date = c("Feb-8", "Jan-28"), type = c("type 2", "type 3"))
))
# add libraries
library(dplyr)
library(purrr)
# Map bind_rows to each list within the list
l_of_lists %>%
map_dfr(~bind_rows(.x, .id = "level"), .id = "year")
This yields:
   year  level date   type   dates  another_col
1  year1 one   Jan-10 type 1 <NA>   <NA>
2  year1 one   Jan-22 type 2 <NA>   <NA>
3  year1 two   Feb-1  type 2 <NA>   <NA>
4  year1 two   Feb-28 type 3 <NA>   <NA>
5  year1 three Mar-10 type 1 <NA>   <NA>
6  year1 three Mar-15 type 4 <NA>   <NA>
7  year2 one   <NA>   type 2 Jan-22 entry 2
8  year2 two   Feb-10 type 2 <NA>   entry 2
9  year2 two   Feb-18 type 3 <NA>   entry 3
10 year2 three Mar-10 type 1 <NA>   entry 4
11 year2 three Mar-15 type 4 <NA>   entry 5
12 year3 one   Jan-10 type 1 <NA>   <NA>
13 year3 one   Jan-12 type 2 <NA>   <NA>
14 year3 two   Feb-8  type 2 <NA>   <NA>
15 year3 two   Jan-28 type 3 <NA>   <NA>
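Note that bind_rows matches columns by exact name, so date and dates end up as separate columns. If tolerant matching is wanted before binding, one possible approach is to rename near-miss columns first. The following is only a sketch: harmonize_names, canonical, max_dist and l_small are hypothetical names introduced for illustration, and the edit-distance threshold of 1 is an assumption that should be checked against the real column names.

```r
library(dplyr)
library(purrr)

# Small stand-in for l_of_lists with the same quirks (dates vs. date, extra column)
l_small <- list(
  year1 = list(one = data.frame(date = "Jan-10", type = "type 1")),
  year2 = list(one = data.frame(dates = "Jan-22", type = "type 2",
                                another_col = "entry 2"))
)

# Hypothetical helper: rename every column that lies within `max_dist`
# edits (Levenshtein distance via utils::adist) of a canonical name
harmonize_names <- function(df, canonical, max_dist = 1) {
  d <- utils::adist(names(df), canonical)      # rows: current names, cols: canonical
  nearest <- apply(d, 1, which.min)            # closest canonical name per column
  close_enough <- d[cbind(seq_along(nearest), nearest)] <= max_dist
  names(df)[close_enough] <- canonical[nearest[close_enough]]
  df
}

# Assumed list of the column names we want to end up with
canonical <- c("date", "type", "another_col")

combined <- l_small %>%
  map(function(year_list) lapply(year_list, harmonize_names, canonical = canonical)) %>%
  map_dfr(~ bind_rows(.x, .id = "level"), .id = "year")

print(combined)
```

This sketch does not guard against two columns of the same data frame mapping to the same canonical name, which would produce duplicate names, so it is worth inspecting the renamed columns before binding.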
Then of course you can do some regex parsing to keep only the numeric part of the year:
l_of_lists %>%
map_dfr(~bind_rows(.x, .id = "level"), .id = "year") %>%
mutate(year = substring(year, regexpr("\\d", year)))
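The substring call above leaves year as a character column. If an integer is preferred, as in the desired output, a variant along these lines might work. It is a sketch on a small stand-in data frame (years is a hypothetical name) and assumes every list name has the form of some non-digit prefix followed by the number.

```r
library(dplyr)

# Stand-in for the year column produced by map_dfr(..., .id = "year")
years <- data.frame(year = c("year1", "year2", "year3"))

# Strip the leading non-digit prefix, then convert the remainder to integer
years_int <- years %>%
  mutate(year = as.integer(sub("^[^0-9]+", "", year)))

print(years_int)
```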
If you know that date and dates hold the same information, you can always use mutate to fill date with whichever of the two values is not missing (i.e. mutate(date = ifelse(!is.na(date), date, dates))).
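As a sketch of that last step, dplyr's coalesce expresses the same "first non-missing value" logic as the ifelse call, and the now redundant dates column can be dropped afterwards. The data frame bound below is a hypothetical three-row stand-in for the combined result, not the full output.

```r
library(dplyr)

# Stand-in for a few rows of the bound result, with date split across two columns
bound <- data.frame(
  date  = c("Jan-10", NA, "Feb-10"),
  dates = c(NA, "Jan-22", NA),
  type  = c("type 1", "type 2", "type 2")
)

merged <- bound %>%
  mutate(date = coalesce(date, dates)) %>%  # take date, fall back to dates when NA
  select(-dates)                            # drop the redundant column

print(merged)
```

Both columns are character here; coalesce expects compatible types, so a conversion may be needed first if one of them was read as a factor or a Date.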