r - How to merge multiple rows with multiple columns in a dataset in R
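Problem description
I want to merge the rows for each record_id into a single row, except for volunteers who have two entries in the repeat column (the rows where type is "test"). For those volunteers I want a second row. Each record_id corresponds to one person, who has taken the test either once (repeat = 1) or twice, in which case there are two entries in the repeat column.
This is what my data looks like: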
structure(list(record_id = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4,
4, 4, 4), type = c(NA, "data_collection", "test", NA, "data_collection",
"test", NA, "data_collection", "test", "test", NA, "cata_collection",
"test", "test"), `repeat` = c(NA, 1, 1, NA, 1, 1, NA, 1, 1, 2,
NA, 1, 1, 2), dt_volunteer_reg = structure(c(1597246320, NA,
NA, 1599217080, NA, NA, 1596184500, NA, NA, NA, 1598192280, NA,
NA, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC"), age = c(26,
NA, NA, 64, NA, NA, 51, NA, NA, NA, 39, NA, NA, NA), gender = c(0,
NA, NA, 1, NA, NA, 0, NA, NA, NA, 1, NA, NA, NA), case_type = c(NA,
1, NA, NA, 2, NA, NA, 1, NA, NA, NA, 1, NA, NA), test_dis_dt = structure(c(NA,
NA, 1597250220, NA, NA, 1600012980, NA, NA, 1596382080, 1601980740,
NA, NA, 1598284020, 1603118700), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), test_dis_res = c(NA, NA, 1, NA, NA, 1, NA,
NA, 2, 2, NA, NA, 2, 2), test_dis_in = c(NA, NA, NA, NA, NA,
0.02, NA, NA, 6.13, 4.75, NA, NA, 7.23, 3.85), test_cont_dt = structure(c(NA,
NA, 1597250280, NA, NA, 1608636120, NA, NA, NA, 1601980740, NA,
NA, 1605704940, 1603205340), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
test_cont_res = c(NA, NA, 2, NA, NA, 1, NA, NA, NA, 2, NA,
NA, 2, 2), test_cont_val = c(NA, NA, 123, NA, NA, 0, NA,
NA, NA, 40000, NA, NA, 471.6, 306.5)), row.names = c(NA,
-14L), class = c("tbl_df", "tbl", "data.frame"))
This is what I would like to get:
structure(list(record_id = c(1, 2, 3, 3, 4, 4), `repeat` = c(1,
1, 1, 2, 1, 2), dt_volunteer_reg = structure(c(1597246320, 1599217080,
1596184500, 1596184500, 1598192280, 1598192280), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), age = c(26, 64, 51, 51, 39, 39), gender = c(0,
1, 0, 0, 1, 1), case_type = c(1, 2, 1, 1, 1, 1), test_dis_dt = structure(c(1597250220,
1600012980, 1596382080, 1601980740, 1598284020, 1603118700), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), test_dis_res = c(1, 1, 2, 2, 2, 2),
test_dis_in = c(NA, 0.02, 6.13, 4.75, 7.23, 3.85), test_cont_dt = structure(c(1597250280,
1608636120, NA, 1601980740, 1605704940, 1603205340), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), test_cont_res = c(2, 1, NA, 2,
2, 2), test_cont_val = c(123, 0, NA, 40000, 471.6, 306.5)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
Solution
Assuming the first data frame is called input and you are happy to use the tidyverse, you can do this:
library(tidyverse) # provides %>%, nest(), map(), replace_na(), filter(), unnest(), select()

input %>%
  nest(data = c(-record_id)) %>% # one nested data frame per record_id
  mutate(
    data = map(data, ~replace_na(., as.list(head(., 1)))), # Fill in specimen details
    data = map(data, filter, !is.na(`repeat`)), # Remove specimen details
    data = map(data, ~replace_na(., as.list(head(., 1)))), # Fill in test data with data collection details
    data = map(data, filter, type == "test") # Remove data collection rows
  ) %>%
  unnest(data) %>%
  select(-type)
There are ways to do this more concisely and/or faster, but this is probably more readable.
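As one possible illustration of a more concise variant, the sketch below uses tidyr's fill() within each record_id instead of nesting. It is not part of the original answer and rests on assumptions: the data frame here is a small hypothetical stand-in that reuses only some of the column names from the question, and only the person-level columns (age and case_type in this stand-in; dt_volunteer_reg, age, gender and case_type in the full data) are filled. Filling the test_* columns the same way would carry values from the first test into a missing second test, which differs from the nested approach.

```r
library(dplyr)
library(tidyr)

# Hypothetical reduced stand-in for the question's data (two people only)
input <- tibble(
  record_id    = c(1, 1, 1, 3, 3, 3, 3),
  type         = c(NA, "data_collection", "test",
                   NA, "data_collection", "test", "test"),
  `repeat`     = c(NA, 1, 1, NA, 1, 1, 2),
  age          = c(26, NA, NA, 51, NA, NA, NA),
  case_type    = c(NA, 1, NA, NA, 1, NA, NA),
  test_dis_res = c(NA, NA, 1, NA, NA, 2, 2)
)

result <- input %>%
  group_by(record_id) %>%
  # Carry person-level values down onto the later rows of the same person
  fill(age, case_type, .direction = "down") %>%
  ungroup() %>%
  # Keep only the test rows; rows with NA in type are dropped as well
  filter(type == "test") %>%
  select(-type)

print(result)
```

This relies on the registration and data collection rows appearing before the test rows within each record_id, as they do in the question's data. If the row order is not guaranteed, sorting first or using .direction = "downup" may be needed.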