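r - If a date occurs more than x times, get the next available date
Question
I have a data frame with 15 columns: 1 column holds the participant ID and the other 14 hold each participant's possible appointment dates (holidays and weekends excluded):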
Included.Participant V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 1 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
2 2 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
3 3 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
4 4 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
5 5 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
6 6 2021-03-22 2021-03-23 2021-03-24 2021-03-25 2021-03-26 <NA> <NA> 2021-03-29 2021-03-30 2021-03-31 2021-04-01 2021-04-02 <NA> <NA>
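Each date can take at most 3 participants. Once a date has reached that maximum of 3, the fourth participant should move on to the next available date. So in this example, the desired output is: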
Included.Participant V1
1 1 2021-03-22
2 2 2021-03-22
3 3 2021-03-22
4 4 2021-03-23
5 5 2021-03-23
6 6 2021-03-23
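If no date is possible, column V1 can be left empty.
I can't seem to figure out how to get the desired output. I really hope you can help.
Thanks a lot!
Input: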
structure(list(Included.y = c(1L, 2L, 3L, 4L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L),
V1 = structure(c(18870, NA, 18848, NA, NA, NA, NA, NA, 18806,
18799, 18835, 18841, NA, NA, 18912, 18954, NA, 18842, NA,
NA), class = "Date"), V2 = structure(c(18871, NA, 18849,
NA, NA, NA, NA, 18876, 18807, 18800, 18836, 18842, NA, NA,
18913, 18955, NA, 18843, NA, NA), class = "Date"), V3 = structure(c(18872,
18904, 18850, 18897, 18967, NA, 18883, 18877, 18808, 18801,
18837, 18843, 18890, NA, 18914, 18956, 18953, 18844, NA,
18869), class = "Date"), V4 = structure(c(NA, 18905, 18851,
18898, 18968, 18953, 18884, 18878, 18809, 18802, NA, 18844,
18891, 18967, NA, NA, 18954, NA, 18925, 18870), class = "Date"),
V5 = structure(c(NA, 18906, NA, 18899, 18969, 18954, 18885,
18879, NA, NA, NA, NA, 18892, 18968, NA, NA, 18955, NA, 18926,
18871), class = "Date"), V6 = structure(c(NA, 18907, NA,
18900, 18970, 18955, 18886, NA, NA, NA, NA, NA, 18893, 18969,
NA, NA, 18956, NA, 18927, 18872), class = "Date"), V7 = structure(c(18876,
NA, NA, NA, NA, 18956, NA, NA, NA, NA, 18841, NA, NA, 18970,
18918, 18960, NA, 18848, 18928, NA), class = "Date"), V8 = structure(c(18877,
NA, 18855, NA, NA, NA, NA, NA, 18813, 18806, 18842, 18848,
NA, NA, 18919, 18961, NA, 18849, NA, NA), class = "Date"),
V9 = structure(c(18878, NA, 18856, NA, NA, NA, NA, 18883,
18814, 18807, 18843, 18849, NA, NA, 18920, 18962, NA, 18850,
NA, NA), class = "Date"), V10 = structure(c(18879, 18911,
18857, 18904, 18974, NA, 18890, 18884, 18815, 18808, 18844,
18850, 18897, NA, 18921, 18963, 18960, 18851, NA, 18876), class = "Date"),
V11 = structure(c(NA, 18912, 18858, 18905, 18975, 18960,
18891, 18885, 18816, 18809, NA, 18851, 18898, 18974, NA,
NA, 18961, NA, 18932, 18877), class = "Date"), V12 = structure(c(NA,
18913, NA, 18906, 18976, 18961, 18892, 18886, NA, NA, NA,
NA, 18899, 18975, NA, NA, 18962, NA, 18933, 18878), class = "Date"),
V13 = structure(c(NA, 18914, NA, 18907, 18977, 18962, 18893,
NA, NA, NA, NA, NA, 18900, 18976, NA, NA, 18963, NA, 18934,
18879), class = "Date"), V14 = structure(c(18883, NA, NA,
NA, NA, 18963, NA, NA, NA, NA, 18848, NA, NA, 18977, 18925,
18967, NA, 18855, 18935, NA), class = "Date")), row.names = c(NA,
20L), class = "data.frame")
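A note on reading the `dput()` output above: R stores `Date` values as the number of days since 1970-01-01, which is why the columns appear as integers such as 18870. A minimal sketch for decoding one of them follows; the variable name `first_v1` is only illustrative and does not appear in the question.

```r
# Decode the first V1 value from the dput() output.
# Date values are day counts since the origin 1970-01-01.
first_v1 <- as.Date(18870, origin = "1970-01-01")
format(first_v1)  # "2021-08-31"
```

The pasted input therefore covers different dates than the March example table shown earlier, and its ID column is named `Included.y` rather than `Included.Participant`.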
Solution
Code updated after it was clarified that participants can be assigned to the same date in different columns:
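library(dplyr)
library(tidyr)
library(tibble)
df1 <-
  df %>%
  # Long format: one row per participant, column name and date
  pivot_longer(-Included.Participant) %>%
  # Drop the original IDs; new ones are generated from row numbers below
  select(-Included.Participant) %>%
  # Order the column names V1..V14 so that sorting follows column order
  mutate(name = factor(name, levels = paste0("V", 1:14), ordered = TRUE)) %>%
  group_by(value) %>%
  arrange(value, name) %>%
  # Keep at most 3 slots per date
  slice_head(n = 3) %>%
  # Number the remaining slots and use that number as the participant ID
  rowid_to_column(var = "Included.Participant") %>%
  filter(Included.Participant <= 20) %>%
  pivot_wider(names_from = name, values_from = value)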
Output:
head(df1, 10)
#> # A tibble: 10 x 5
#> Included.Participant V1 V2 V3 V4
#> <int> <date> <date> <date> <date>
#> 1 1 2021-03-22 NA NA NA
#> 2 2 2021-03-22 NA NA NA
#> 3 3 2021-03-22 NA NA NA
#> 4 4 2021-03-23 NA NA NA
#> 5 5 2021-03-23 NA NA NA
#> 6 6 2021-03-23 NA NA NA
#> 7 7 NA 2021-03-24 NA NA
#> 8 8 NA 2021-03-24 NA NA
#> 9 9 NA 2021-03-24 NA NA
#> 10 10 NA NA 2021-03-25 NA
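The pipeline above drops the original IDs and renumbers the slots, so it does not look at which dates each individual participant actually listed. As an alternative, here is a minimal base R sketch of a greedy assignment that keeps the IDs: it walks through the participants in row order and gives each one the first listed date that still has room. The function name `assign_dates` and its arguments `id_col` and `capacity` are hypothetical, not part of the original answer, and the sketch assumes that all non-ID columns are of class `Date`.

```r
# Hypothetical helper (not from the original answer): give each participant
# the first of their candidate dates that has fewer than `capacity` bookings.
assign_dates <- function(df, id_col = "Included.Participant", capacity = 3L) {
  date_cols <- setdiff(names(df), id_col)
  counts <- integer(0)                      # bookings per date, named by "YYYY-MM-DD"
  assigned <- rep(NA_character_, nrow(df))  # stays NA when no date is available
  for (i in seq_len(nrow(df))) {
    # Candidate dates of participant i in column order, with NAs removed
    candidates <- unlist(lapply(df[i, date_cols, drop = FALSE], as.character),
                         use.names = FALSE)
    candidates <- candidates[!is.na(candidates)]
    for (d in candidates) {
      used <- if (d %in% names(counts)) counts[[d]] else 0L
      if (used < capacity) {
        counts[d] <- used + 1L
        assigned[i] <- d
        break
      }
    }
  }
  data.frame(df[id_col], V1 = as.Date(assigned, format = "%Y-%m-%d"))
}

# Rebuild the 6-participant example from the question:
# 14 consecutive days starting 2021-03-22, weekends set to NA.
days <- seq(as.Date("2021-03-22"), by = "day", length.out = 14)
days[c(6, 7, 13, 14)] <- NA
example <- data.frame(Included.Participant = 1:6)
for (j in 1:14) example[[paste0("V", j)]] <- rep(days[j], 6)

result <- assign_dates(example)
result
```

On this example, participants 1 to 3 get 2021-03-22 and participants 4 to 6 get 2021-03-23, which matches the desired output in the question. For the pasted input, the call would presumably be `assign_dates(df, id_col = "Included.y")`, since that is the ID column name in the `dput()` output. Row order decides who gets priority, so this is only one of several reasonable allocation rules.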