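R - How to expand each item in a list to include all days between the earliest date and today

Problem description

I have a list of data frames. Every item contains the same columns. I want to add rows to each data frame so that it contains every day between its minimum date and today. Here is my data: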

lst <- list(
  c1 = structure(list(
    clientid = "c1",
    date = structure(17323, class = "Date"),
    type = "enquiry"), row.names = 1L, class = "data.frame"),
  c100002 = structure(list(
    clientid = rep("c100002", 18),
    date = structure(c(13451, 14571, 14824, 14862, 14869, 15159,
                       15201, 15435, 15589, 15834, 15877, 16245,
                       16279, 16609, 17015, 17055, 17130, 17843), class = "Date"),
    type = c("enquiry", "enquiry", "booking", "enquiry", "enquiry", "enquiry",
             "enquiry", "booking", "enquiry", "enquiry", "booking", "enquiry",
             "booking", "booking", "enquiry", "enquiry", "booking", "booking")),
    row.names = 2:19, class = "data.frame"),
  c100009 = structure(list(
    clientid = "c100009",
    date = structure(13734, class = "Date"),
    type = "booking"), row.names = 20L, class = "data.frame"))

It looks like this:

> lst[1:3]
$`c1`
  clientid       date    type
1       c1 2017-06-06 enquiry

$c100002
   clientid       date    type
2   c100002 2006-10-30 enquiry
3   c100002 2009-11-23 enquiry
4   c100002 2010-08-03 booking
5   c100002 2010-09-10 enquiry
6   c100002 2010-09-17 enquiry
7   c100002 2011-07-04 enquiry
8   c100002 2011-08-15 enquiry
9   c100002 2012-04-05 booking
10  c100002 2012-09-06 enquiry
11  c100002 2013-05-09 enquiry
12  c100002 2013-06-21 booking
13  c100002 2014-06-24 enquiry
14  c100002 2014-07-28 booking
15  c100002 2015-06-23 booking
16  c100002 2016-08-02 enquiry
17  c100002 2016-09-11 enquiry
18  c100002 2016-11-25 booking
19  c100002 2018-11-08 booking

$c100009
   clientid       date    type
20  c100009 2007-08-09 booking

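So, basically, for each data frame in the list I need to add a row for every date between its earliest date and today.

The 'clientid' column should be repeated on every new row, but the 'type' column must show NA for any row that is not in the original data.

I would really appreciate any help.

Tags: r

Solution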


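We can loop over the list with map and use complete to fill in the missing dates. Grouping by clientid only (and not by type) repeats clientid on every new row and leaves type as NA for dates that are not in the original data.

library(tidyverse)
# For each data frame, add one row per day from its earliest date to today
map(lst, ~ .x %>% 
             group_by(clientid) %>%
             complete(date = seq(min(date), Sys.Date(), by = '1 day')) %>%
             ungroup())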
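
The same idea can also be sketched in base R, without any packages, as a way to check the expected result on a small example. This is only a sketch: the helper name expand_days, its end argument, and the toy data are made up for illustration, and it assumes each data frame holds a single clientid and that its earliest date is not later than end.

```r
# Hypothetical helper: pad one data frame with every day from its
# earliest date up to `end` (defaults to today).
expand_days <- function(df, end = Sys.Date()) {
  # One row per day, with the (single) clientid repeated
  all_days <- data.frame(
    clientid = df$clientid[1],
    date = seq(min(df$date), end, by = "day")
  )
  # Left join: days without an original row get NA in `type`
  out <- merge(all_days, df, by = c("clientid", "date"), all.x = TRUE)
  out[order(out$date), , drop = FALSE]
}

# Toy input (not the data from the question), with a fixed end date
# so that the result is reproducible
toy <- list(
  a = data.frame(
    clientid = "a",
    date = as.Date(c("2020-01-01", "2020-01-04")),
    type = c("enquiry", "booking")
  )
)

res <- lapply(toy, expand_days, end = as.Date("2020-01-05"))
res$a
```

For the data in the question, lapply(lst, expand_days) would use today's date as the end point. In the toy result, the rows for 2020-01-02, 2020-01-03 and 2020-01-05 have NA in type, while the two original rows keep their values.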
