Remove extra elements from a list based on another list

Problem description

I have a dataset that I am trying to split into two lists. Each list contains one element (e.g., [[1]], [[2]], [[3]] in the list object) per ID for each 10-day interval (e.g., days 1-10, days 11-20, and days 21-31).

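In the example code below, the list jan has three intervals for each ID (i.e., three elements for A, three for B, and three for C). The list july has only two intervals for each ID, which is a problem for me (e.g., for a given ID it contains only the equivalent of [[1]] and [[2]] rather than three elements).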

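I am trying to figure out how to remove the extra elements from jan that have no counterpart in july. For example, for ID A I would like a function that compares the two lists and drops the element of jan whose interval is missing from july. How would I go about doing this?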

library(lubridate)
library(tidyverse)
date <- rep_len(seq(dmy("01-01-2010"), dmy("20-07-2010"), by = "days"), 600)
ID <- rep(c("A","B","C"), 200)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

df$month <- month(df$date)

jan <- df %>%
  mutate(new = floor_date(date, "10 days")) %>%
  group_by(ID) %>% 
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>% 
  group_by(new, .add = TRUE) %>%
  filter(month == "1") %>% 
  group_split()

july <- df %>%
  mutate(new = floor_date(date, "10 days")) %>%
  group_by(ID) %>% 
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>% 
  group_by(new, .add = TRUE) %>%
  filter(month == "7") %>% 
  group_split()

Tags: r, list, dplyr, lubridate

Solution


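I'm still not entirely sure what you're after, but this code should do what you describe. It stacks both lists into one data frame, keeps only the combinations of ID and 10-day interval that occur in both months, and then splits the January rows back into a list.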

df2 <- bind_rows(jan, july) %>%
  # adding a helper variable to distinguish if a day from the date component is
  # 10 or lower, 20 or lower or the rest 
  mutate(helper = ceiling(day(date)/10) %>% pmin(3)) %>% 
  group_by(ID, helper) %>%
  # adding another helper finding out how many distinct months there are in the subgroup
  mutate(helper2 = n_distinct(month)) %>% ungroup() %>%
  filter(helper2 == 2) %>%
  # getting rid of the helpers
  select(-helper, -helper2) %>%
  group_by(ID, new)

jan2 <- df2 %>%
  filter(month == "1") %>% 
  group_split()
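
Since the question asks for a function that compares the two lists directly, here is a minimal base R sketch of that idea. It is not part of the original answer. The helper names interval_key(), keep_matching() and make_piece(), and the toy lists jan_toy and july_toy, are hypothetical and exist only for illustration. The sketch assumes every list element has an ID column and a new column holding the start date of its 10-day interval, as produced by the code in the question.

```r
# Hypothetical helper: label a list element by its ID and the position of its
# 10-day interval within the month (1, 2 or 3), read from the `new` column.
interval_key <- function(d) {
  day_of_month <- as.integer(format(d$new[1], "%d"))
  paste(d$ID[1], (day_of_month - 1) %/% 10 + 1, sep = "_")
}

# Hypothetical helper: keep only the elements of `x` whose key also occurs in `y`.
keep_matching <- function(x, y) {
  x_keys <- vapply(x, interval_key, character(1))
  y_keys <- vapply(y, interval_key, character(1))
  x[x_keys %in% y_keys]
}

# Toy stand-ins for `jan` and `july`: one-row data frames, one per ID and interval.
make_piece <- function(id, start) {
  data.frame(ID = id, new = as.Date(start), stringsAsFactors = FALSE)
}
jan_toy <- list(
  make_piece("A", "2010-01-01"), make_piece("A", "2010-01-11"), make_piece("A", "2010-01-21"),
  make_piece("B", "2010-01-01"), make_piece("B", "2010-01-11"), make_piece("B", "2010-01-21")
)
july_toy <- list(
  make_piece("A", "2010-07-01"), make_piece("A", "2010-07-11"),
  make_piece("B", "2010-07-01"), make_piece("B", "2010-07-11")
)

# The third interval of each ID has no July counterpart, so it is dropped.
jan_trimmed <- keep_matching(jan_toy, july_toy)
length(jan_trimmed)  # 4
```

With the real data the call would presumably be keep_matching(jan, july). Matching on the interval's position within the month, rather than on the new date itself, is what lets a January element line up with a July one.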
