Check whether two values are the same on consecutive dates

Problem description

Suppose I have a data frame like this:

library(tibble)

df <- tribble(
  ~date,       ~place, ~wthr,
  #------------/-----/--------
  "2017-05-06","NY","sun",
  "2017-05-06","CA","cloud",
  "2017-05-07","NY","sun",
  "2017-05-07","CA","rain",
  "2017-05-08","NY","cloud",
  "2017-05-08","CA","rain",
  "2017-05-09","NY","cloud",
  "2017-05-09","CA",NA,
  "2017-05-10","NY","cloud",
  "2017-05-10","CA","rain"
)

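I want to check whether the weather at a given place on a given date is the same as on the previous day, and append a logical column to df, so that the result looks like this: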

tribble(
  ~date,       ~place, ~wthr, ~same,
  #------------/-----/------/------
  "2017-05-06","NY","sun",    NA,
  "2017-05-06","CA","cloud",  NA, 
  "2017-05-07","NY","sun",    TRUE,
  "2017-05-07","CA","rain",   FALSE,
  "2017-05-08","NY","cloud",  FALSE,
  "2017-05-08","CA","rain",   TRUE,
  "2017-05-09","NY","cloud",  TRUE,
  "2017-05-09","CA", NA,      NA,
  "2017-05-10","NY","cloud",  TRUE,
  "2017-05-10","CA","rain",   NA
)

Is there a good way to do this?

Tags: r, dataframe, dplyr, lubridate, tibble

Solution


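To get the logical column, check whether wthr equals the value in the previous row using lag, after grouping by place. I added arrange on date to make sure the rows are in chronological order. The first row of each group has no previous row, so same is NA there.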

library(dplyr)

df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = wthr == lag(wthr, default = NA))
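To see why some rows come out as NA, here is a minimal sketch of what lag does on a plain vector. The vector below is made up for illustration and is not part of the question's data.

```r
library(dplyr)

# Hypothetical vector, used only to illustrate how lag() shifts values
wthr <- c("sun", "sun", "cloud", NA, "cloud")

# lag() moves every value one position down and pads the front with NA
prev <- lag(wthr)

# == propagates NA: a comparison where either side is NA gives NA
same <- wthr == prev
same
# NA TRUE FALSE NA NA
```

This is why a row gets NA both when its own wthr is missing and when the previous day's wthr is missing.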

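Edit: If you want to make sure the dates are consecutive (1 day apart), you can include an ifelse that checks whether the difference between date and lag(date) is 1. If the two rows are not 1 day apart, the value is coded as NA.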

Note: also make sure your date column is of class Date, so that subtracting dates gives a difference in days.

df$date <- as.Date(df$date)

df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = ifelse(
    date - lag(date) == 1, 
    wthr == lag(wthr, default = NA),
    NA))

Output

   date       place wthr  same 
   <chr>      <chr> <chr> <lgl>
 1 2017-05-06 NY    sun   NA   
 2 2017-05-06 CA    cloud NA   
 3 2017-05-07 NY    sun   TRUE 
 4 2017-05-07 CA    rain  FALSE
 5 2017-05-08 NY    cloud FALSE
 6 2017-05-08 CA    rain  TRUE 
 7 2017-05-09 NY    cloud TRUE 
 8 2017-05-09 CA    NA    NA   
 9 2017-05-10 NY    cloud TRUE 
10 2017-05-10 CA    rain  NA   
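The sample data has no missing days, so the consecutive-date check never changes the result above. As a rough illustration, here is the same pipeline on a made-up data frame (gap_df is hypothetical, not from the question) in which NY has no row for 2017-05-08.

```r
library(dplyr)
library(tibble)

# Hypothetical data: NY has no row for 2017-05-08, leaving a 2-day gap
gap_df <- tribble(
  ~date,        ~place, ~wthr,
  "2017-05-06", "NY",   "sun",
  "2017-05-07", "NY",   "sun",
  "2017-05-09", "NY",   "sun"
)
gap_df$date <- as.Date(gap_df$date)

res <- gap_df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = ifelse(
    date - lag(date) == 1,   # TRUE only when the previous row is exactly 1 day earlier
    wthr == lag(wthr),
    NA)) %>%
  ungroup()

res$same
# NA TRUE NA
```

Without the ifelse, the last row would be TRUE, because its wthr matches the previous row even though that row is two days earlier.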
