Aggregating daily data across multiple locations into a single warning output, based on a tiered warning system

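Problem description

I am working with a large dataset that records daily data at multiple locations, and I would like to summarize it so that a single output gives the maximum warning level for each day (Red/Yellow/None).

Consider the following setup: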

location = c(rep("A", 4), rep("B", 4), rep("C", 4), rep("D",4) , rep("E", 4))
date = rep(c("19991230", "19991231", "20000101", "20000102"), 5)
warning = c("Red", "None", "None", "None", "Yellow", "None", "Red", "None", "Yellow", "Yellow", "None", "Yellow", "None", "None", "None", "None", "Yellow", "None", "None", "None")

data = data.frame(location, date, warning)

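I am trying to create a new column that shows "None" if no warning was issued at any location on a given date, "Yellow" if one or more "Yellow" warnings were issued (unless one or more "Red" warnings were issued on the same day), and "Red" whenever at least one "Red" warning was issued, since "Red" takes precedence.

I considered using aggregate by date, but I am not sure which function to apply. I also tried looping over each date to count the warnings that are not "None", to at least narrow things down, but without any luck. Perhaps I need to use ifelse and a for loop over the dates? My unsuccessful attempts are below: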

aggregate(data, by=date, FUN)

or

data <- data %>%
group_by(date) %>%
mutate(day_warning_type = case_when(
warning != "None" ~ TRUE, TRUE ~ FALSE
)) %>%
ungroup()

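I hope someone can at least point me in the right direction. I have not made much progress so far, because I am struggling to understand how to work with character variables.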

Tags: r, string, date, aggregate

Solution


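You are on the right track with group_by. It may be simpler to create a second dataset summarized by date and then merge it back into the main dataset. See below.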

library(dplyr)

# Summarize each date based on number of Yellow/Red/None warnings
data_sum <- data %>%
  group_by(date) %>%
  summarize(
    day_warning_none = length(which(warning == "None")),
    day_warning_yellow = length(which(warning == "Yellow")),
    day_warning_red = length(which(warning == "Red"))
  ) %>%
  ungroup() %>%
  # Create a summary  measure
  mutate(
    day_warning = case_when(
      day_warning_red > 0 ~ "Red",
      day_warning_yellow > 0 ~ "Yellow",
      TRUE ~ "None"
    )
  )

head(data_sum)
  date     day_warning_none day_warning_yellow day_warning_red day_warning
  <fct>               <int>              <int>           <int> <chr>      
1 19991230                1                  3               1 Red        
2 19991231                4                  1               0 Yellow     
3 20000101                4                  0               1 Red        
4 20000102                4                  1               0 Yellow    
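
Since the question mentions aggregate, the per-date summary can probably also be sketched in base R. This is a minimal sketch, not part of the original answer: it assumes the severity order None < Yellow < Red, and the names severity, warning_level and day_max are made up for illustration. The idea is to map each warning to a number, take the maximum per date, and map it back to a label.

```r
# Same example data as in the question
data <- data.frame(
  location = rep(c("A", "B", "C", "D", "E"), each = 4),
  date     = rep(c("19991230", "19991231", "20000101", "20000102"), 5),
  warning  = c("Red", "None", "None", "None",
               "Yellow", "None", "Red", "None",
               "Yellow", "Yellow", "None", "Yellow",
               "None", "None", "None", "None",
               "Yellow", "None", "None", "None"),
  stringsAsFactors = FALSE
)

# Assumed severity order, lowest to highest
severity <- c("None", "Yellow", "Red")

# Convert each warning to its position in the severity vector (1, 2 or 3)
data$warning_level <- match(data$warning, severity)

# Highest level per date, then translate the number back to a label
day_max <- aggregate(warning_level ~ date, data = data, FUN = max)
day_max$day_warning <- severity[day_max$warning_level]

day_max
```

For the example data this should give Red, Yellow, Red, Yellow for the four dates, matching the day_warning column above.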

# Merge back in
data2 <- left_join(data, data_sum, by = "date") %>%
  arrange(date)
head(data2, 10)

   location     date warning day_warning_none day_warning_yellow day_warning_red day_warning
1         A 19991230     Red                1                  3               1         Red
2         B 19991230  Yellow                1                  3               1         Red
3         C 19991230  Yellow                1                  3               1         Red
4         D 19991230    None                1                  3               1         Red
5         E 19991230  Yellow                1                  3               1         Red
6         A 19991231    None                4                  1               0      Yellow
7         B 19991231    None                4                  1               0      Yellow
8         C 19991231  Yellow                4                  1               0      Yellow
9         D 19991231    None                4                  1               0      Yellow
10        E 19991231    None                4                  1               0      Yellow
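
If the per-colour counts are not needed, the daily maximum can probably be added in a single grouped step, without a separate summary table and join. The sketch below is an alternative to the approach above, not the original answer's method. It assumes the severity order None < Yellow < Red and treats warning as an ordered factor, so that max() picks the most severe level within each date. The names severity and data3 are made up for illustration.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  location = rep(c("A", "B", "C", "D", "E"), each = 4),
  date     = rep(c("19991230", "19991231", "20000101", "20000102"), 5),
  warning  = c("Red", "None", "None", "None",
               "Yellow", "None", "Red", "None",
               "Yellow", "Yellow", "None", "Yellow",
               "None", "None", "None", "None",
               "Yellow", "None", "None", "None"),
  stringsAsFactors = FALSE
)

# Assumed severity order, lowest to highest
severity <- c("None", "Yellow", "Red")

data3 <- data %>%
  group_by(date) %>%
  mutate(
    # max() is defined for ordered factors, so it returns the most
    # severe warning seen on that date; the value is recycled per row
    day_warning = as.character(
      max(factor(warning, levels = severity, ordered = TRUE))
    )
  ) %>%
  ungroup() %>%
  arrange(date)

head(data3, 10)
```

The ordered factor keeps the precedence rule in one place: adding another level (for example "Orange") would only require extending severity.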
