Rolling average trend line for a stacked bar plot in R

Question

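I am trying to replicate the nytimes covid case barplot, but I want it to be a stacked barplot. My problem is that the 7-day rolling average trend line gets messed up by my stacking variable, state. Granted, this visualization is not ideal, but at this point I can't figure it out and it is driving me crazy. If you don't group by state and remove the state mapping (fill=state in the code below), the trend line works fine.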

library(dplyr)
library(readr)
library(ggplot2)
library(zoo)

data_url <- "http://covidtracking.com/api/states/daily.csv"
corona <- read_csv(data_url)
corona <- corona %>%
  mutate(date = lubridate::parse_date_time(date, "ymd"))

total <- corona %>%
  group_by(date, state) %>%
  summarise_at(vars(positiveIncrease), sum) %>%
  mutate(seven_avg = rollmean(positiveIncrease, 7,
                              align = "left",
                              fill = 0))

ggplot(total, aes(x = date,
                  y = positiveIncrease, fill = state)) +
  geom_col() +
  geom_line(aes(y = seven_avg),
            color = "red",
            size = .75)


Tags: r, ggplot2, moving-average

Answer


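The issue is that geom_line does not aggregate your data for you. The state-level data has one row per state for every date, so you get a line that connects all of the state-level observations instead of a single overall trend line.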
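To make that concrete, here is a minimal sketch on made-up toy data (the states, dates, and counts are hypothetical, not taken from the covidtracking file). It only counts rows, but it shows why a single line through state-level data has several y values to visit at each date, while the aggregated table has exactly one.

```r
library(dplyr)

# Hypothetical toy data: two states over three days (values are made up)
toy <- tibble(
  date  = rep(as.Date("2020-04-01") + 0:2, each = 2),
  state = rep(c("NY", "CA"), times = 3),
  positiveIncrease = c(10, 4, 12, 6, 14, 8)
)

# State-level data: one row per state per date, so a line drawn through it
# has to pass through several y values at every x position
rows_per_date <- count(toy, date)

# Aggregated data: summing over states leaves exactly one y value per date
toy_overall <- toy %>%
  group_by(date) %>%
  summarise(positiveIncrease = sum(positiveIncrease))

print(rows_per_date)
print(toy_overall)
```

The same idea carries over to the real data: the bars can stay at state level, while the line needs a table with one row per date.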

The simple fix is to compute the trend line from an aggregated dataset and pass that to geom_line:

library(dplyr)
library(readr)
library(ggplot2)
library(zoo)

data_url <- "http://covidtracking.com/api/states/daily.csv"
corona <- read_csv(data_url)
corona <- corona %>%
  mutate(date = lubridate::parse_date_time(date, "ymd"))

# State-level daily totals: these drive the stacked bars
total <- corona %>%
  group_by(date, state) %>%
  summarise_at(vars(positiveIncrease), sum)

# National daily totals: one row per date, so the rolling mean runs over time
overall <- total %>%
  group_by(date) %>%
  summarise_at(vars(positiveIncrease), sum) %>%
  mutate(seven_avg = rollmean(positiveIncrease, 7,
                              align = "left",
                              fill = 0
  ))

ggplot(total, aes(
  x = date,
  y = positiveIncrease
)) +
  # fill (rather than color) shades each state's segment of the stacked bar
  geom_col(aes(fill = state)) +
  # the line layer gets its own aggregated data, so it is not split by state
  geom_line(data = overall, aes(y = seven_avg),
    color = "red",
    size = .75 # ggplot2 >= 3.4.0 prefers linewidth for lines
  )
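One detail of the rollmean call may be worth a second look. With align = "left", each value averages the current day and the six days after it, and fill = 0 pads the last six positions with zeros, which pulls the end of the line down to the axis. The sketch below compares that with a trailing window (align = "right", fill = NA) on a made-up vector; whether a trailing window is what you want for this chart is an assumption on my part.

```r
library(zoo)

# Hypothetical daily counts for ten consecutive days
x <- 1:10

# Forward-looking window, padded with zeros at the end of the series
left <- rollmean(x, 7, align = "left", fill = 0)
# -> 4 5 6 7 0 0 0 0 0 0

# Trailing window (current day plus the six before it), padded with NA
right <- rollmean(x, 7, align = "right", fill = NA)
# -> NA NA NA NA NA NA 4 5 6 7

print(left)
print(right)
```

With fill = NA, geom_line drops the first six points (with a warning about removed rows) instead of drawing them at zero.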

