Plotting the proportion change between days with ggplot

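Problem description

I want to plot the per-day proportion of the datetime column using a histogram. For example, if the first day has 6 counts, the second has 3, and the third has 7, I want to plot the proportional (or percentage) change from the first day to the second and from the second day to the third, and do the same for the rest of the data.

Code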

library(ggplot2)

data <- read.table("input.csv", sep = ",", header = TRUE)
data$datetime <- as.Date(data$datetime)
ggplot(data, aes(x=datetime)) +
  geom_histogram(binwidth=0.5, colour="black", fill="white")   +
  stat_bin(aes(y=..count..+1,
               label=ifelse(..count..!=0, ..count.., NA)), geom='text', binwidth = 0.5, size=3)+ 
  #scale_x_date(date_minor_breaks = "1 day")+
  scale_x_date(date_breaks = "1 day",  date_labels = "%b-%d-%y")+ 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1, size=6))

input.csv

index,datetime,value,type
461,2020-03-03 00:00:00,1.9942995846439968,x
462,2020-03-03 01:00:00,2.1268067887438273,x
463,2020-03-03 02:00:00,2.465004647476598,x
464,2020-03-03 04:00:00,2.6925364129228964,x
465,2020-03-03 10:00:00,2.9067051924252225,x
466,2020-03-03 23:00:00,3.15486048056035,x
467,2020-03-04 04:00:00,3.129483871690328,x
468,2020-03-04 05:00:00,2.9299302120270583,x
469,2020-03-04 07:00:00,2.8233925583949744,x
470,2020-03-05 02:00:00,2.7136509773224926,x
471,2020-03-05 03:00:00,2.414295826379634,x
472,2020-03-05 04:00:00,2.3617177577192523,x
473,2020-03-05 05:00:00,2.3603488433328494,x
474,2020-03-05 06:00:00,2.3820833128692214,x
475,2020-03-05 17:00:00,2.376124347303893,x
476,2020-03-05 18:00:00,2.4256585822020846,x
477,2020-03-06 03:00:00,2.363671952946105,x
478,2020-03-06 05:00:00,2.431267806961426,x
479,2020-03-06 06:00:00,2.5549387862153146,x
480,2020-03-06 07:00:00,2.607673788605378,x
481,2020-03-06 14:00:00,2.670112987652902,x
482,2020-03-06 16:00:00,2.9147875278302138,x

Tags: r, datetime, ggplot2

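Solution

It is often easiest to do as much of the processing as possible before making the plot. Here, I calculate the number of cases per day and the change between days, then plot that. Because I have pre-calculated the counts, I can use geom_col instead of geom_histogram.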
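To make the "change between days" step concrete, here is a minimal base-R sketch using the example counts from the question (6, 3, 7). The variable names `prev`, `ratio`, and `pct_change` are illustrative only and are not part of the original answer. It shows two common readings of "change": the ratio to the previous day, and the percentage change from the previous day.

```r
# Example counts from the question: day 1 = 6, day 2 = 3, day 3 = 7
n <- c(6, 3, 7)

# Previous day's count; the first day has no predecessor, so it is NA
prev <- c(NA, head(n, -1))

# Ratio of each day's count to the previous day's count
ratio <- n / prev

# Percentage change relative to the previous day
pct_change <- (n - prev) / prev * 100

round(ratio, 2)       # NA, 0.5, 2.33
round(pct_change, 1)  # NA, -50, 133.3
```

Which of the two to plot depends on how the labels should read: a ratio of 0.5 and a change of -50% describe the same drop from 6 to 3.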

library(tidyverse)
library(lubridate)

dat <- read_csv("index,datetime,value,type
461,2020-03-03 00:00:00,1.9942995846439968,x
462,2020-03-03 01:00:00,2.1268067887438273,x
463,2020-03-03 02:00:00,2.465004647476598,x
464,2020-03-03 04:00:00,2.6925364129228964,x
465,2020-03-03 10:00:00,2.9067051924252225,x
466,2020-03-03 23:00:00,3.15486048056035,x
467,2020-03-04 04:00:00,3.129483871690328,x
468,2020-03-04 05:00:00,2.9299302120270583,x
469,2020-03-04 07:00:00,2.8233925583949744,x
470,2020-03-05 02:00:00,2.7136509773224926,x
471,2020-03-05 03:00:00,2.414295826379634,x
472,2020-03-05 04:00:00,2.3617177577192523,x
473,2020-03-05 05:00:00,2.3603488433328494,x
474,2020-03-05 06:00:00,2.3820833128692214,x
475,2020-03-05 17:00:00,2.376124347303893,x
476,2020-03-05 18:00:00,2.4256585822020846,x
477,2020-03-06 03:00:00,2.363671952946105,x
478,2020-03-06 05:00:00,2.431267806961426,x
479,2020-03-06 06:00:00,2.5549387862153146,x
480,2020-03-06 07:00:00,2.607673788605378,x
481,2020-03-06 14:00:00,2.670112987652902,x
482,2020-03-06 16:00:00,2.9147875278302138,x")

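dat2 <- dat %>% 
  mutate(date = as.Date(datetime)) %>%   # drop the time of day, keep the date
  group_by(date) %>% 
  summarise(n = n()) %>%                 # number of rows per day
  mutate(prop = n/lag(n))                # ratio of each day's count to the previous day's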
#> `summarise()` ungrouping output (override with `.groups` argument)

ggplot(dat2, aes(x = date, y = n, label = round(prop, 2))) +
  geom_col()   +
  geom_text(nudge_y = 0.1) + 
  scale_x_date(date_breaks = "1 day",  date_labels = "%b-%d-%y")+ 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1, size=6))
#> Warning: Removed 1 rows containing missing values (geom_text).
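
If the labels should show a percentage change rather than a ratio, one possible variation is sketched below. It is not part of the original answer: the `daily` table is typed in by hand to match the per-day row counts of the sample data (6, 3, 7, 6), and the names `daily_pct`, `pct_change`, and `label` are assumptions chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Daily counts matching the sample data above (6, 3, 7 and 6 rows per day)
daily <- tibble(
  date = as.Date(c("2020-03-03", "2020-03-04", "2020-03-05", "2020-03-06")),
  n = c(6, 3, 7, 6)
)

daily_pct <- daily %>%
  mutate(
    # Percentage change relative to the previous day (NA for the first day)
    pct_change = (n - lag(n)) / lag(n) * 100,
    # Signed percentage label; the first day gets an empty string
    label = ifelse(is.na(pct_change), "", sprintf("%+.0f%%", pct_change))
  )

p <- ggplot(daily_pct, aes(x = date, y = n, label = label)) +
  geom_col() +
  geom_text(vjust = -0.5) +   # place each label just above its bar
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +  # headroom for labels
  scale_x_date(date_breaks = "1 day", date_labels = "%b-%d-%y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 6))

p
```

Using an empty string for the first day, instead of NA, should also avoid the "Removed 1 rows containing missing values" warning, since every row then has a label to draw.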

Created on 2020-07-22 by the reprex package (v0.3.0)

