Error when plotting a chart for specific dates in R

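Question

Can you help me understand why I can't generate a chart for 02/07 and 03/07, but can for 01/07? I've included executable code below.

Note that my dmda is set to 01/07, and the chart is generated. When I set dmda to 02/07 or 03/07, an error occurs when the mod variable is created, and I don't understand why.

Thanks!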

library(dplyr)
library(lubridate)
library(tidyverse)

df1 <- structure(
  list(date1 = c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28"),
       date2 = c("2021-04-02","2021-04-03","2021-04-08","2021-04-09","2021-04-10","2021-07-01","2021-07-02","2021-07-03"),
       Week= c("Friday","Saturday","Thursday","Friday","Saturday","Thursday","Friday","Monday"),
       DR01 = c(14,11,14,13,13,14,0,0), DR02= c(14,12,16,17,13,12,0,0),DR03= c(19,15,14,13,13,12,0,0),
       DR04 = c(15,14,13,13,16,12,13,0),DR05 = c(15,14,15,13,16,12,13,11),
       DR06 = c(21,14,13,13,15,16,13,11),DR07 = c(12,15,14,14,19,14,13,11)),
  class = "data.frame", row.names = c(NA, -8L))

dmda<-"2021-07-01"

datas<-df1 %>%
  filter(date2 == ymd(dmda)) %>%
  summarize(across(starts_with("DR"), sum)) %>%
  pivot_longer(everything(), names_pattern = "DR(.+)", values_to = "val") %>%
  mutate(name = as.numeric(name))
colnames(datas)<-c("Days","Numbers")

dif <- as.Date(dmda) - as.Date(df1$date1[1]) + 1
datas <- datas[dif:max(datas$Days, na.rm = TRUE),]

plot(Numbers ~ Days, xlim=c(0,8), ylim=c(0,20), data = datas,xaxs='i')
mod <- nls(Numbers ~ b1*Days^2+b2,start = list(b1 = 0,b2 = 0), data = datas)
new.data <- data.frame(Days = with(datas, seq(min(Days),max(Days),len = 45)))
new.data <- rbind(0, new.data)
lines(new.data$Days, predict(mod, newdata=new.data))
points(0, coef(mod)[2], col="red", pch=19, cex=1.2, xpd=TRUE)


Tags: r

Answer


Let's tackle your problem, but in a slightly different way.
First, load the libraries we need and your data.

library(lubridate)
library(tidyverse)
df1 <- structure(
  list(date1 = c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28"),
       date2 = c("2021-04-02","2021-04-03","2021-04-08","2021-04-09","2021-04-10","2021-07-01","2021-07-02","2021-07-03"),
       Week= c("Friday","Saturday","Thursday","Friday","Saturday","Thursday","Friday","Monday"),
       DR01 = c(14,11,14,13,13,14,0,0), DR02= c(14,12,16,17,13,12,0,0),DR03= c(19,15,14,13,13,12,0,0),
       DR04 = c(15,14,13,13,16,12,13,0),DR05 = c(15,14,15,13,16,12,13,11),
       DR06 = c(21,14,13,13,15,16,13,11),DR07 = c(12,15,14,14,19,14,13,11)),
  class = "data.frame", row.names = c(NA, -8L))

Now we will transform the data, but a little differently from how you did it.

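fDays = function(data, date2){
  # Whole days from date1 to date2 (negative when date2 precedes date1)
  firstDay = interval(data$date1[1], date2) %>% as.duration() %/% ddays(1)
  # One row per DR column, numbered 1..7; keep only the days after firstDay
  data %>% pivot_longer(starts_with("DR"), values_to = "Numbers") %>% 
    mutate(Days = 1:nrow(.)) %>% 
    filter(Days>firstDay) %>% 
    select(Days, Numbers)
}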

df2 = df1 %>% as_tibble() %>% 
  mutate(
    date1 = date1 %>% ymd(),
    date2 = date2 %>% ymd()    
  ) %>% group_by(date2) %>% 
  nest() %>% 
  group_modify(~fDays(.x$data[[1]], .y$date2)) %>% 
  nest()

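This needs some comments. First, note that I converted your variables date1 and date2 to the Date type (date1 = date1 %>% ymd()). Then I collapsed the data with the nest function.
After this operation we have something like this: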

# A tibble: 8 x 2
# Groups:   date2 [8]
  date2      data            
  <date>     <list>          
1 2021-04-02 <tibble [1 x 9]>
2 2021-04-03 <tibble [1 x 9]>
3 2021-04-08 <tibble [1 x 9]>
4 2021-04-09 <tibble [1 x 9]>
5 2021-04-10 <tibble [1 x 9]>
6 2021-07-01 <tibble [1 x 9]>
7 2021-07-02 <tibble [1 x 9]>
8 2021-07-03 <tibble [1 x 9]>
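As a small illustration of what nest does here, the sketch below uses a hypothetical toy tibble (the names toy, g and x are made up for this example and are not part of the original data). Each group collapses into one row, with the remaining columns stored as a tibble in the list column data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two groups, three rows
toy <- tibble(g = c("a", "a", "b"), x = 1:3)

# One row per group; the other columns move into the list column `data`
nested <- toy %>% group_by(g) %>% nest()
nested

# The rows that belonged to g == "a", without the grouping column
nested$data[[1]]

# unnest() reverses the operation
unnest(nested, data)
```

The same pattern explains the [1 x 9] entries above: each date2 value occurs once, so each nested tibble holds a single row with the nine remaining columns.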

The value of the data variable for the observation with date 2021-07-01 looks like this:

# A tibble: 1 x 9
  date1      Week      DR01  DR02  DR03  DR04  DR05  DR06  DR07
  <date>     <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-06-28 Thursday    14    12    12    12    12    16    14

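Then we apply the fDays function. Here, however, we need to think a little longer about the function's logic.
As I noticed in your program, you compute dif. In my program the equivalent is firstDay+1. The problem is that for date1="2021-06-28" and date2="2021-04-02" this value is -86. Unfortunately, I don't know what should be done in that case. I simply assumed that for negative values fDays returns all days with their Numbers.
So in the end my df2 tibble looks like this: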

# A tibble: 8 x 2
# Groups:   date2 [8]
  date2      data            
  <date>     <list>          
1 2021-04-02 <tibble [7 x 2]>
2 2021-04-03 <tibble [7 x 2]>
3 2021-04-08 <tibble [7 x 2]>
4 2021-04-09 <tibble [7 x 2]>
5 2021-04-10 <tibble [7 x 2]>
6 2021-07-01 <tibble [4 x 2]>
7 2021-07-02 <tibble [3 x 2]>
8 2021-07-03 <tibble [2 x 2]>
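The offsets discussed above can be checked with base R alone. This is a minimal sketch: the dates come from df1, dif follows the formula in the question, and n_kept is a helper name introduced only for this illustration.

```r
# date1 and a few date2 values taken from df1
date1 <- as.Date("2021-06-28")
date2 <- as.Date(c("2021-04-02", "2021-07-01", "2021-07-02", "2021-07-03"))

# dif as computed in the question: days since date1, plus one
dif <- as.numeric(date2 - date1) + 1
dif
# -86   4   5   6

# Number of DR columns (out of 7) left when only Days >= dif are kept
n_kept <- pmin(7, 7 - dif + 1)
data.frame(date2, dif, n_kept)
```

The n_kept values (7, 4, 3, 2) match the row counts of the nested tibbles listed above, which shows that only a handful of points remain for the July dates.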

For the date "2021-04-02" it looks like this:

# A tibble: 7 x 2
   Days Numbers
  <int>   <dbl>
1     1      14
2     2      14
3     3      19
4     4      15
5     5      15
6     6      21
7     7      12

For "2021-07-02" it looks like this:

# A tibble: 3 x 2
   Days Numbers
  <int>   <dbl>
1     5      13
2     6      13
3     7      13

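Now all we have to do is create the plots. But here we run into problems again. First, we need to ask whether it makes sense to fit an nls model to 2 or 3 observations. Second, can we draw the prediction line all the way to Days = 0? I assume you have thought about the answers to these questions.
So it's time to prepare the plots.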

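fPlot = function(data, date2){
  # Base plot: the observed points only
  p = data %>% ggplot(aes(Days, Numbers))+
    geom_point()+
    xlim(0, max(data$Days))+
    ggtitle(date2)
  tryCatch(
    {
      # Fit the quadratic model; nls() may fail for some subsets
      mod <- nls(Numbers ~ b1*Days^2+b2, data, list(b1=0,b2=0))
      # Prediction grid, with Days = 0 prepended
      df = tibble(
        Days = c(0,seq(min(data$Days), max(data$Days), length.out = 45)),
        Numbers = predict(mod, newdata=tibble(Days=Days))
      )
      # Fitted curve, plus the predicted value at Days = 0 in red
      p + geom_line(data=df)+
        geom_point(data=df[1,], size = 3, color = "red")
    }, error = function(msg){p}  # on error, return the points-only plot
  )
}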

df2 %>% group_map(~fPlot(.x$data[[1]], .y$date2))

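Note, however, that the nls function may return an error because of particular values or a small number of observations. That is why it is called inside tryCatch.
When an error occurs, the plot contains only the points. When everything goes well, a prediction line appears on the chart.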
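To make the failure concrete, here is a hedged sketch using the 2021-07-02 subset (Days 5 to 7, every value 13). The model reproduces these values exactly, and the R documentation for nls warns against using it on such "zero-residual" data, so the fit is expected to stop with an error rather than converge. Because the model is linear in b1 and b2, lm is one possible alternative; that substitution is a suggestion made here, not part of the original answer.

```r
# Subset for 2021-07-02: every value is 13
d <- data.frame(Days = c(5, 6, 7), Numbers = c(13, 13, 13))

# Wrap nls() in tryCatch so the script keeps running if the fit fails
fit_nls <- tryCatch(
  nls(Numbers ~ b1 * Days^2 + b2, data = d, start = list(b1 = 0, b2 = 0)),
  error = function(e) conditionMessage(e)
)
fit_nls  # on this data, expected to be an error message rather than a model

# The model is linear in b1 and b2, so lm() can fit the same curve
fit_lm <- lm(Numbers ~ I(Days^2), data = d)
coef(fit_lm)  # the intercept plays the role of b2, the I(Days^2) term that of b1
```

The same reasoning applies to 2021-07-03, where only two points remain for two parameters, so any fit passes through both points exactly.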
Here are some selected plots (images not reproduced).

