首页 > 解决方案 > R中两个变量相关的计算

问题描述

我想在某个问题上得到一些帮助,但我将举一个例子,因为我相信它更容易解释。

总体思路是根据我选择的日期制作一张不同的表格。在表中将添加来自 PV 变量的数据和来自我的数据库 df 的 DR 数据。我将在下面更好地解释它:

如您所见,我有date1并且date2在我的数据库中。Date1 将始终保持不变,在本例中为 28/06。例如,在date228/06 之后的日子里,我们有 01/07(星期四)、02/07(星期六)和 03/07(星期一)。

我的想法是制作一个函数或类似的东西,例如,如果我选择 01/07,它将进行以下计算:01/07 – 28/06 = 3 天。我将从我的 df 数据库中的 DR 量中减去这个值,在本例中为 7,即 7 – 3 = 4。这个 4 指的是我将为我的表考虑的 DR 量。这些 DR 将始终是向后的,即它以 DR06 而不是 DR01 开头。每个日期的示例将是这些值:

所以 01/07 的表格将是: 在此处输入图像描述

如果是 02/07 日,则为 02/07 – 28/06 = 4 天。这个值是从我的数据库中的 DR 数量中减去的,即 7 – 4 = 3。因此,3 将是我将为我的表考虑的 DR 数量。 在此处输入图像描述

如果是 03/07 日,则为 03/07 – 28/06 = 5 天。这个值是从 DR 的数量中减去的,即 7 – 5 = 2。所以 2 将是我要为我的表考虑的 DR 数量 在此处输入图像描述

非常感谢!

下面的代码:

library(dplyr)

df <- structure(
  list(Id=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1),
       date1 = c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28"),
       date2 = c("2021-07-01","2021-07-01","2021-07-01","2021-07-01","2021-04-02",
                 "2021-04-02","2021-04-02","2021-04-02","2021-04-02","2021-04-02","2021-04-03",
                 "2021-04-03","2021-04-03","2021-04-03","2021-04-03","2021-04-08","2021-04-08",
                 "2021-04-09","2021-04-09","2021-04-10","2021-04-10","2021-07-02","2021-07-02",
                 "2021-07-02","2021-07-03","2021-07-03"),
       Week= c("Thursday","Thursday","Thursday","Thursday","Friday","Friday","Friday","Friday",
               "Friday","Friday","Saturday","Saturday","Saturday","Saturday","Saturday","Thursday",
               "Thursday","Friday","Friday","Friday","Friday","Friday","Friday","Friday","Monday",
               "Monday"),
       DTPE = c("Ho","Ho","Ho","Ho","","","","","","","","","","","","","","","","Ho","Ho","","","","",""),
       D1 = c(8,1,9, 3,5,4,7,6,3,8,2,3,4,6,7,8,4,2,6,2,3,4,3,2,4,8), DR01 = c(4,1,4,3,3,4,3,6,3,7,2,3,4,6,7,8,4,2,6,7,3,3,4,5,6,4),
       DR02= c(4,2,6,7,3,2,7,4,2,1,2,3,4,6,7,8,4,2,6,4,3,4,3,2,4,3),DR03 = c(9,5,4,3,3,2,1,5,3,7,2,3,4,7,7,8,4,2,6,4,3,4,3,4,5,4),
       DR04 = c(5,4,3,3,6,2,1,9,3,7,2,3,4,7,7,8,4,2,6,4,3,3,4,5,3,4),DR05 = c(5,4,5,3,6,2,1,9,3,7,5,3,4,3,4,5,6,2,6,4,3,3,4,5,3,4),
       DR06 = c(2,4,3,3,5,6,7,8,3,7,2,3,4,7,7,8,4,2,6,4,3,3,4,5,3,4),DR07 = c(2,5,4,4,9,4,7,8,3,7,2,3,4,7,7,8,4,2,6,4,3,3,4,5,3,4)),
  class = "data.frame", row.names = c(NA, -26L))


df<-df %>%
  group_by(date2, Week) %>%
  select(D1:DR07) %>%
  summarise_all(sum)

df<-data.frame(df)  
df
       date2     Week D1 DR01 DR02 DR03 DR04 DR05 DR06 DR07
1 2021-04-02   Friday 33   26   19   21   28   28   36   38
2 2021-04-03 Saturday 22   22   22   23   23   19   23   23
3 2021-04-08 Thursday 12   12   12   12   12   11   12   12
4 2021-04-09   Friday  8    8    8    8    8    8    8    8
5 2021-04-10   Friday  5   10    7    7    7    7    7    7
6 2021-07-01 Thursday 21   12   19   21   15   17   12   15
7 2021-07-02   Friday  9   12    9   11   12   12   12   12
8 2021-07-03   Monday 12   10    7    9    7    7    7    7

x<-subset(df, select = DR01:DR07)
x<-cbind(df, setNames(df$D1 - x, paste0(names(x), "_PV")))
PV<-select(x, date2,Week, D1, ends_with("PV"))

PV<-PV %>%
  group_by(Week) %>%
  summarize(across(ends_with("PV"), median))

PV
# A tibble: 4 x 8
  Week     DR01_PV DR02_PV DR03_PV DR04_PV DR05_PV DR06_PV DR07_PV
  <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 Friday      -1.5       0      -1      -1    -1      -2.5    -2.5
2 Monday       2         5       3       5     5       5       5  
3 Saturday     0         0      -1      -1     3      -1      -1  
4 Thursday     4.5       1       0       3     2.5     4.5     3  

标签: r

解决方案


df我认为如果你重塑这两个表并PV形成更长的形式,然后加入它们,这会更容易做到。这样,您将在自己的行中拥有每对相应的日列,您只需添加这两个值,然后再重新调整宽度。

library(tidyverse)
df %>%
  pivot_longer(-c(date2:Week)) %>%
  left_join(PV %>% rename_with( ~str_remove(., "_PV")) %>% 
              pivot_longer(-Week, values_to = "PV")) %>%
  mutate(new_value = value + PV) %>%
  select(-c(value:PV)) %>%
  pivot_wider(names_from = name, values_from = new_value)

结果

  date2      Week        D1  DR01  DR02  DR03  DR04  DR05  DR06  DR07
  <chr>      <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-04-02 Friday      NA  24.5    19    20    27  27    33.5  35.5
2 2021-04-03 Saturday    NA  22      22    22    22  22    22    22  
3 2021-04-08 Thursday    NA  16.5    13    12    15  13.5  16.5  15  
4 2021-04-09 Friday      NA   6.5     8     7     7   7     5.5   5.5
5 2021-04-10 Friday      NA   8.5     7     6     6   6     4.5   4.5
6 2021-07-01 Thursday    NA  16.5    20    21    18  19.5  16.5  18  
7 2021-07-02 Friday      NA  10.5     9    10    11  11     9.5   9.5
8 2021-07-03 Monday      NA  12      12    12    12  12    12    12  

推荐阅读