首页 > 解决方案 > 从不同的行中减去值

问题描述

我一直在尝试将我的计算从 excel 转移到 R,只是想知道是否有一种方法可以复制 IF(有一个步骤)。

所以我的数据在下面,我用简单的公式在 excel 中得到的结果在 DIFF 列 ( =IF(A2=A3, (C2-B3) * 24, 0))

NO  T_DATE              L_DATE              DIFF
AAA 10/08/2019 17:02:00 10/08/2019 20:35:00 5.83
AAA 10/08/2019 14:45:00 10/08/2019 15:10:00 11.78
AAA 10/08/2019 03:23:00 10/08/2019 10:25:00 17.32
AAA 09/08/2019 17:06:00 10/08/2019 01:11:00 25.70
AAA 08/08/2019 23:29:00 09/08/2019 10:27:00 0
BBB 08/08/2019 09:34:00 08/08/2019 21:19:00 22.23
BBB 07/08/2019 23:05:00 08/08/2019 06:09:00 18.03
BBB 07/08/2019 12:07:00 07/08/2019 20:25:00 22.32
BBB 06/08/2019 22:06:00 07/08/2019 08:53:00 22.77
BBB 06/08/2019 10:07:00 06/08/2019 19:44:00 0

我一直在尝试 R,但运气不佳。获取数据框的代码如下:

library(data.table)
library(lubridate)

NO <- c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB")
T_DATE <- c( "10/08/2019 17:02:00",  "10/08/2019 14:45:00", "10/08/2019 03:23:00",  "09/08/2019 17:06:00", "08/08/2019 23:29:00",  "08/08/2019 09:34:00", "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00", "06/08/2019 10:07:00")

L_DATE <- c( "10/08/2019 20:35:00", "10/08/2019 15:10:00","10/08/2019 10:25:00", "10/08/2019 01:11:00","09/08/2019 10:27:00", "08/08/2019 21:19:00","08/08/2019 06:09:00","07/08/2019 20:25:00", "07/08/2019 08:53:00", "06/08/2019 19:44:00")

df <- data.frame(NO, T_DATE, L_DATE)

rm(DIFF,L_DATE,NO,T_DATE)

我不知道如何添加使用第一行的 L_DATE 和随后的 T_Date(第 2 行)的步骤,所以第一个计算是10/08/2019 20:35:00 - 10/08/2019 14:45:00两个 NO 是否相同。

标签: r

解决方案


抱歉,将您的公式误读为具有C2-C3,而不是C2-B3。下面稍作修正以纠正这一点。

library(dplyr)
library(lubridate)

df <- data.frame(
  NO = c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB"),
  T_DATE = dmy_hms(c( "10/08/2019 17:02:00",  "10/08/2019 14:45:00", "10/08/2019 03:23:00",  "09/08/2019 17:06:00", "08/08/2019 23:29:00",  "08/08/2019 09:34:00", "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00", "06/08/2019 10:07:00")),
  L_DATE = dmy_hms(c( "10/08/2019 20:35:00", "10/08/2019 15:10:00","10/08/2019 10:25:00", "10/08/2019 01:11:00","09/08/2019 10:27:00", "08/08/2019 21:19:00","08/08/2019 06:09:00","07/08/2019 20:25:00", "07/08/2019 08:53:00", "06/08/2019 19:44:00"))
)

df %>% 
  group_by(NO) %>% 
  mutate(DIFF = difftime(L_DATE, lead(T_DATE), units = "hours"))
#> # A tibble: 10 x 4
#> # Groups:   NO [2]
#>    NO    T_DATE              L_DATE              DIFF           
#>    <fct> <dttm>              <dttm>              <drtn>         
#>  1 AAA   2019-08-10 17:02:00 2019-08-10 20:35:00  5.833333 hours
#>  2 AAA   2019-08-10 14:45:00 2019-08-10 15:10:00 11.783333 hours
#>  3 AAA   2019-08-10 03:23:00 2019-08-10 10:25:00 17.316667 hours
#>  4 AAA   2019-08-09 17:06:00 2019-08-10 01:11:00 25.700000 hours
#>  5 AAA   2019-08-08 23:29:00 2019-08-09 10:27:00        NA hours
#>  6 BBB   2019-08-08 09:34:00 2019-08-08 21:19:00 22.233333 hours
#>  7 BBB   2019-08-07 23:05:00 2019-08-08 06:09:00 18.033333 hours
#>  8 BBB   2019-08-07 12:07:00 2019-08-07 20:25:00 22.316667 hours
#>  9 BBB   2019-08-06 22:06:00 2019-08-07 08:53:00 22.766667 hours
#> 10 BBB   2019-08-06 10:07:00 2019-08-06 19:44:00        NA hours

推荐阅读