Counting dates by group in R?

Problem description

I have something like this:

df<-data.frame(group=c(1, 1, 1, 1, 2, 2, 2 , 2, 2), 
               date=c("2001-01-01 00:00:00", "2001-01-01 00:00:00", "2001-01-04 07:07:40", "2001-01-04 07:07:40", "2001-01-09 00:00:00",
                               "2001-01-09 00:00:00", "2001-01-11 13:00:00", "2001-01-11 13:00:00", "2001-01-12 13:00:00"),
               want=c(1,1,2,2,1,1,2,2,3))

library(dplyr)  # provides %>% and mutate()
df <- df %>% mutate(date = as.POSIXct(date))

  group                date want
1     1 2001-01-01 00:00:00    1
2     1 2001-01-01 00:00:00    1
3     1 2001-01-04 07:07:40    2
4     1 2001-01-04 07:07:40    2
5     2 2001-01-09 00:00:00    1
6     2 2001-01-09 00:00:00    1
7     2 2001-01-11 13:00:00    2
8     2 2001-01-11 13:00:00    2
9     2 2001-01-12 13:00:00    3

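I want to number the dates sequentially within each group, but without dropping the duplicate rows (i.e. without calling distinct first).

Thanks

Tags: r

Solution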


We can convert the date-time to Date and then use match against the unique dates within each group:

library(dplyr)
df %>% 
   group_by(group) %>% 
   mutate(want = match(as.Date(date),unique(as.Date(date))))
# A tibble: 9 x 3
# Groups:   group [2]
#  group date                 want
#  <dbl> <dttm>              <int>
#1     1 2001-01-01 00:00:00     1
#2     1 2001-01-01 00:00:00     1
#3     1 2001-01-04 07:07:40     2
#4     1 2001-01-04 07:07:40     2
#5     2 2001-01-09 00:00:00     1
#6     2 2001-01-09 00:00:00     1
#7     2 2001-01-11 13:00:00     2
#8     2 2001-01-11 13:00:00     2
#9     2 2001-01-12 13:00:00     3
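
For readers without dplyr, the same match/unique idea can be sketched in base R with ave(). This is not part of the original answer, only a minimal sketch. It assumes the timestamps are meant to be read in UTC (the question does not say), and it passes the time zone explicitly so that the calendar day does not depend on the session time zone.

```r
# Rebuild the example data from the question.
# tz = "UTC" is an assumption; the question does not state a time zone.
df <- data.frame(
  group = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  date  = as.POSIXct(
    c("2001-01-01 00:00:00", "2001-01-01 00:00:00",
      "2001-01-04 07:07:40", "2001-01-04 07:07:40",
      "2001-01-09 00:00:00", "2001-01-09 00:00:00",
      "2001-01-11 13:00:00", "2001-01-11 13:00:00",
      "2001-01-12 13:00:00"),
    tz = "UTC"
  )
)

# Calendar day of each timestamp, stored as a plain number so ave() can split it
day <- as.numeric(as.Date(df$date, tz = "UTC"))

# Within each group, number the days in order of first appearance;
# duplicate rows are kept and simply share the same index
df$want <- as.integer(
  ave(day, df$group, FUN = function(x) match(x, unique(x)))
)

print(df)
```

All nine rows are kept, and rows that fall on the same day within a group receive the same index.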

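Or convert it to a factor and coerce that to integer. Factor levels are sorted, so this numbers the dates in chronological order rather than in order of first appearance; the two agree when each group is already sorted by date, as it is here: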

df %>%
  group_by(group) %>%
  mutate(want = as.integer(factor(as.Date(date))))
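
The two approaches only coincide when the dates within a group are already in ascending order. The following base R sketch illustrates the difference on a small unsorted vector; the vector d is hypothetical and not taken from the question.

```r
# Hypothetical unsorted dates (not from the question)
d <- as.Date(c("2001-01-04", "2001-01-01", "2001-01-04", "2001-01-02"))

# match() against unique(): index by order of first appearance
by_appearance <- match(d, unique(d))

# factor(): levels are sorted, so the index follows chronological order
by_sorted <- as.integer(factor(d))

print(by_appearance)  # 1 2 1 3
print(by_sorted)      # 3 1 3 2
```

If chronological numbering is wanted regardless of row order, the factor version (or sorting by date first) is the safer choice; if numbering should follow the order in which dates appear, use match with unique.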
