Reverse-engineering cumulative counts into daily data?

Problem description

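I have a data frame with date data and cumulative counts. I am trying to do a reverse cumsum to get the daily counts, and to do so separately for each group. I am trying to go from Dataframe A to Dataframe B. I am using R and tidyr.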

Here is the code:


df <- data.frame(cum_count = c(5, 14, 50, 5, 14, 50),
                 state = c("Alabama", "Alabama", "Alabama", "NY", "NY", "NY"),
                 Year = c(2012:2014, 2012:2014))

Dataframe A
  cum_count   state Year
1         5 Alabama 2012
2        14 Alabama 2013
3        50 Alabama 2014
4         5      NY 2012
5        14      NY 2013
6        50      NY 2014
Dataframe B
  cum_count   state Year
1         5 Alabama 2012
2         9 Alabama 2013
3        36 Alabama 2014
4         5      NY 2012
5         9      NY 2013
6        36      NY 2014

I tried using the diff function:

df <- df %>%group_by(state)%>%
      mutate(daily_count = diff(cum_count))

But I get

Error: Column `daily_count` must be length 3 (the number of rows) or 1, not 2

Let me know what you think.

Thanks!

Tags: r, diff, cumsum, cumulative-sum

Solution


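You can still use diff. It returns one fewer element than its input, so prepend the first cumulative value of each group to restore the length, e.g.,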

library(dplyr)

df <- df %>%
  group_by(state) %>%
  mutate(daily_count = c(cum_count[1], diff(cum_count)))

which gives

> df
# A tibble: 6 x 4
# Groups:   state [2]
  cum_count state    Year daily_count
      <dbl> <chr>   <int>       <dbl>
1         5 Alabama  2012           5
2        14 Alabama  2013           9
3        50 Alabama  2014          36
4         5 NY       2012           5
5        14 NY       2013           9
6        50 NY       2014          36
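
The error in the question comes from diff() returning one fewer element than its input, so a group of 3 rows yields only 2 values. Below is a minimal base R sketch of that behaviour; the vector x is a made-up stand-in for one state's cumulative counts, not something from the original post.

```r
# Hypothetical vector mirroring one state's cumulative counts
x <- c(5, 14, 50)

# diff() returns successive differences, so the result is one element shorter
d <- diff(x)
length(d)            # 2, which is why mutate() rejected it for a group of 3 rows

# Prepending the first cumulative value restores the original length
daily <- c(x[1], d)
daily                # 5 9 36
```

The first cumulative value is itself the first period's count, which is why it can be reused unchanged as the leading element.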

Here is a base R option using ave:

df <- within(df,daily_count <- ave(cum_count,state,FUN = function(x) c(x[1],diff(x))))

which gives

> df
  cum_count   state Year daily_count
1         5 Alabama 2012           5
2        14 Alabama 2013           9
3        50 Alabama 2014          36
4         5      NY 2012           5
5        14      NY 2013           9
6        50      NY 2014          36
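
One way to sanity-check the result is a round trip: the cumulative sum of daily_count within each state should reproduce cum_count. The sketch below assumes rows may not already be in chronological order, so it sorts by state and Year first; that sorting step and the name rebuilt are additions for illustration, not part of the original answer.

```r
df <- data.frame(cum_count = c(5, 14, 50, 5, 14, 50),
                 state = c("Alabama", "Alabama", "Alabama", "NY", "NY", "NY"),
                 Year = c(2012:2014, 2012:2014))

# Sort by state and Year so differences are taken in chronological order
df <- df[order(df$state, df$Year), ]

# Same per-group differencing as above, written without within()
df$daily_count <- ave(df$cum_count, df$state,
                      FUN = function(x) c(x[1], diff(x)))

# Round-trip check: the per-state cumulative sum of daily_count
# should reproduce cum_count
rebuilt <- ave(df$daily_count, df$state, FUN = cumsum)
stopifnot(isTRUE(all.equal(rebuilt, df$cum_count)))
```

If the stopifnot() call passes silently, the differencing and the original cumulative counts agree.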
