Q: Rate of change in R under special conditions; for loop / apply / lag

Problem description

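I'm just getting started with R, and I'm completely new to time series concepts. Can anyone point me in the right direction for calculating the monthly percentage change?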

  1. I have data for different years, months, towns and prices, together with a rate of change, like this:


i  | hrvyear |  m   | town        |   price   |  rate of change
1  |  1270   |  5   | Chesterford |   80      |  NA
2  |  1270   |  6   | Chesterford |   64      |  -20 %
3  |  1270   |  7   | Lopham      |   74      |  NA
4  |  1274   |  12  | Lopham      |   74      |  NA
5  |  1275   |  1   | Lopham      |   78      |  5.4054 %
6  |  1275   |  2   | Lopham      |   59      |  -24.3589 %
7  |  1275   |  3   | Lopham      |   61      |  3.3898 %
8  |  1275   |  5   | Lopham      |   68      |  NA
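  2. In a second step, I want to take the average ratio over all possible pairs of months from the table above, running from September through August (that is: 9_to_10, 9_to_11, ..., 9_to_8, 10_to_11, ..., 10_to_8, ..., 7_to_8):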


i  | start_month | end_month | average_ratio | %change | Std. error | # cases
1  |  9          | 10        |  1.055        | 2.7     |   0.034    | 22
2  |  9          | 11        |   ...         | ...     |   ...      | ..
3  |  9          | 12        |   ...         | ...     |   ...      | ..
4  |  9          | 1         |   ...         | ...     |   ...      | ..
5  |  9          | 2         |   ...         | ...     |   ...      | ..
6  |  9          | 3         |   ...         | ...     |   ...      | ..
7  |  9          | 4         |   ...         | ...     |   ...      | ..
8  |  9          | 5         |   ...         | ...     |   ...      | ..
9  |  9          | 6         |   ...         | ...     |   ...      | ..
10 |  9          | 7         |   ...         | ...     |   ...      | ..
11 |  9          | 8         |   ...         | ...     |   ...      | ..
.. |  ...        | ..        |   ...         | ...     |   ...      | ..
.. |  12         | 1         |   ...         | ...     |   ...      | ..
.. |  12         | 2         |   ...         | ...     |   ...      | ..
.. |  ...        | ..        |   ...         | ...     |   ...      | ..
.. |  12         | 8         |   ...         | ...     |   ...      | ..
.. |  ...        | ..        |   ...         | ...     |   ...      | ..
66 |  7          | 8         |   ...         | ...     |   ...      | ..

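Calculations:

Rate of change: ((a - b) / b) * 100, where a is the price in the new month and b is the price in the previous month

average_ratio: the mean over all years and towns for each pair of months

%change: (log(1+mean(average_ratio))/x)*100, where x is the distance between start_month and end_month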
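One possible way to approach the second step is sketched below. It rests on several assumptions that are not stated in the question: each `hrvyear` value labels one harvest season running from September to August, the ratio of a pair is `price_end / price_start`, and the standard error is the standard deviation divided by the square root of the number of cases. The names `prices`, `pos`, `pairs` and `result` are made up for illustration. If `hrvyear` is really a calendar year, the grouping key would need to be adjusted.

```r
library(data.table)

# Sample data from the question, with `mean(price)` renamed to `price`
prices <- data.table(
  hrvyear = c(1270, 1270, 1272, 1272, 1275, 1275),
  m       = c(5, 12, 2, 4, 2, 3),
  town    = c("Chesterford", "Chesterford", "Lopham", "Lopham", "Lopham", "Lopham"),
  price   = c(80, 64, 74, 78, 59, 61)
)

# Position of each month within a September..August season (assumption):
# September = 1, October = 2, ..., August = 12
prices[, pos := (m - 9) %% 12 + 1]

# Pair every month with every other month of the same season and town
pairs <- merge(prices, prices, by = c("hrvyear", "town"),
               suffixes = c("_start", "_end"), allow.cartesian = TRUE)

# Keep only pairs where the start month comes before the end month
pairs <- pairs[pos_start < pos_end]
pairs[, ratio := price_end / price_start]

# Average over all years and towns for each pair of months;
# x is the distance in months between start and end
result <- pairs[, .(
  average_ratio = mean(ratio),
  std_error     = sd(ratio) / sqrt(.N),  # NA when a pair has only one case
  cases         = .N
), by = .(start_month = m_start, end_month = m_end, x = pos_end - pos_start)]

result
```

With the six sample rows, every pair occurs only once, so `std_error` is `NA`; on the full data set each pair would collect one case per town and season in which both months are present.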

structure(list(hrvyear = c(1270, 1270, 1272, 1272, 1275, 1275
), m = c(5, 12, 2, 4, 2, 3), town = c("Chesterford", "Chesterford", 
"Lopham", "Lopham", "Lopham", "Lopham"), `mean(price)` = c(80, 
64, 74, 78, 59, 61)), row.names = c(NA, -6L), groups = structure(list(
    hrvyear = c(1270, 1270, 1272, 1272, 1275, 1275), m = c(5, 
    12, 2, 4, 2, 3), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 
        6L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", 
    "list"))), row.names = c(NA, 6L), class = c("tbl_df", "tbl", 
"data.frame"), .drop = TRUE), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"))

I hope the question is clear. I'd appreciate any suggestions.

Tags: r, for-loop, if-statement, rows

Solution


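The function you wrote for this almost works, but don't forget to wrap the subtraction in parentheses, (am$`mean(price)`[i] - am$`mean(price)`[i-1]), so that the division is not evaluated before the subtraction.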
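The asker's original function is not shown, so the following is only a sketch of what a corrected loop might look like. The data frame `am` here is a hypothetical stand-in built from the sample prices, and `rate` and `wrong` are illustrative names.

```r
# Hypothetical stand-in for the asker's data frame `am`
am <- data.frame(`mean(price)` = c(80, 64, 74, 78, 59, 61),
                 check.names = FALSE)

n <- nrow(am)
rate <- rep(NA_real_, n)  # the first row has no previous price, so it stays NA

for (i in 2:n) {
  a <- am$`mean(price)`[i]      # price in the new month
  b <- am$`mean(price)`[i - 1]  # price in the previous month
  rate[i] <- (a - b) / b * 100  # parentheses force the subtraction first
}

# Without the parentheses, a - b / b * 100 is parsed as a - ((b / b) * 100),
# which is simply a - 100
wrong <- 64 - 80 / 80 * 100

rate[2]  # -20
wrong    # -36
```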

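A simpler answer is to use the shift() function from data.table, which is similar to the lead() and lag() functions in dplyr. They select the previous or the following row, depending on the arguments you pass.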
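As a quick illustration of how shift() moves values, here is a minimal sketch on a short vector. The vector `price` and the result names are made up for the example.

```r
library(data.table)

price <- c(80, 64, 74)

lagged  <- shift(price)                 # previous value:  NA 80 64
led     <- shift(price, type = "lead")  # following value: 64 74 NA
lagged2 <- shift(price, n = 2)          # two rows back:   NA NA 80

# The dplyr counterparts would be dplyr::lag(price) and dplyr::lead(price)
```

Because the lagged vector lines up element by element with the original, `price - shift(price)` gives the difference to the previous row without an explicit loop.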

library(data.table)
dt <- as.data.table(structure(list(hrvyear = c(1270, 1270, 1272, 1272, 1275, 1275
), m = c(5, 12, 2, 4, 2, 3), town = c("Chesterford", "Chesterford", 
"Lopham", "Lopham", "Lopham", "Lopham"), `mean(price)` = c(80, 
64, 74, 78, 59, 61)), row.names = c(NA, -6L), groups = structure(list(
    hrvyear = c(1270, 1270, 1272, 1272, 1275, 1275), m = c(5, 
    12, 2, 4, 2, 3), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 
        6L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", 
    "list"))), row.names = c(NA, 6L), class = c("tbl_df", "tbl", 
"data.frame"), .drop = TRUE), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame")))
 
# rename the `mean(price)` column to `price`
colnames(dt)[4] <- 'price'

# percent change relative to the previous row's price,
# matching the question's formula ((a - b) / b) * 100
dt[, rate := (price - shift(price)) / shift(price) * 100]

dt
   hrvyear  m        town price       rate
1:    1270  5 Chesterford    80         NA
2:    1270 12 Chesterford    64 -20.000000
3:    1272  2      Lopham    74  15.625000
4:    1272  4      Lopham    78   5.405405
5:    1275  2      Lopham    59 -24.358974
6:    1275  3      Lopham    61   3.389831
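The code above compares every row with the row directly before it, whereas the first table in the question shows NA whenever the previous row belongs to a different town or is not the immediately preceding month. A sketch of one way to add those conditions follows. It assumes that month 12 of one `hrvyear` is followed by month 1 of the next (as rows 4 and 5 of that table suggest), and `month_index` is a helper column introduced only for this example.

```r
library(data.table)

# Rows from the first table in the question
dt <- data.table(
  hrvyear = c(1270, 1270, 1270, 1274, 1275, 1275, 1275, 1275),
  m       = c(5, 6, 7, 12, 1, 2, 3, 5),
  town    = c("Chesterford", "Chesterford", "Lopham", "Lopham",
              "Lopham", "Lopham", "Lopham", "Lopham"),
  price   = c(80, 64, 74, 74, 78, 59, 61, 68)
)

# Running month counter (assumption: the year number changes after month 12)
dt[, month_index := hrvyear * 12 + m]
setorder(dt, town, month_index)

# Within each town, compute the rate only when the previous row is
# exactly one month earlier; otherwise leave NA
dt[, rate := fifelse(
  month_index - shift(month_index) == 1,
  (price - shift(price)) / shift(price) * 100,
  NA_real_
), by = town]

dt[, .(hrvyear, m, town, price, rate)]
```

Grouping with `by = town` keeps shift() from reaching across towns, and the `month_index` comparison handles gaps such as the jump from month 3 to month 5.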
