The simplest way to create indicator variables for changes in a time series in R

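Problem description

I have a dataset with 14 million rows of product, tariff rate, trade volume, and year-month combinations, in the following format: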

df <- as.data.frame(matrix(c(1220, "2013-1", 10011900, 29307, .1,
                   1220, "2013-2", 10011900, 28202, .1,
                   1220, "2013-3", 10011900, 22383, .15,
                   1220, "2013-4", 10011900, 21303, .15,
                   1220, "2013-5", 10011900, 21201, .15,
                   1220, "2013-1", 10019900, 9960, .12,
                   1220, "2013-2", 10019900, 10043, .12,
                   1220, "2013-3", 10019900, 11001, .1,
                   1220, "2013-4", 10019900, 10997, .1,
                   1220, "2013-5", 10019900, 12038, .1),
                 ncol = 5, byrow = TRUE))  # note: matrix() coerces every value to character
colnames(df) <- c("country", "date", "product", "value", "rate")

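I am trying to add a column to the data that I can then use to build a set of indicator variables flagging the months before and after a change in the tariff rate. For the data above, the result would look like this: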

df_transformed <- as.data.frame(matrix(c(1220, "2013-1", 10011900, 29307, .1, -2,
                                         1220, "2013-2", 10011900, 28202, .1, -1,
                                         1220, "2013-3", 10011900, 22383, .15, 0,
                                         1220, "2013-4", 10011900, 21303, .15, 1,
                                         1220, "2013-5", 10011900, 21201, .15, 2,
                                         1220, "2013-1", 10019900, 9960, .12, -2,
                                         1220, "2013-2", 10019900, 10043, .12, -1,
                                         1220, "2013-3", 10019900, 11001, .1, 0,
                                         1220, "2013-4", 10019900, 10997, .1, 1,
                                         1220, "2013-5", 10019900, 12038, .1, 2),
                                       ncol = 6, byrow = TRUE))
colnames(df_transformed) <- c("country", "date", "product", "value", "rate", "months_since_change")

I am not sure how best to detect when the tariff variable changes and to create a new column based on that.

Thanks for your help!

Tags: r, time-series, panel-data

Solution
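No answer was captured on this page, so what follows is only a minimal base-R sketch of one possible approach, not the original answer. It assumes each country/product series has at most one rate change of interest (the first one is used), and that `date` is always of the form `YYYY-M`. The helper `months_since_first_change` and the column `month_idx` are names made up for this illustration.

```r
# Sample data from the question, rebuilt with proper column types.
df <- data.frame(
  country = 1220,
  date    = rep(c("2013-1", "2013-2", "2013-3", "2013-4", "2013-5"), times = 2),
  product = rep(c(10011900, 10019900), each = 5),
  value   = c(29307, 28202, 22383, 21303, 21201,
              9960, 10043, 11001, 10997, 12038),
  rate    = c(.1, .1, .15, .15, .15, .12, .12, .1, .1, .1),
  stringsAsFactors = FALSE
)

# Convert "YYYY-M" into a running month index (year * 12 + month), so that
# sorting is chronological and gaps between months are counted correctly.
df$month_idx <- as.integer(sub("-.*", "", df$date)) * 12L +
  as.integer(sub(".*-", "", df$date))

# Sort each country/product series by time.
df <- df[order(df$country, df$product, df$month_idx), ]

# Hypothetical helper: months relative to the first rate change in one series.
# Returns NA for a series whose rate never changes.
months_since_first_change <- function(rate, month_idx) {
  changed <- c(FALSE, diff(rate) != 0)  # TRUE where rate differs from the previous row
  if (!any(changed)) return(rep(NA_integer_, length(rate)))
  month_idx - month_idx[which(changed)[1]]
}

# Apply the helper within each country/product group via row positions.
grp <- interaction(df$country, df$product, drop = TRUE)
df$months_since_change <- ave(
  seq_len(nrow(df)), grp,
  FUN = function(i) months_since_first_change(df$rate[i], df$month_idx[i])
)
```

On the sample data this gives -2, -1, 0, 1, 2 within each product. With 14 million rows, the same grouped logic written with a package such as data.table or dplyr is likely to run faster, and series with several rate changes would need a rule for which change each month is measured against.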

