Constructing a loop to calculate differences in R

Problem description

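I have the following dataset, and I need to build a loop that calculates the time difference between a row and the row before it. Once the difference is calculated, I want to put the value in a new column called "Difference". The main part I am stuck on is writing a loop that calculates the difference between when a trade opens and when it closes. This is represented by buy/sell (open) to t/p or s/l (close). I have a large data frame, so I want to apply the loop across the whole data frame.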

I have produced a sample data frame showing what I would like the loop to do. Any suggestions would be very helpful.

This is the df I want: trade data example

This is the current structure of my data:

structure(list(ID = 1:10, Year = c(2005L, 2005L, 2005L, 2005L, 
2005L, 2005L, 2005L, 2005L, 2005L, 2005L), Month = c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = c(7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L), Time = c("4:00", "5:30", "6:00", "8:20", 
"9:00", "9:06", "10:00", "10:20", "11:00", "11:20"), x3 = c("buy", 
"t/p", "sell", "t/p", "buy", "t/p", "sell", "t/p", "buy", "t/p"
), x4 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L), x5 = c(5.06, 
5.06, 3, 3, 5.3, 5.3, 3, 3, 5.53, 5.53), x6 = c(1.88219, 1.88319, 
1.88357, 1.88257, 1.88149, 1.88249, 1.88167, 1.88067, 1.88089, 
1.88189), x7 = c(1.87664, 1.87664, 1.90464, 1.90464, 1.87664, 
1.87664, 1.90464, 1.90464, 1.87664, 1.87664), x8 = c(1.88319, 
1.88319, 1.88257, 1.88257, 1.88249, 1.88249, 1.88067, 1.88067, 
1.88189, 1.88189), x9 = c(0, 342.41, 0, 203.03, 0, 358.64, 0, 
203.03, 0, 374.21), x10 = c(12000, 12342.41, 12342.41, 12545.44, 
12545.44, 12904.08, 12904.08, 13107.11, 13107.11, 13481.32), 
    Difference = c("", "1:30", "", "2:20", "", "0:06", "", "0:20", 
    "", "0:20")), row.names = c(NA, 10L), class = "data.frame")

Tags: r, loops

Solution


If you have an ID that identifies each pair of rows (for example, x4 in your sample), you can use it with group_by.

You can then take the difference between the minimum and maximum time within each group and assign it to the row with the maximum time.

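library(dplyr)
df %>%
  # Time is a character column ("4:00"), so build a real date-time before comparing
  mutate(time = as.POSIXct(paste(Year, Month, Day, Time), format = "%Y %m %d %H:%M", tz = "UTC")) %>%
  group_by(x4) %>%
  mutate(difference = ifelse(time == max(time), as.numeric(difftime(max(time), min(time), units = "mins")), NA))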
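The same idea can be sketched in base R, with the result formatted as "H:MM" to match the Difference column shown in the question. This is only a minimal sketch under a few assumptions: each x4 value marks exactly one open row and one close row, the UTC time zone is an arbitrary choice, and the helper names dt, secs, mins and is_close are made up for illustration. The sample data is a shortened copy of the question's data with only the columns needed here.

```r
# Shortened sample mirroring the question's columns (only those needed here)
df <- data.frame(
  Year  = 2005L, Month = 1L, Day = 7L,
  Time  = c("4:00", "5:30", "6:00", "8:20", "9:00", "9:06"),
  x3    = c("buy", "t/p", "sell", "t/p", "buy", "t/p"),
  x4    = c(1L, 1L, 2L, 2L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Build a proper date-time from the separate columns (UTC is an assumption)
dt <- as.POSIXct(paste(df$Year, df$Month, df$Day, df$Time),
                 format = "%Y %m %d %H:%M", tz = "UTC")

# Minutes elapsed since the earliest row of the same trade id (x4)
secs <- as.numeric(dt)
mins <- (secs - ave(secs, df$x4, FUN = min)) / 60

# Only the closing row (latest time within each trade) gets a value
is_close <- secs == ave(secs, df$x4, FUN = max)
df$Difference <- ifelse(
  is_close,
  sprintf("%d:%02d", as.integer(mins %/% 60), as.integer(mins %% 60)),
  ""
)

print(df[, c("Time", "x3", "x4", "Difference")])
```

Because ave() works on whole vectors, no explicit loop is needed, which tends to matter on a large data frame. Since the date-time is built from Year, Month and Day as well as Time, a trade that closes on a later day than it opened should still give the correct elapsed time.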
