Rolling time window data in Scala

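Problem description

Below is a simplified Scala snippet that generates a sample Day -> Data map and tries to compute the data for a 3-day rolling time window: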

import scala.collection.immutable.TreeMap

// Sample data: Day1..Day7, kept in key order by the TreeMap
val dataByDay: Map[String, String] = TreeMap((1 to 7).map(i => (s"Day$i" -> s"Data-$i")): _*)

// Pair each 3-entry window with a 1-based index
val groupedIterator: Iterator[(Int, Map[String, String])] = dataByDay.sliding(3).zipWithIndex.map(e => ((e._2 + 1) -> e._1))

for ((day, lastThreeDaysData) <- groupedIterator) {
  println(s"On Day${day} data for days " + lastThreeDaysData.keys.mkString(",") + " will be used")
}

The output of the above is:

On Day1 data for days Day1,Day2,Day3 will be used
On Day2 data for days Day2,Day3,Day4 will be used
On Day3 data for days Day3,Day4,Day5 will be used
On Day4 data for days Day4,Day5,Day6 will be used
On Day5 data for days Day5,Day6,Day7 will be used

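The requirement is to process the data as shown below, where each day uses the data of up to three preceding days: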

On Day1 data for days will be used
On Day2 data for days Day1 will be used
On Day3 data for days Day1,Day2 will be used
On Day4 data for days Day1,Day2,Day3 will be used
On Day5 data for days Day2,Day3,Day4 will be used
On Day6 data for days Day3,Day4,Day5 will be used
On Day7 data for days Day4,Day5,Day6 will be used

Please advise.

Tags: scala, aggregate, sliding-window

Solution


Your requirements are somewhat vague. If all you need is that output, a simple solution looks like this.

(1 to 7).foreach { day =>
  // Up to three preceding day numbers; non-positive ones are dropped
  val prior = Seq(day - 3, day - 2, day - 1).filter(_ > 0).map("Day" + _)
  println(s"On Day$day data for days ${prior.mkString(",")} will be used")
}

If the requirement is a data representation of a configurable rolling window, more precise information is needed.
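
As one possible reading of "configurable rolling window", here is a minimal sketch that works on the question's actual map rather than on day numbers. It assumes the window for a day is the entries of up to `window` preceding days, in key order. The names `RollingWindow` and `priorWindows` are hypothetical and do not come from the question or the answer.

```scala
import scala.collection.immutable.TreeMap

object RollingWindow {
  // Hypothetical helper: pairs each key with the entries of up to
  // `window` preceding keys, following the TreeMap's key order.
  def priorWindows[V](data: TreeMap[String, V], window: Int): Seq[(String, Seq[(String, V)])] = {
    val entries = data.toVector
    entries.indices.map { i =>
      // Entries strictly before position i, at most `window` of them
      val prior = entries.slice(math.max(0, i - window), i)
      entries(i)._1 -> prior
    }
  }

  def main(args: Array[String]): Unit = {
    val dataByDay = TreeMap((1 to 7).map(i => s"Day$i" -> s"Data-$i"): _*)
    for ((day, prior) <- priorWindows(dataByDay, window = 3)) {
      println(s"On $day data for days ${prior.map(_._1).mkString(",")} will be used")
    }
  }
}
```

Note that a TreeMap with String keys sorts lexicographically, so "Day10" would sort before "Day2". With more than nine days, zero-padded keys or numeric keys would be needed for the windows to follow calendar order.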

