Reshaping a 2D matrix into a 3D array with lags for Keras

Question

I am trying to build an LSTM in Keras, but I cannot get the input data reshaped correctly.

Consider 25 observations of 3 features:

x <- 1:25
y <- seq(100, 2500, by = 100)
z <- seq(1000, 25000, by = 1000)

my.matrix <- data.matrix(data.frame(x, y, z))
str(my.matrix)

This gives:

> str(my.matrix)
 num [1:25, 1:3] 1 2 3 4 5 6 7 8 9 10 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "x" "y" "z"

And:

> my.matrix
       x    y     z
 [1,]  1  100  1000
 [2,]  2  200  2000
 [3,]  3  300  3000
 [4,]  4  400  4000
 [5,]  5  500  5000
 [6,]  6  600  6000
 [7,]  7  700  7000
 [8,]  8  800  8000
 [9,]  9  900  9000
[10,] 10 1000 10000
[11,] 11 1100 11000
[12,] 12 1200 12000
[13,] 13 1300 13000
[14,] 14 1400 14000
[15,] 15 1500 15000
[16,] 16 1600 16000
[17,] 17 1700 17000
[18,] 18 1800 18000
[19,] 19 1900 19000
[20,] 20 2000 20000
[21,] 21 2100 21000
[22,] 22 2200 22000
[23,] 23 2300 23000
[24,] 24 2400 24000
[25,] 25 2500 25000

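Now I need to create a 3D array with dimensions [nb.observations, window.width, features]. In my case that would be, for example, [25, 5, 3], where window.width = 5 is the width of the rolling window over the observations.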

Edit: actually, the final dimensions will be [21, 5, 3] because of the rolling window width (for example, the last sample for the x feature will be [21, 22, 23, 24, 25]).

Here is what I tried:

window.width <- 5
tmp <- NULL
for(i in 1:(dim(my.matrix)[1] - window.width + 1)) {
  s <- i - 1 + window.width
  tmp <- rbind(tmp, my.matrix[i:s,])
}

We get:

> head(tmp, 10)
      x   y    z
 [1,] 1 100 1000
 [2,] 2 200 2000
 [3,] 3 300 3000
 [4,] 4 400 4000
 [5,] 5 500 5000
 [6,] 2 200 2000
 [7,] 3 300 3000
 [8,] 4 400 4000
 [9,] 5 500 5000
[10,] 6 600 6000

This is what I want. Looking at the x feature, the first window runs from 1 to 5, the second from 2 to 6, and so on. The same holds for all features.

Now I need to reshape the tmp matrix:

result <- array(tmp, dim=c(dim(my.matrix)[1] - window.width + 1, window.width, dim(my.matrix)[2]))

But this does not work:

> result[1, ,1]
[1]  1  6 11 16 21

I expected:

> result[1, ,1]
[1] 1 2 3 4 5

> result[2, ,1]
[1] 2 3 4 5 6
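
The unexpected `1 6 11 16 21` is consistent with how R fills arrays: `array()` consumes its data in column-major order, so the first dimension varies fastest. A minimal sketch on a tiny vector (the names `v` and `a` are illustrative and not part of the original post):

```r
# array() fills column-major: the first index varies fastest.
v <- 1:12
a <- array(v, dim = c(2, 3, 2))

# Walking along the second dimension steps through v in jumps of dim(a)[1].
a[1, , 1]   # 1 3 5

# Element [i, j, k] comes from position i + 2 * (j - 1) + 2 * 3 * (k - 1) of v.
stopifnot(a[2, 3, 1] == v[2 + 2 * (3 - 1)])
```

In the question's case the first dimension has length 21, so `result[1, , 1]` picks every 21st element of the first column of `tmp`, which is why the values of one window do not end up next to each other.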

I also tried using the lag function to replace the for loop, but that does not work either:

result <- array(data = lag(my.matrix, window.width)[-(1:window.width), ], dim = c(dim(my.matrix)[1] - window.width, window.width, 3))

> result[1, ,1]
[1]    1  100 1000    1  100

1) What am I doing wrong, and how can I get the expected result?

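2) In addition, the for loop does not seem to scale well. It does what I want, but with more data it becomes very slow (I tried 150,000 observations and 23 features). Is there a faster alternative?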

Edit: actually, the for loop almost works if I use

result <- array(tmp, dim=c(5, 21, 3))

The values are correct, but the dimensions are mixed up...

> result
, , 1

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,]    1    2    3    4    5    6    7    8    9    10    11    12    13    14    15    16    17    18    19    20    21
[2,]    2    3    4    5    6    7    8    9   10    11    12    13    14    15    16    17    18    19    20    21    22
[3,]    3    4    5    6    7    8    9   10   11    12    13    14    15    16    17    18    19    20    21    22    23
[4,]    4    5    6    7    8    9   10   11   12    13    14    15    16    17    18    19    20    21    22    23    24
[5,]    5    6    7    8    9   10   11   12   13    14    15    16    17    18    19    20    21    22    23    24    25

, , 2

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,]  100  200  300  400  500  600  700  800  900  1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100
[2,]  200  300  400  500  600  700  800  900 1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200
[3,]  300  400  500  600  700  800  900 1000 1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300
[4,]  400  500  600  700  800  900 1000 1100 1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300  2400
[5,]  500  600  700  800  900 1000 1100 1200 1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300  2400  2500

, , 3

     [,1] [,2] [,3] [,4] [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,] 1000 2000 3000 4000 5000  6000  7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000
[2,] 2000 3000 4000 5000 6000  7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000
[3,] 3000 4000 5000 6000 7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000
[4,] 4000 5000 6000 7000 8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000 24000
[5,] 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000 24000 25000

How can I swap the dimensions?
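
One way to swap the dimensions is base R's `aperm()`, which permutes the axes of an array. The sketch below rebuilds the stacked windows from the question, fills the `[5, 21, 3]` array shown above, and then swaps the first two axes. The names `n.windows` and `swapped` are illustrative, and `do.call(rbind, lapply(...))` is used here only as a stand-in for the original `rbind` loop.

```r
# Rebuild the example data and the stacked windows from the question.
my.matrix <- cbind(x = 1:25,
                   y = seq(100, 2500, by = 100),
                   z = seq(1000, 25000, by = 1000))
window.width <- 5
n.windows <- nrow(my.matrix) - window.width + 1

tmp <- do.call(rbind, lapply(seq_len(n.windows), function(i) {
  my.matrix[i:(i + window.width - 1), ]
}))

# Fill as [window position, window, feature], which matches the row order of tmp...
result <- array(tmp, dim = c(window.width, n.windows, ncol(my.matrix)))

# ...then swap the first two axes to get [window, window position, feature].
swapped <- aperm(result, c(2, 1, 3))

dim(swapped)      # 21 5 3
swapped[1, , 1]   # 1 2 3 4 5
swapped[2, , 1]   # 2 3 4 5 6
```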

Tags: r, keras, lstm

Answer

I managed to get it working, but this may not be the R way...

x <- 1:25
y <- seq(100, 2500, by = 100)
z <- seq(1000, 25000, by = 1000)

# cbind() on numeric vectors already yields a named numeric matrix,
# so the data.matrix(data.frame(...)) step is not needed here
my.matrix <- cbind(x, y, z)
str(my.matrix)

window.width <- 5

result <- array(data = NA_real_, dim = c(dim(my.matrix)[1] - window.width + 1, window.width, dim(my.matrix)[2]))
# Loop over features
for (k in 1:dim(my.matrix)[2]) {
  # Loop over window
  for (j in 1:window.width) {
    # Loop over observations
    for(i in 1:(dim(my.matrix)[1] - window.width + 1)) {
      result[i, j, k] = my.matrix[i - 1 + j, k]
    }
  }
}

If anyone finds a better, more efficient way to do this, I would appreciate it. In the meantime, this works:

> result
, , 1

      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    2    3    4    5
 [2,]    2    3    4    5    6
 [3,]    3    4    5    6    7
 [4,]    4    5    6    7    8
 [5,]    5    6    7    8    9
 [6,]    6    7    8    9   10
 [7,]    7    8    9   10   11
 [8,]    8    9   10   11   12
 [9,]    9   10   11   12   13
[10,]   10   11   12   13   14
[11,]   11   12   13   14   15
[12,]   12   13   14   15   16
[13,]   13   14   15   16   17
[14,]   14   15   16   17   18
[15,]   15   16   17   18   19
[16,]   16   17   18   19   20
[17,]   17   18   19   20   21
[18,]   18   19   20   21   22
[19,]   19   20   21   22   23
[20,]   20   21   22   23   24
[21,]   21   22   23   24   25

, , 2

      [,1] [,2] [,3] [,4] [,5]
 [1,]  100  200  300  400  500
 [2,]  200  300  400  500  600
 [3,]  300  400  500  600  700
 [4,]  400  500  600  700  800
 [5,]  500  600  700  800  900
 [6,]  600  700  800  900 1000
 [7,]  700  800  900 1000 1100
 [8,]  800  900 1000 1100 1200
 [9,]  900 1000 1100 1200 1300
[10,] 1000 1100 1200 1300 1400
[11,] 1100 1200 1300 1400 1500
[12,] 1200 1300 1400 1500 1600
[13,] 1300 1400 1500 1600 1700
[14,] 1400 1500 1600 1700 1800
[15,] 1500 1600 1700 1800 1900
[16,] 1600 1700 1800 1900 2000
[17,] 1700 1800 1900 2000 2100
[18,] 1800 1900 2000 2100 2200
[19,] 1900 2000 2100 2200 2300
[20,] 2000 2100 2200 2300 2400
[21,] 2100 2200 2300 2400 2500

, , 3

       [,1]  [,2]  [,3]  [,4]  [,5]
 [1,]  1000  2000  3000  4000  5000
 [2,]  2000  3000  4000  5000  6000
 [3,]  3000  4000  5000  6000  7000
 [4,]  4000  5000  6000  7000  8000
 [5,]  5000  6000  7000  8000  9000
 [6,]  6000  7000  8000  9000 10000
 [7,]  7000  8000  9000 10000 11000
 [8,]  8000  9000 10000 11000 12000
 [9,]  9000 10000 11000 12000 13000
[10,] 10000 11000 12000 13000 14000
[11,] 11000 12000 13000 14000 15000
[12,] 12000 13000 14000 15000 16000
[13,] 13000 14000 15000 16000 17000
[14,] 14000 15000 16000 17000 18000
[15,] 15000 16000 17000 18000 19000
[16,] 16000 17000 18000 19000 20000
[17,] 17000 18000 19000 20000 21000
[18,] 18000 19000 20000 21000 22000
[19,] 19000 20000 21000 22000 23000
[20,] 20000 21000 22000 23000 24000
[21,] 21000 22000 23000 24000 25000
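
As a possible faster alternative to the triple loop, the same array can be built with a single vectorized subsetting step. This is only a sketch: it relies on `array()` filling in column-major order, the names `n.windows` and `idx` are illustrative, and it has not been benchmarked on the 150,000 x 23 data set mentioned in the question.

```r
my.matrix <- cbind(x = 1:25,
                   y = seq(100, 2500, by = 100),
                   z = seq(1000, 25000, by = 1000))
window.width <- 5
n.windows <- nrow(my.matrix) - window.width + 1

# idx[i, j] = i + j - 1 is the source row for window i, position j.
idx <- outer(seq_len(n.windows), seq_len(window.width) - 1, `+`)

# Pull all needed rows at once (idx is flattened column-major), then reshape.
# The flattening order matches the order in which array() fills its cells.
result <- array(my.matrix[as.vector(idx), ],
                dim = c(n.windows, window.width, ncol(my.matrix)))

result[1, , 1]    # 1 2 3 4 5
result[21, , 3]   # 21000 22000 23000 24000 25000
```

Note that the result holds `n.windows * window.width * ncol(my.matrix)` values, so memory use grows with the window width regardless of how the array is built.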
