How to fix: "Error occurred in generator: subscript out of bounds"

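Problem description

I have been experimenting with neural networks in Keras. While trying to apply a recurrent neural network I came across a code blueprint, but after implementing it and adapting it to my needs I keep getting this error message: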

Error occurred in generator: subscript out of bounds
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  StopIteration: 

Detailed traceback: 
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
    generator_output = next(output_generator)
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 23, in __next__
    return self.next()
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 40, in next
    raise StopIteration()

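The data frame I am using is just a time series of a single variable. I suspect the generator is the culprit, but I am not 100% sure.

I would appreciate your help.

I have tried different versions of the fit_generator() function, but each one throws the same error.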

generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 2) {
  if (is.null(max_index)) max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))
    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback+1, rows[[j]],
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }
    list(samples, targets)
  }
}

lookback <- 30
step <- 2
delay <- 365
batch_size <- 128 

train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = nrow(data),
  shuffle = TRUE,
  step = step,
  batch_size = batch_size
)
val_gen = generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.6)+1,
  max_index = floor(nrow(lightning_ts_red)*0.8),
  step = step,
  batch_size = batch_size
) 
test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.8)+1,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

test_steps <- (nrow(lightning_ts_red) - floor(nrow(lightning_ts_red)*0.8)+1 - lookback) / batch_size


val_steps <- (floor(nrow(lightning_ts_red)*0.8) - floor(nrow(data)*0.6)+1 - lookback) / batch_size
history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps,
  verbose = 1,
  view_metrics = "auto"
)

Tags: r, keras, time-series

Solution


Look at this line in the generator function:

targets[[j]] <- data[rows[[j]] + delay,2]

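The second index, 2, selects the data column to predict. In the original example (from Chollet and Allaire), the temperature in degrees Celsius is in the second column ('T (degC)'), and that is what they are trying to predict.

If you are using univariate data you have only one column, so the generator function throws the "subscript out of bounds" error.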
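
The failure can be reproduced outside Keras. The sketch below assumes the series is stored as a one-column matrix (for example after `data.matrix()`); the object `m` is a made-up stand-in, not the data from the question.

```r
# Hypothetical single-column matrix standing in for a univariate series.
m <- matrix(seq_len(10), ncol = 1)

# Asking for column 2 fails; capture the message instead of stopping.
msg <- tryCatch(m[5, 2], error = function(e) conditionMessage(e))
print(msg)

# Column 1 exists, so the same row lookup succeeds.
val <- m[5, 1]
print(val)
```

Because the generator is called from Python through reticulate, this R error surfaces as the `StopIteration` shown in the traceback.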

Change that column index to 1 (as shown below) and it should work.

targets[[j]] <- data[rows[[j]] + delay,1]
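
To check the fix without training a model, the generator can be called directly. This is a minimal sketch under a few assumptions: the data is a synthetic one-column matrix, and `target_col` is an illustrative parameter name added here, not part of keras or of the original code. It also passes `max_index = nrow(data) - delay - 1`, because `rows[[j]] + delay` would otherwise index past the last row, which likely triggers the same error.

```r
# Synthetic single-column series (assumption: real data is a numeric matrix).
set.seed(1)
data <- matrix(rnorm(1000), ncol = 1)

# Generator from the question with the target column exposed as a parameter.
generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 2,
                      target_col = 1) {
  if (is.null(max_index)) max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index + lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i + batch_size, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows), lookback / step, ncol(data)))
    targets <- array(0, dim = c(length(rows)))
    for (j in seq_along(rows)) {
      indices <- seq(rows[[j]] - lookback + 1, rows[[j]],
                     length.out = dim(samples)[[2]])
      samples[j, , ] <- data[indices, ]
      # target_col = 1 is the only valid column for univariate data.
      targets[[j]] <- data[rows[[j]] + delay, target_col]
    }
    list(samples, targets)
  }
}

train_gen <- generator(data, lookback = 30, delay = 365, min_index = 1,
                       max_index = nrow(data) - 365 - 1, shuffle = TRUE,
                       step = 2, batch_size = 128)

# Pull one batch by hand: samples are batch x timesteps x features.
batch <- train_gen()
print(dim(batch[[1]]))
print(length(batch[[2]]))
```

If a single call returns a batch without error, the generator itself is sound and any remaining problem lies in the `fit_generator()` arguments.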
