Understanding Conv2D and LSTM

Problem Description

I am trying to predict the next image in a sequence. I am using Conv2D and LSTM instead of ConvLSTM2D because I am a beginner and want to understand every step. My questions are:

1) I want to write a generator method that returns X and y. I think X will have the shape (samples, timesteps, height, width, channels), but I am not sure about the shape of y. (For comparison: if we wanted to predict temperature from a dataset, the generator would return X with shape (samples, timesteps, features) and y with shape (samples,).)

2) In the network architecture, the input to the initial Conv2D part is a 5D tensor (samples, timesteps, height, width, channels), and the input to the LSTM part is a 3D tensor (samples, timesteps, height * width * channels). Since I am predicting an image, should I add a Reshape((height, width, channels)) after the LSTM at the end of the model?

Here is my code:

import numpy as np
from keras.models import Sequential
from keras.layers import (Activation, Conv2D, Dense, Flatten, LSTM,
                          MaxPooling2D, Reshape, TimeDistributed)
from keras.optimizers import RMSprop

model = Sequential()

# TimeDistributed applies the same Conv2D to every frame of the
# (timesteps, 512, 512, 1) input independently.
model.add(TimeDistributed(Conv2D(128, (5, 5)), input_shape=(None, 512, 512, 1)))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(128, (5, 5))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(256, (5, 5))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(256, (5, 5))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(512, (5, 5))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(512, (5, 5))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model.add(TimeDistributed(Conv2D(512, (3, 3))))
model.add(Activation('relu'))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

# After seven valid-padded conv/pool stages a 512x512 frame is reduced to
# 1x1x512, so Flatten leaves a 512-vector per timestep for the LSTM.
model.add(TimeDistributed(Flatten()))


# return_sequences=True emits one 512-vector per timestep; Dense maps each
# to 262144 = 512 * 512 * 1 values, which Reshape folds back into a frame.
model.add(LSTM(units=512, activation='tanh', return_sequences=True))
model.add(Dense(262144, activation='linear'))
model.add(TimeDistributed(Reshape((512, 512, 1))))
model.summary()
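
# My own sanity check (an assumption on my part, not taken from anywhere):
# with return_sequences=True the model outputs one frame per input timestep,
# i.e. (samples, timesteps, 512, 512, 1), while my generator below yields
# targets of shape (samples, 1, 512, 512, 1). If only the single next frame
# should be predicted, I believe the tail could instead be:
#
#   model.add(LSTM(units=512, activation='tanh'))   # return_sequences=False
#   model.add(Dense(512 * 512 * 1, activation='linear'))
#   model.add(Reshape((512, 512, 1)))               # y: (samples, 512, 512, 1)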

def generator(data, lookback, delay, min_index, max_index,
              shuffle=False, batch_size=128, step=6):
    # data is the whole frame sequence: (num_frames, height, width, channels)
    if max_index is None:
        max_index = len(data) - delay - 1
    i = min_index + lookback
    while True:
        if shuffle:
            rows = np.random.randint(
                min_index + lookback, max_index, size=batch_size)
        else:
            if i + batch_size >= max_index:
                i = min_index + lookback
            rows = np.arange(i, min(i + batch_size, max_index))
            i += len(rows)

        # X: (samples, timesteps, height, width, channels)
        samples = np.zeros((len(rows), lookback // step,
                            data.shape[1], data.shape[2], data.shape[3]))
        # y: the single future frame, (samples, 1, height, width, channels)
        targets = np.zeros((len(rows), 1,
                            data.shape[1], data.shape[2], data.shape[3]))
        for j, row in enumerate(rows):
            indices = range(rows[j] - lookback, rows[j], step)
            samples[j] = data[indices]              # the past frames
            targets[j] = data[rows[j] + delay]      # the frame `delay` steps ahead
        yield samples, targets
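
# Shape check on a dummy dataset (a hypothetical stand-in for image_dataset):
#
#   dummy = np.zeros((30, 512, 512, 1))          # 30 frames of 512 x 512 x 1
#   X, y = next(generator(dummy, lookback=3, delay=1, min_index=0,
#                         max_index=19, batch_size=2, step=1))
#   print(X.shape)   # (2, 3, 512, 512, 1) -> (samples, timesteps, H, W, C)
#   print(y.shape)   # (2, 1, 512, 512, 1) -> one future frame per sample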

lookback = 3
step = 1
delay = 1
batch_size = 2

train_gen = generator(image_dataset,
                  lookback=lookback,
                  delay=delay,
                  min_index=0,
                  max_index=19,
                  shuffle=False,
                  step=step, 
                  batch_size=batch_size)

val_gen = generator(image_dataset,
                lookback=lookback,
                delay=delay,
                min_index=20,
                max_index=29,
                step=step,
                batch_size=batch_size)


# Number of validation batches per epoch: (29 - 20 - 3) // 2 = 3.
val_steps = (29 - 20 - lookback) // batch_size

model.compile(optimizer=RMSprop(lr=0.001), loss='mae')

history = model.fit_generator(train_gen,
                              steps_per_epoch=10,
                              epochs=5,
                              validation_data=val_gen,
                              validation_steps=val_steps)
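
For completeness, this is how I would pull a predicted image back out of the trained model (my own sketch; pred and next_frame are just names I made up):

X_val, y_val = next(val_gen)      # X_val: (2, 3, 512, 512, 1)
pred = model.predict(X_val)       # (2, 3, 512, 512, 1) since return_sequences=True
next_frame = pred[:, -1]          # last timestep = the predicted next image
print(next_frame.shape)           # (2, 512, 512, 1)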

Thanks in advance!

Tags: python, keras, deep-learning, conv-neural-network, lstm
