How do I correctly reshape data for a ConvLSTM in Python?

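Question

I am trying to follow this tutorial, but I am running into a problem with the ConvLSTM model.

In the tutorial they have an array, [10,20,30,40,50,60,70,80,90], which they split into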

[[10,20,30]
 [20,30,40]
 [30,40,50]
 [40,50,60]
 [50,60,70]
 [60,70,80]
 [70,80,90]]

When they reshape it using

n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))

it looks like this

[[[10]
  [20]
  [30]]

 [[20]
  [30]
  [40]] ...
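
For reference, a minimal NumPy sketch of what this four-dimensional reshape does. It assumes windows of length 4, as in the tutorial code further down (where n_steps = 4 is used for splitting), so each row splits into n_seq = 2 subsequences of n_steps = 2 values. The three-row array is only an illustration built from the tutorial's sequence.

```python
import numpy as np

# Windows of length 4 taken from the tutorial sequence (first three only).
X = np.array([[10, 20, 30, 40],
              [20, 30, 40, 50],
              [30, 40, 50, 60]])

n_features = 1
n_seq = 2
n_steps = 2

# Each row of 4 values becomes 2 subsequences of 2 time steps with 1 feature.
X4 = X.reshape((X.shape[0], n_seq, n_steps, n_features))

print(X4.shape)        # (3, 2, 2, 1)
print(X4[0].tolist())  # [[[10], [20]], [[30], [40]]]
```

The reshape only works here because every row holds exactly n_seq * n_steps * n_features = 4 values.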

My problem is that when I import my own data and try the same thing, my data has shape (133460, 20), like this

[[1,2,3...20]
 [1,2,3....20]]

The error I end up with is

ValueError: cannot reshape array of size 2001900 into shape (1000095,2,2,1)
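
This kind of message points to an element-count mismatch: reshape only rearranges values, so the target shape must hold exactly as many elements as the array does. A small sketch of the arithmetic follows. The 20-column width is taken from the data shape quoted above, and the zero-filled array is a hypothetical stand-in for the real data.

```python
import numpy as np

n_features, n_seq, n_steps = 1, 2, 2
n_cols = 20  # width of each row in the data described above

# Values each sample needs under the tutorial settings vs. values each row has.
needed_per_row = n_seq * n_steps * n_features
print(needed_per_row, n_cols)  # 4 20

# 2001900 elements is 100095 rows of 20 values each.
print(2001900 // n_cols, 2001900 % n_cols)  # 100095 0

# Reproduce the failure on a tiny stand-in array with the same width.
X_small = np.zeros((3, n_cols))
try:
    X_small.reshape((X_small.shape[0], n_seq, n_steps, n_features))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(err)
```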

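I am just a bit confused by all of this and by how to reshape my data correctly. Here is the full code, with a note showing where my code fails.

This is the tutorial code, but I don't need the split_sequence() function, because my data is already split (and the train/test split is already done as well).

Tutorial code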


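from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Dense, Flatten, TimeDistributed, Conv1D, MaxPooling1D

# split a univariate sequence into samples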
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
>> [[101.69263]]

My code

# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X_train.reshape((X_train.shape[0], n_seq, n_steps, n_features))  # <<<<< My code fails here
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)

All of this is very new to me, so any help (especially with shaping the data) would be greatly appreciated.

Tags: python, pandas, numpy, machine-learning, lstm

Answer


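Sorry, I ran your code and its results match the tutorial code. However, I had to replace X_train with X, because X_train is not defined in your code. So if I had to guess, your problem lies in X_train, and I'm sorry that I can't be of more help. If you clarify what X_train is, or more specifically what makes it different from X, I can offer a more specific solution.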
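
One possible way forward, sketched under the assumption that X_train is a 2-D array with 20 time steps per row (as the shape (133460, 20) in the question suggests): choose n_seq and n_steps so that n_seq * n_steps * n_features equals 20. The random array below is a hypothetical stand-in for X_train, and n_seq = 4 with n_steps = 5 is only one of several valid factorizations.

```python
import numpy as np

# Hypothetical stand-in for X_train: 8 samples, 20 time steps each.
X_train = np.random.rand(8, 20)

n_features = 1
n_seq = 4    # subsequences per sample (assumed choice)
n_steps = 5  # time steps per subsequence (assumed choice); 4 * 5 * 1 == 20

# Guard against the element-count mismatch before reshaping.
assert X_train.shape[1] == n_seq * n_steps * n_features

X = X_train.reshape((X_train.shape[0], n_seq, n_steps, n_features))
print(X.shape)  # (8, 4, 5, 1)
```

Because the model in the question is built with input_shape=(None, n_steps, n_features), it would pick up the new n_steps as long as these variables are set before the model is defined. The y passed to model.fit would also need one target value per row of X_train.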

