How to produce chunks of sliding windows when iterating over a time series?

Problem description

Edit:

Suppose I have a time series ts = [[0 0][1 1][2 2][3 3][4 4][5 5][6 6][7 7][8 8]] that I want to split into the following two sequences:

 X = [[[[0][1]][[1][2]][[2][3]]] [[[1][2]][[2][3]][[3][4]]] [[[2][3]][[3][4]][[4][5]]] [[[3][4]][[4][5]][[5][6]]] [[[4][5]][[5][6]][[6][7]]] [[[5][6]][[6][7]][[7][8]]]] 
y = [[3][4][5][6][7][8]]

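X is a sequence of chunks, each holding three two-step sliding windows, and y holds the target for each chunk. My strategy is to first apply the following function: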
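To make the target concrete, here is a minimal sketch that builds the desired X and y directly from ts with plain slicing. The names `n_steps`, `chunk` and `n_chunks` are introduced here for illustration only and are not part of the original question.

```python
import numpy as np

# The example series from the question: two identical columns, shape (9, 2).
ts = np.column_stack([np.arange(9), np.arange(9)])

n_steps = 2  # rows per sliding window (assumed name)
chunk = 3    # sliding windows per chunk (assumed name)

# Number of chunks that fit completely inside the series.
n_chunks = len(ts) - n_steps - chunk + 2

# X[j][k] is the window that starts at row j + k, using every column but the last.
X = np.array([[ts[j + k : j + k + n_steps, :-1] for k in range(chunk)]
              for j in range(n_chunks)])

# y[j] is the last-column value at the final row of chunk j's last window.
y = np.array([ts[j + chunk + n_steps - 2, -1:] for j in range(n_chunks)])

print(X.shape, y.shape)
```

This is only a reference for the expected shapes, (6, 3, 2, 1) for X and (6, 1) for y. It materializes every chunk in memory at once.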

import numpy as np

def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

which returns:

X =[[[0][1]] [[1][2]] [[2][3]] [[3][4]] [[4][5]] [[5][6]] [[6][7]] [[7][8]]] 
y = [[1][2][3][4][5][6][7][8]]

I then apply the following two functions to get the desired output:

def separar_uni_X(sequencia, n_passos):
    X = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # gather input and output parts of the pattern
        seq_x = sequencia[i:end_ix, :]
        X.append(seq_x)
    return np.array(X)

def separar_uni_y(sequencia, n_passos):
    y = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # gather input and output parts of the pattern
        seq_y = sequencia[i:end_ix, :]
        y.append(seq_y[-1])
    return np.array(y)

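Problem: to get the desired output, the result of the first function has to be kept in memory and passed to the second, and when the sequence is too long this exceeds the available memory. To work around this, I tried the following function, which builds the chunks inside the same loop: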
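The memory pressure can be estimated with simple arithmetic: every row of the series is copied roughly `n_steps` times into the windows, and every window is copied again roughly `chunk` times into the chunks. The sketch below is a back-of-the-envelope estimate. The function name `materialized_bytes` and its parameters are hypothetical, and the sizes in the usage line are made up for illustration.

```python
import numpy as np

def materialized_bytes(n_rows, n_steps, chunk, n_features=1, dtype=np.float64):
    """Estimate bytes held by the fully materialized windows and chunks."""
    n_windows = n_rows - n_steps + 1   # sliding windows over the rows
    n_chunks = n_windows - chunk + 1   # sliding chunks over the windows
    itemsize = np.dtype(dtype).itemsize
    windows_bytes = n_windows * n_steps * n_features * itemsize
    chunks_bytes = n_chunks * chunk * n_steps * n_features * itemsize
    return windows_bytes, chunks_bytes

# Illustrative sizes only: one million rows, 50-step windows, 32 windows per chunk.
windows_bytes, chunks_bytes = materialized_bytes(1_000_000, 50, 32)
print(f"windows: {windows_bytes / 1e9:.2f} GB, chunks: {chunks_bytes / 1e9:.2f} GB")
```

With these made-up sizes, a single float64 column of about 8 MB grows to roughly 12.8 GB once every chunk is stored as its own copy. This is why the two-pass approach runs out of memory on long series.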

def split_sequence_3D(sequences, n_steps, batch_size):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        sub_X, sub_y = [], []
        for j in range(batch_size):
            sub_X.append(seq_x)
            sub_y.append(seq_y)
        X.append(sub_X)
        y.append(sub_y[-1])
    return np.array(X), np.array(y)

This gives me the wrong output, for obvious reasons:

X = [[[[0][1]][[0][1]][[0][1]]] [[[1][2]][[1][2]][[1][2]]] [[[2][3]][[2][3]][[2][3]]] [[[3][4]][[3][4]][[3][4]]] [[[4][5]][[4][5]][[4][5]]] [[[5][6]][[5][6]][[5][6]]] [[[6][7]][[6][7]][[6][7]]] [[[7][8]][[7][8]][[7][8]]]]
y = [[1][2][3][4][5][6][7][8]]

I have searched extensively for alternatives but have not found any.

Tags: python, time-series, append, chunks

Solution


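Well, I really struggled with your problem, which was also mine. In the end the solution turned out to be simple: make the sliding-window iterator itself slide.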

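import numpy as np

def input_3D(sequencia, lote, janela):
    # lote: number of windows per chunk; janela: number of rows per window
    if lote > len(sequencia):
        raise ValueError('Batch size (lote) larger than the dataset')
    if janela > len(sequencia):
        raise ValueError('Window size (janela) larger than the dataset')
    X_, y_ = [], []
    for j in range(len(sequencia)):
        # stop once a full chunk no longer fits; note that this bound is one
        # step stricter than needed (j + lote + janela - 1 would still fit),
        # so the row at the very end of the data is never used
        if j + lote + janela > len(sequencia):
            break
        X, y = [], []
        # slide the window across the chunk that starts at row j
        for i in range(j, j + lote):
            end_ix = i + janela
            prev_end_ix = end_ix - 1
            seq_x, seq_y = sequencia[i:end_ix, :-1], sequencia[prev_end_ix:end_ix, -1]
            X.append(np.array(seq_x))
            y.append(np.array(seq_y[-1]))
        X_.append(np.array(X))
        # the chunk's target is the target of its last window
        y_.append(np.array(y[-1]))
    return np.array(X_), np.array(y_)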
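As an alternative to nested loops, the same chunk-of-windows layout can be sketched with `numpy.lib.stride_tricks.sliding_window_view`, which returns read-only views instead of copies. This is not the method used in the answer above. It assumes NumPy 1.20 or later, and the helper name `chunked_windows` and its parameter names are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def chunked_windows(ts, n_steps, chunk):
    """Hypothetical helper: chunks of sliding windows as read-only views."""
    features = ts[:, :-1]                                  # (N, F)
    # The window axis is appended last: (N - n_steps + 1, F, n_steps).
    windows = sliding_window_view(features, n_steps, axis=0)
    windows = windows.swapaxes(1, 2)                       # (windows, n_steps, F)
    # Slide again over the window axis: (chunks, n_steps, F, chunk).
    chunks = sliding_window_view(windows, chunk, axis=0)
    X = np.moveaxis(chunks, -1, 1)                         # (chunks, chunk, n_steps, F)
    # Target of each chunk: last column at the final row of its last window.
    y = ts[n_steps + chunk - 2:, -1:]                      # (chunks, 1)
    return X, y

# The 9-row example series from the question.
ts = np.column_stack([np.arange(9), np.arange(9)])
X, y = chunked_windows(ts, n_steps=2, chunk=3)
print(X.shape, y.shape)
```

Because X and y are views into ts, no extra memory is allocated until something copies them, for example `np.array(X)` or a library that requires contiguous input. Whether a given training framework copies the data is outside what this sketch can guarantee.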

Suppose your input is:

arr_x = list(range(0,100))
arr_y = list(range(0,100))
arr = np.stack([arr_x,arr_y])
arr = arr.T

Then your output (here for lote=4 and janela=10, the values implied by the printed arrays) will be:

[[[[ 0]
   [ 1]
   [ 2]
   ...
   [ 7]
   [ 8]
   [ 9]]

  [[ 1]
   [ 2]
   [ 3]
   ...
   [ 8]
   [ 9]
   [10]]

  [[ 2]
   [ 3]
   [ 4]
   ...
   [ 9]
   [10]
   [11]]

  [[ 3]
   [ 4]
   [ 5]
   ...
   [10]
   [11]
   [12]]]

...

 [[[86]
   [87]
   [88]
   ...
   [93]
   [94]
   [95]]

  [[87]
   [88]
   [89]
   ...
   [94]
   [95]
   [96]]

  [[88]
   [89]
   [90]
   ...
   [95]
   [96]
   [97]]

  [[89]
   [90]
   [91]
   ...
   [96]
   [97]
   [98]]]] [12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 
 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98]
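If even the stacked result is too large to hold in memory, one option is to yield one chunk at a time instead of collecting them all. The generator below is a sketch under that assumption, not part of the original answer, and the name `iter_chunks` is hypothetical. Unlike input_3D, it also yields the final chunk that ends on the last row of the data.

```python
import numpy as np

def iter_chunks(sequencia, lote, janela):
    """Hypothetical generator: yield (X_chunk, y_value) one chunk at a time."""
    # Number of chunks whose last window still ends inside the data.
    n_chunks = len(sequencia) - lote - janela + 2
    for j in range(max(n_chunks, 0)):
        # lote consecutive windows of janela rows, all columns but the last
        X = np.stack([sequencia[i:i + janela, :-1] for i in range(j, j + lote)])
        # target: last-column value at the final row of the chunk's last window
        y = sequencia[j + lote + janela - 2, -1]
        yield X, y

# Same input as above: two identical columns counting from 0 to 99.
arr = np.stack([np.arange(100), np.arange(100)]).T
for X_chunk, y_value in iter_chunks(arr, lote=4, janela=10):
    pass  # e.g. feed X_chunk and y_value to a training step here
print(X_chunk.shape, y_value)
```

Only one chunk of shape (lote, janela, n_features) is alive at any time, so peak memory no longer depends on the length of the series.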
