How do I change the array shape for a CNN or LSTM?

Problem description

I have a table of roughly 250,000 x 300 containing data from an inertial measurement unit (IMU).

+----------+-----+-----------+----------+-----+-----------+----------+-----+-----------+
| acce_x_0 | ... | acce_x_99 | acce_y_0 | ... | acce_y_99 | acce_z_0 | ... | acce_z_99 |
+----------+-----+-----------+----------+-----+-----------+----------+-----+-----------+
| 1.3435   | ... | 1.7688    | -0.4566  | ... | -1.4554   | 9.6564   | ... | 9.5768    |
+----------+-----+-----------+----------+-----+-----------+----------+-----+-----------+

I would like to get a tensor like the one in the picture.


But when I try to change the shape of the array with np.reshape(data_imu.to_numpy(), newshape=(-1, 100, 3)), I get a different layout.

For example, data_imu[0][0].shape gives 3 instead of the 100 I expected.

Tags: python, numpy, deep-learning

Solution


As I understand it, each acceleration component (x, y, z) has several time-series samples.
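If the columns really are ordered x_0..x_99, y_0..y_99, z_0..z_99 as in the table header, that would explain the result in the question: reshaping to (-1, 100, 3) keeps the flat order, so every group of three consecutive x values is read as if it were one (x, y, z) triple. The sketch below illustrates this on a single hypothetical row whose values encode where they came from; the names row, flat and as_asked are made up for the illustration.

```python
import numpy as np

n_steps = 100

# Hypothetical single row laid out like the table: x_0..x_99, y_0..y_99, z_0..z_99.
# Each value encodes its origin as axis_index * 1000 + time_step.
row = np.concatenate([axis * 1000 + np.arange(n_steps) for axis in range(3)])
flat = row.reshape(1, -1)                  # shape (1, 300)

as_asked = flat.reshape(-1, n_steps, 3)    # the reshape used in the question

print(as_asked.shape)           # (1, 100, 3)
print(as_asked[0, 0].shape)     # (3,)
print(as_asked[0, 0].tolist())  # [0, 1, 2]: x_0, x_1, x_2 rather than x_0, y_0, z_0
```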

One solution is to separate the data for each component and then reassemble the pieces into a 3D array.

Here is a simple example:



import numpy as np
import pandas as pd

# Create toy data: 10 time-series samples (columns) with 100 values each (rows)
data = np.random.uniform(-1, 1, size=(100, 10))

df = pd.DataFrame(data=data, columns=['device_sample_' + str(i) for i in range(10)])


"""
df.head() looks like this (values are random; the headers are shortened
from device_sample_0 ... device_sample_9):

device_0  device_1  device_2  device_3  device_4  device_5  device_6  device_7 device_8 device_9
0  0.846339  0.014831  0.380373  0.910142  0.283169  0.926771  0.651504  0.267011 -0.735348 -0.563671
1 -0.076040 -0.107705  0.783594 -0.731901  0.328230  0.104527  0.373363  0.135972  0.145868 -0.068370
2 -0.914331 -0.106772 -0.111691 -0.747672 -0.367210  0.293646  0.278765 -0.659683  0.464896  0.675855
3  0.008376  0.823489  0.017261  0.540690 -0.052503  0.396828 -0.219417 -0.872403 -0.631343  0.288238
4 -0.317125  0.662676 -0.912503 -0.047759  0.286468 -0.938535 -0.962357  0.922892  0.168540  0.847411
"""
# To keep it simple, assume there are 2 devices: the even-numbered samples
# belong to the first device and the odd-numbered samples to the second.


# First, group the sample columns that belong to each device


df_device1=df[['device_sample_0','device_sample_2','device_sample_4','device_sample_6','device_sample_8']]

# (A loop or a slice such as df.iloc[:, 0::2] could select these columns instead.)


df_device2=df[['device_sample_1','device_sample_3','device_sample_5','device_sample_7','device_sample_9']]


data_dev1 = df_device1.to_numpy()


data_dev2 = df_device2.to_numpy()


print(data_dev1.shape)
# (100, 5): device 1 has 5 time-series samples of 100 values each
print(data_dev2.shape)
# (100, 5): same for device 2


# Now build the 3D array by stacking the two devices along a third axis

final_data = np.dstack((data_dev1, data_dev2))

print(final_data.shape)

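# (100, 5, 2)
# rows    : time steps
# columns : samples
# depth   : devices

# This differs from the picture. To get shape (5, 100, 2), swap the first two
# axes with final_data.transpose(1, 0, 2). Note that reshape(5, 100, 2) would
# give the same shape but scramble the values.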
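Applied to the table in the question, the same separate-and-restack idea could look like the sketch below. It assumes the columns are named acce_x_0 ... acce_x_99, acce_y_0 ..., acce_z_0 ... as in the table header; the 4-row random DataFrame and the names n_rows, n_steps, per_axis, tensor and shortcut are stand-ins introduced for the illustration, not part of the original data.

```python
import numpy as np
import pandas as pd

n_rows, n_steps = 4, 100  # small stand-in for the ~250,000-row table

# Hypothetical stand-in for data_imu: columns ordered x_0..x_99, y_0..y_99, z_0..z_99
columns = [f"acce_{axis}_{t}" for axis in "xyz" for t in range(n_steps)]
data_imu = pd.DataFrame(
    np.random.uniform(-1, 1, size=(n_rows, 3 * n_steps)), columns=columns
)

# Select each component's columns by name, in time order
per_axis = [
    data_imu[[f"acce_{axis}_{t}" for t in range(n_steps)]].to_numpy()
    for axis in "xyz"
]  # three arrays, each of shape (n_rows, n_steps)

# Stack the components along a new last axis: (rows, time steps, axes)
tensor = np.stack(per_axis, axis=-1)
print(tensor.shape)            # (4, 100, 3)
print(tensor[0, :, 0].shape)   # (100,): the x series of the first row

# Shortcut that relies on the columns being grouped by component:
# reshape to (rows, axes, time steps), then swap the last two axes.
shortcut = data_imu.to_numpy().reshape(-1, 3, n_steps).transpose(0, 2, 1)
print(np.array_equal(tensor, shortcut))  # True
```

The (rows, time steps, axes) layout matches the (batch, timesteps, features) input that Keras recurrent layers expect. If data[0][0] should instead hold the 100 values of one component, as the question suggests, np.stack(per_axis, axis=1) or the reshape to (-1, 3, 100) without the transpose gives the (rows, axes, time steps) layout.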
