How do I scale a DataFrame that contains a datetime field (as the index)?

Problem description

I want to scale a DataFrame, but doing so raises the error shown below.

My data:

df.head()

    timestamp      open    high       low   close     volume
0  2020-06-25  303.4700  305.26  301.2800  304.16   46340400
1  2020-06-24  309.8400  310.51  302.1000  304.09  123867696
2  2020-06-23  313.4801  314.50  311.6101  312.05   68066900
3  2020-06-22  307.9900  311.05  306.7500  310.62   74007212
4  2020-06-19  314.1700  314.38  306.5300  308.64  135211345

My code:

# Converting the index as date
from datetime import datetime

df.index = pd.to_datetime(df.index)

# Split data
split = len(df) - int(len(df) * 0.8)
df_train = df.iloc[split:]
df_test = df.iloc[:split]

# Normalize
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
df_train = df_train.values.reshape(-1,1) #df_train = scaler.fit_transform(df_train)
df_test = df_test.values.reshape(-1,1) #df_test = scaler.fit_transform(df_train)

# Train the Scaler with training data and smooth data
timestep = 21
for i in range(0,len(df),timestep):
    df_train = scaler.fit_transform(df_train[i:i+timestep,:])
    #train_data[di:di+smoothing_window_size,:] = scaler.transform(train_data[di:di+smoothing_window_size,:])

# You normalize the last bit of remaining data
df_test = scaler.fit_transform(df_test[i+timestep:,:])
#train_data[di+timestep:,:] = scaler.transform(train_data[di+timestep:,:])

Error:

      2 timestep = 21
      3 for i in range(0,len(df),timestep):
----> 4     df_train = scaler.fit_transform(df_train[i:i+timestep,:])
      5     #train_data[di:di+smoothing_window_size,:] = scaler.transform(train_data[di:di+smoothing_window_size,:])

ValueError: could not convert string to float: '2020-05-28'

Any help would be appreciated.

Tags: python, pandas, numpy

Solution
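
Judging from the df.head() output, a likely cause is that timestamp is an ordinary column of strings while the index is the default 0, 1, 2, ..., so pd.to_datetime(df.index) converts the integer index rather than the dates, and df_train.values still contains strings such as '2020-05-28' when the scaler sees it. Below is a minimal sketch of one possible fix under that assumption. The sample frame is rebuilt from the five rows shown in the question, and the names train_scaled and test_scaled are illustrative, not from the original code.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Sample rows copied from the question's df.head() (assumed layout:
# "timestamp" is a regular string column, the index is 0..4).
df = pd.DataFrame({
    "timestamp": ["2020-06-25", "2020-06-24", "2020-06-23",
                  "2020-06-22", "2020-06-19"],
    "open": [303.47, 309.84, 313.4801, 307.99, 314.17],
    "high": [305.26, 310.51, 314.50, 311.05, 314.38],
    "low": [301.28, 302.10, 311.6101, 306.75, 306.53],
    "close": [304.16, 304.09, 312.05, 310.62, 308.64],
    "volume": [46340400, 123867696, 68066900, 74007212, 135211345],
})

# Parse the dates and move them into the index, so that the frame's
# values hold only numeric columns. Sort oldest to newest.
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.set_index("timestamp").sort_index()

# Chronological split: oldest 80% for training, the rest for testing.
split = int(len(df) * 0.8)
df_train, df_test = df.iloc[:split], df.iloc[split:]

# Fit the scaler on the training rows only, then reuse it on the test rows.
scaler = MinMaxScaler()
train_scaled = pd.DataFrame(
    scaler.fit_transform(df_train),
    index=df_train.index, columns=df_train.columns,
)
test_scaled = pd.DataFrame(
    scaler.transform(df_test),
    index=df_test.index, columns=df_test.columns,
)
```

Because the scaler is fitted on the training rows only, scaled test values can fall outside the 0 to 1 range. Two further points about the original loop, in case windowed scaling is still wanted: each iteration overwrites df_train with a single 21-row slice, and reshape(-1, 1) flattens all columns into one, so prices and volumes end up sharing a single scale.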

