After scaling or normalizing values the graph is changed (Python, sklearn, pandas)

Problem description

I am trying to scale (normalize) the inputs of my training dataset using sklearn.preprocessing.MinMaxScaler(). After normalizing the input, the plotted graph looks very strange: it no longer resembles the input before normalization. Please have a look at the issue below. Reference for normalizing pandas columns: https://stackoverflow.com/a/26415620/4948889

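import pandas as pd
from sklearn.preprocessing import MinMaxScaler

def normalize_data(df):
    # Take the column names from the argument, not from the global df_stock
    cols = list(df.columns.values)
    x = df.values  # returns a numpy array
    # MinMaxScaler rescales each column independently to [0, 1]
    min_max_scaler = MinMaxScaler()
    x_scaled = min_max_scaler.fit_transform(x)
    # Rebuild the DataFrame with the original column names and index
    return pd.DataFrame(x_scaled, columns=cols, index=df.index)

# df is the stock DataFrame loaded earlier (loading code not shown in the question)
df_stock = df.copy()
cols = list(df_stock.columns.values)
print('df_stock.columns.values = ', cols)
df_stock_norm = normalize_data(df_stock.copy())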

See the before and after graphs of the input dataframes. Before:

import matplotlib.pyplot as plt

# Single full-width axes, so the layout matches the "after" plot
plt.figure(figsize=(15, 5))
plt.plot(df.open.values[:20], color='red', label='open')
plt.plot(df.close.values[:20], color='green', label='close')
plt.plot(df.low.values[:20], color='blue', label='low')
plt.plot(df.high.values[:20], color='black', label='high')
plt.title('stock price')
plt.xlabel('time [days]')
plt.ylabel('price')
plt.legend(loc='best')
plt.show()

Output: [figure: plot of the input before normalization]

After:

plt.figure(figsize=(15, 5))
plt.plot(df_stock_norm.open.values[:20], color='red', label='open')
# The label was 'low' in the original post; this line plots the close column
plt.plot(df_stock_norm.close.values[:20], color='green', label='close')
plt.plot(df_stock_norm.low.values[:20], color='blue', label='low')
plt.plot(df_stock_norm.high.values[:20], color='black', label='high')
plt.title('stock')
plt.xlabel('time [days]')
plt.ylabel('normalized price')
plt.legend(loc='best')
plt.show()

Output: [figure: plot of the input after normalization]

Explanation:
Please compare the high and low curves in the two graphs. After normalization they look strange and differ from the originals, which should not happen: I expected the curves to keep their shape relative to each other.

Please let me know what I did wrong.
EDITED:

Here is a sample dataset for testing the above

Tags: python, python-3.x, pandas, scikit-learn

Solution
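
A likely explanation, offered as a sketch rather than a definitive answer: MinMaxScaler rescales each column independently, using that column's own minimum and maximum. The open, close, low and high columns are therefore each stretched to [0, 1] separately, and the vertical relationships between the curves (for example low <= open) are no longer guaranteed. The toy prices below are hypothetical, not the asker's data, and the shared min/max rescaling is just one possible alternative, assuming the goal is to keep the four curves comparable on one axis.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy prices (not the asker's data).
# In every row: low <= open, close <= high.
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 11.5],
    "close": [10.5, 11.5, 11.0, 12.0],
    "low":   [9.5, 10.5, 10.5, 11.0],
    "high":  [11.0, 12.0, 12.5, 12.5],
})

# Per-column scaling: each column is mapped to [0, 1] with its own min and max.
per_column = pd.DataFrame(
    MinMaxScaler().fit_transform(df), columns=df.columns, index=df.index
)

# Shared scaling: one global min and max for all price columns,
# so the ordering of the curves within each row is preserved.
global_min = df.to_numpy().min()
global_max = df.to_numpy().max()
shared = (df - global_min) / (global_max - global_min)

print("Per-column scaling:")
print(per_column)
print("Shared scaling:")
print(shared)
```

With per-column scaling, every column spans the full [0, 1] range, so in some rows the scaled low ends up above the scaled open. With the shared scaling the curves keep their original shape and only the axis changes. If the downstream model needs each feature scaled independently, the per-column result is still valid; it just should not be expected to look like the original plot.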

