Cause of a MemoryError: NumPy division by the standard deviation

Problem description

I am trying to standardize my training and test sets for the MNIST dataset. Here is my code:

import numpy as np
import pandas as pd

prediction = pd.read_csv("sample_submission.csv")
test_csv = pd.read_csv("test.csv")
train_csv = pd.read_csv("train.csv")

train = train_csv.values.T  # turn train set data frame to numpy array
test = test_csv.values.T
y_values = train[[0], :]  # bring y values  [3,1,4,6,2,0,...]
train = train[1:, :]

y = np.zeros((10, y_values.shape[1]))
for i in range(y_values.shape[1]):
    y[y_values[0][i]][i] = 1  # one-hot encoding

# scale both sets by their standard deviation (this alone does not map values into (0, 1))
train = np.divide(train, np.std(train))
test = np.divide(test, np.std(test))

Everything seems to work, except that the last part, where I divide the test set by its standard deviation, raises a MemoryError.

Traceback (most recent call last):
  File "C:/Users/falco/PycharmProjects/Digit-Recognizer/main.py", line 26, in <module>
    test = np.divide(test, np.std(test))
  File "C:\Users\falco\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 3242, in std
    **kwargs)
  File "C:\Users\falco\Anaconda3\lib\site-packages\numpy\core\_methods.py", line 140, in _std
    keepdims=keepdims)
  File "C:\Users\falco\Anaconda3\lib\site-packages\numpy\core\_methods.py", line 117, in _var
    x = asanyarray(arr - arrmean)
MemoryError

Any help or ideas about why this happens would be much appreciated!

Tags: python, numpy, memory

Solution
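
The traceback ends inside np.std at `x = asanyarray(arr - arrmean)`, which suggests that the allocation that failed is the temporary array NumPy builds to hold every element's deviation from the mean. For an integer input that temporary is a full-size float64 array, on top of everything already held in memory. The following is a minimal sketch of one way around it, not a confirmed fix for this exact setup: it assumes a 2-D array, computes the standard deviation in row chunks so that only a small temporary exists at any time, and divides a float32 copy in place. The names `chunked_std`, `scale_by_std`, and `chunk_rows` are hypothetical helpers introduced here for illustration.

```python
import numpy as np


def chunked_std(arr, chunk_rows=64):
    """Population standard deviation of all elements of a 2-D array.

    Works through the array a few rows at a time, so the largest
    temporary is one chunk rather than a full-size float64 copy.
    """
    n = arr.size

    # First pass: accumulate the sum in float64 to get the mean.
    total = 0.0
    for start in range(0, arr.shape[0], chunk_rows):
        total += arr[start:start + chunk_rows].sum(dtype=np.float64)
    mean = total / n

    # Second pass: accumulate squared deviations from the mean.
    sq = 0.0
    for start in range(0, arr.shape[0], chunk_rows):
        diff = arr[start:start + chunk_rows].astype(np.float64) - mean
        sq += np.sum(diff * diff)

    return np.sqrt(sq / n)


def scale_by_std(arr, chunk_rows=64):
    """Return a float32 copy of arr divided by its standard deviation."""
    out = arr.astype(np.float32)  # single copy, half the size of float64
    out /= np.float32(chunked_std(out, chunk_rows))  # in place, no new array
    return out
```

With these helpers, the last two lines of the question would become `train = scale_by_std(train)` and `test = scale_by_std(test)`. If the interpreter is a 32-bit Python build, its address-space limit may still be the underlying constraint, so that is worth checking as well.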

