Keras multilayer perceptron training shows loss = nan

Problem description

I have data like this in data_2.csv:

a   b   c   d        e         outcome
2   9   5   10175   3500        10000
1   3   4   23085   35000       34000
2   1   3   NaN     23283.33333 50000
....

I tried to train an MLP on it. The outcome column is the target output. Here is my code:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout

df = pd.read_csv('C://data_2.csv')

sc = MinMaxScaler()
X = sc.fit_transform(df.drop('outcome', axis=1).astype(float))

test = df[['outcome']]

y = sc.fit_transform(test.astype(float))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=20, test_size=0.1)

model = Sequential()
model.add(Dense(32,input_shape=(5,), activation='relu'))
model.add(Dense(32,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1,activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()

model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=1)

y_pred = model.predict(X_test)

print("##########################################")
print(y_pred)

When I train, it shows loss: nan like this:

Epoch 1/200
45000/45000 [==============================] - 2s 48us/step - loss: nan
Epoch 2/200
45000/45000 [==============================] - 2s 38us/step - loss: nan

After training finishes, the predictions look like this:

##########################################
[[nan]
 [nan]
 [nan]
 ...
 [nan]
 [nan]
 [nan]]

X_train.shape is (45000, 5) and y_train.shape is (45000, 1). All outputs are NaN. How can I fix this?

Tags: python, tensorflow, machine-learning, keras

Solution


The most prominent problem in your code is that you aren't cleaning your data. The sample you posted already contains a NaN (column d, third row), and a NaN anywhere in the input propagates through every weighted sum in every Dense layer, so the loss is NaN from the very first batch and every prediction comes out as NaN. Magnitude matters too: each Dense layer computes weighted sums of its inputs, so with raw values around 35,000 the activations and gradients can grow layer after layer until they overflow after just a few epochs.
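As a minimal sketch of that cleaning step (using pandas, with a tiny hypothetical frame standing in for data_2.csv), rows with NaN can either be dropped or imputed before anything is fed to the scaler and the network:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the question's columns
df = pd.DataFrame({
    'a': [2, 1, 2], 'b': [9, 3, 1], 'c': [5, 4, 3],
    'd': [10175.0, 23085.0, np.nan],
    'e': [3500.0, 35000.0, 23283.33],
    'outcome': [10000, 34000, 50000],
})

# Any NaN left in the features makes the loss NaN,
# so either drop the incomplete rows ...
dropped = df.dropna()

# ... or impute them, e.g. with the column median
imputed = df.fillna(df.median())

print(dropped.shape)               # (2, 6) -- the NaN row is gone
print(imputed.isna().sum().sum())  # 0 -- no NaN remains
```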

The relu activation does nothing to contain this: it only zeroes out negative values and passes positive values through unchanged, so large positive activations in your initial nodes stay large.

I recommend changing your activation to a sigmoid. This function squashes any input into the open interval (0, 1), so even very large inputs are mapped to values with absolute value below 1.
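Putting it together, here is a sketch of the model with sigmoid hidden activations as suggested. Note that it also swaps the single-unit softmax output for a plain linear unit, since softmax over one unit always emits exactly 1; random toy data below stands in for the cleaned, min-max-scaled dataset:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Toy stand-in for the cleaned, min-max-scaled data (values in [0, 1])
rng = np.random.default_rng(0)
X = rng.random((256, 5)).astype('float32')
y = rng.random((256, 1)).astype('float32')

model = Sequential([
    Dense(32, input_shape=(5,), activation='sigmoid'),  # sigmoid keeps activations in (0, 1)
    Dense(32, activation='sigmoid'),
    Dropout(0.5),
    Dense(1),  # linear output: softmax over a single unit would always output 1
])
model.compile(loss='mean_squared_error', optimizer='adam')

history = model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(history.history['loss'])  # finite values, no nan
```

With clean inputs and this output layer, the loss decreases normally instead of going to NaN.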

Hope this helps.

