python - Keras multilayer perceptron shows loss = nan during training
Problem description
I have data like this in data_2.csv.
a b c d e outcome
2 9 5 10175 3500 10000
1 3 4 23085 35000 34000
2 1 3 NaN 23283.33333 50000
....
I am trying to train an MLP on it. The outcome column is the target output. Here is my code.
df = pd.read_csv('C://data_2.csv')
sc = MinMaxScaler()
X = sc.fit_transform(df.drop('income', axis=1).astype(float))
test= df[['outcome']]
y = sc.fit_transform(test.astype(float))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=20, test_size=0.1)
model = Sequential()
model.add(Dense(32,input_shape=(5,), activation='relu'))
model.add(Dense(32,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1,activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=1)
y_pred = model.predict(X_test)
print("##########################################")
print(y_pred)
When I train on the data, it shows loss: nan, like this:
Epoch 1/200
45000/45000 [==============================] - 2s 48us/step - loss: nan
Epoch 2/200
45000/45000 [==============================] - 2s 38us/step - loss: nan
After training finishes, it prints output like this:
##########################################
[[nan]
[nan]
[nan]
...
[nan]
[nan]
[nan]]
X_train.shape is (45000, 5) and y_train.shape is (45000, 1). All outputs are NaN. How can I fix this?
Solution
The most prominent problem in your code is that you aren't cleaning your data. The sample you posted already contains a NaN (column d, third row), and any NaN that reaches the network is carried through every weighted sum into the loss. Beyond that, each Dense layer computes weighted sums of its inputs and hands them to the next layer. Imagine you have 32 nodes in the first layer and the largest positive number in your data is about 35,000. If values of that size are multiplied by weights and summed layer after layer, they can exceed the floating-point limit and turn into NaN within a few epochs.
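To make the data-cleaning point concrete, here is a minimal pure-Python sketch, not the asker's actual pipeline. The three rows mirror the sample in the question. The weights and the helper names (`weighted_sum`, `drop_rows_with_nan`, `loss_for`) are illustrative assumptions. It shows that a single NaN feature turns the whole mean squared error into NaN, and that removing the incomplete row makes the loss finite again.

```python
import math

# Rows mirroring the sample in the question: a, b, c, d, e, outcome.
rows = [
    [2.0, 9.0, 5.0, 10175.0, 3500.0, 10000.0],
    [1.0, 3.0, 4.0, 23085.0, 35000.0, 34000.0],
    [2.0, 1.0, 3.0, math.nan, 23283.33333, 50000.0],
]

# Arbitrary illustrative weights (an assumption, not learned values).
weights = [0.1, 0.1, 0.1, 0.0001, 0.0001]


def weighted_sum(features, weights, bias=0.0):
    """What a single Dense unit computes before its activation."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def mean_squared_error(targets, predictions):
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def drop_rows_with_nan(data):
    """Keep only the rows in which every value is a real number."""
    return [row for row in data if not any(math.isnan(v) for v in row)]


def loss_for(data):
    predictions = [weighted_sum(row[:5], weights) for row in data]
    targets = [row[5] for row in data]
    return mean_squared_error(targets, predictions)


dirty_loss = loss_for(rows)            # one NaN feature poisons the mean
clean_rows = drop_rows_with_nan(rows)
clean_loss = loss_for(clean_rows)      # finite once the bad row is gone

print(math.isnan(dirty_loss), math.isnan(clean_loss))
```

In the question's pandas code, the corresponding step would be calling `df.dropna()` (or filling the gaps with `df.fillna(...)`) before the data is scaled and passed to the model.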
Your activation function, relu, contributes to this. It passes positive numbers (zero or greater) through unchanged and turns any negative number into zero, so it puts no upper bound on the activations. With large inputs, your first layers can produce astronomically large numbers!
I recommend changing your activation to a sigmoid function. It squashes any number into the range between 0 and 1, so even large inputs produce activations with absolute values no greater than 1.
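The difference between the two activations can be sketched in a few lines of plain Python. This is an illustration only: the `relu` and `sigmoid` helpers below are hand-written stand-ins, not the Keras implementations, and the input values are picked to match the magnitudes mentioned above.

```python
import math


def relu(x):
    """Pass positive values through unchanged; clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into the range 0..1.

    Written in a numerically stable form so that exp() is never called
    on a large positive argument.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


inputs = (-35000.0, -2.0, 0.0, 2.0, 35000.0)
for x in inputs:
    # relu grows without bound for positive x; sigmoid stays within 0..1.
    print(x, relu(x), sigmoid(x))
```

In the question's model, the equivalent change would be passing `activation='sigmoid'` instead of `activation='relu'` to the hidden `Dense` layers.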
Hope this helps.