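python - Need help debugging a shallow neural network built with numpy
Problem description
I am learning by doing and have built a model in Python with numpy that is trained on the breast cancer dataset from the sklearn library. The model runs without any errors and gives training and test accuracies of 92.48826291079813% and 90.9090909090909%, respectively. Still, I cannot complete the hands-on exercise, (probably) because my results differ from the expected ones. I don't know where the problem is, since I don't know the correct answer and I don't see any errors.
I would appreciate it if someone could help me with this. The code is below.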
# Import numpy as np and pandas as pd
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
**Define method initialiseNetwork(): initialise the weights with zeros of shape (num_features, 1) and the bias b to zero
parameters: num_features (number of input features)
returns: dictionary of weight vector and bias**
def initialiseNetwork(num_features):
    W = np.zeros((num_features, 1))
    b = 0
    parameters = {"W": W, "b": b}
    return parameters
**Define function sigmoid() for the input z.
parameters: z
returns: $1/(1+e^{-z})$**
def sigmoid(z):
    a = 1/(1 + np.exp(-z))
    return a
**Define method forwardPropagation(), which implements forward propagation defined as Z = (W.T dot_product X) + b, A = sigmoid(Z)
parameters: X, parameters
returns: A**
def forwardPropagation(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    Z = np.dot(W.T,X) + b
    A = sigmoid(Z)
    return A
**Define function cost(), which calculates the cost given by −(sum(Y\*log(A)+(1−Y)\*log(1−A)))/num_samples, where \* is the elementwise product
parameters: A, Y, num_samples (number of samples)
returns: cost**
def cost(A, Y, num_samples):
    cost = -1/num_samples * np.sum(Y*np.log(A) + (1-Y)*(np.log(1-A)))
    #cost = Y*np.log(A) + (1-Y)*(np.log(1-A))
    return cost
**Define method backPropagration() to get the derivatives of the weights and bias
parameters: X, Y, A, num_samples
returns: dW, db**
def backPropagration(X, Y, A, num_samples):
    dZ = A - Y
    dW = (np.dot(X,dZ.T))/num_samples  # (X dot_product dZ.T)/num_samples
    db = np.sum(dZ)/num_samples  # sum(dZ)/num_samples
    return dW, db
**Define function updateParameters() to update the current parameters with their derivatives
w = w - learning_rate \* dw
b = b - learning_rate \* db
parameters: parameters, dW, db, learning_rate
returns: dictionary of updated parameters**
def updateParameters(parameters, dW, db, learning_rate):
    W = parameters["W"] - (learning_rate * dW)
    b = parameters["b"] - (learning_rate * db)
    return {"W": W, "b": b}
**Define the model for forward propagation
parameters: X, Y, num_iter (number of iterations), learning_rate
returns: parameters (dictionary of updated weights and bias)**
def model(X, Y, num_iter, learning_rate):
    num_features = X.shape[0]
    num_samples = X.shape[1]
    parameters = initialiseNetwork(num_features)  # call initialiseNetwork()
    for i in range(num_iter):
        #A = forwardPropagation(X, Y, parameters) # calculate final output A from forwardPropagation()
        A = forwardPropagation(X, parameters)
        if(i%100 == 0):
            print("cost after {} iteration: {}".format(i, cost(A, Y, num_samples)))
        dW, db = backPropagration(X, Y, A, num_samples)  # calculate derivatives from backpropagation
        parameters = updateParameters(parameters, dW, db, learning_rate)  # update parameters
    return parameters
**Run the cell below to define the function that predicts the output. It takes the updated parameters and the input data as function parameters and returns the predicted output**
def predict(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    b = b.reshape(b.shape[0],1)
    Z = np.dot(W.T,X) + b
    Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
    return Y
**The code in the cell below loads the breast cancer data set from sklearn.
The input variable (X_cancer) holds the dimensions of the tumor cells, and the target variable (y_cancer) classifies the tumor as malignant (0) or benign (1)**
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
**Split the data into train and test sets using train_test_split(). Set the random state to 25. Refer to the code snippet in topic 4**
X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer,
                                                    random_state = 25)
**Since the dimensions of the tumors are not uniform, you need to normalize the data before feeding it to the network.
The function below is used to normalize the input data.**
def normalize(data):
    col_max = np.max(data, axis = 0)
    col_min = np.min(data, axis = 0)
    return np.divide(data - col_min, col_max - col_min)
** Normalize X_train and X_test and assign it to X_train_n and X_test_n respectively **
X_train_n = normalize(X_train)
X_test_n = normalize(X_test)
**Transpose X_train_n and X_test_n so that rows represent features and columns represent samples.
Reshape y_train and y_test into row vectors whose length is equal to the number of samples. Use np.reshape()**
X_trainT = X_train_n.T
#print(X_trainT.shape)
X_testT = X_test_n.T
#print(X_testT.shape)
y_trainT = y_train.reshape(1,X_trainT.shape[1])
y_testT = y_test.reshape(1,X_testT.shape[1])
**Train the network using X_trainT, y_trainT with the number of iterations set to 4000 and a learning rate of 0.75**
parameters = model(X_trainT, y_trainT, 4000, 0.75)  # call the model() function with the parameters mentioned in the above cell
**Predict the output for the train and test data, X_trainT and X_testT, using the predict() method. Use the parameters returned from the trained model**
yPredTrain = predict(X_trainT, parameters)  # pass weights and bias from parameters dictionary and X_trainT as input to the function
yPredTest = predict(X_testT, parameters)  # pass the same parameters but X_testT as input data
**Run the cell below to print the accuracy of the model on the train and test data.**
accuracy_train = 100 - np.mean(np.abs(yPredTrain - y_trainT)) * 100
accuracy_test = 100 - np.mean(np.abs(yPredTest - y_testT)) * 100
print("train accuracy: {} %".format(accuracy_train))
print("test accuracy: {} %".format(accuracy_test))
My output:
train accuracy: 92.48826291079813 %
test accuracy: 90.9090909090909 %
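Solution
I figured out where the problem was. It was the third line of the predict function, where I was reshaping the bias, which is not necessary at all.
def predict(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    b = b.reshape(b.shape[0],1)  # <-- this reshape is unnecessary and should be removed
    Z = np.dot(W.T,X) + b
    Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
    return Y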
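For comparison, here is a minimal sketch of predict without the reshape. It assumes the bias `b` is a scalar, which is what the `np.sum`-based update produces. The vectorized comparison `A > 0.5` is a substitute for the list comprehension in the original code, not the author's own wording, and should give the same 0/1 row vector.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(X, parameters):
    W = parameters["W"]              # shape (num_features, 1)
    b = parameters["b"]              # assumed to be a scalar bias
    A = sigmoid(np.dot(W.T, X) + b)  # shape (1, num_samples)
    # Threshold at 0.5, as in the original list comprehension.
    return (A > 0.5).astype(int)

# Tiny illustrative input (made-up numbers, not the breast cancer data).
params = {"W": np.array([[1.0], [-1.0]]), "b": 0.0}
X = np.array([[2.0, 0.0],
              [0.0, 2.0]])
print(predict(X, params))  # [[1 0]]
```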
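The third line of the backpropagation function also needed to be corrected to np.sum(dZ)/num_samples. The built-in sum() only adds along the first axis, so for a dZ of shape (1, num_samples) it returns an array of shape (num_samples,) rather than a scalar.
def backPropagration(X, Y, A, num_samples):
    dZ = A - Y
    dW = (np.dot(X,dZ.T))/num_samples
    db = sum(dZ)/num_samples  # <-- wrong: should be np.sum(dZ)/num_samples
    return dW, db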
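The difference between the two calls is easy to see on a small array. The sketch below uses made-up values for `dZ` (three samples) purely for illustration.

```python
import numpy as np

dZ = np.array([[0.2, -0.1, 0.4]])    # shape (1, 3): one row, three samples
num_samples = dZ.shape[1]

db_builtin = sum(dZ) / num_samples     # built-in sum iterates over rows only
db_numpy = np.sum(dZ) / num_samples    # np.sum adds every element

print(db_builtin.shape)    # (3,)
print(np.ndim(db_numpy))   # 0
```

With the built-in `sum`, `b` silently becomes a vector with one entry per training sample after the first update, which still broadcasts in the forward pass and therefore raises no error.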
After I corrected these two functions, the model gave me a training accuracy of 98.59154929577464% and a test accuracy of 93.00699300699301%.
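Because this kind of bug produces no error message, one possible way to catch it earlier is to check the gradient shapes and compare `db` against a finite-difference estimate. The following is only a sketch under stated assumptions: the helper names `cost_fn`, `gradients` and `numerical_db` are hypothetical and not part of the original exercise, and the data is random rather than the breast cancer set.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_fn(W, b, X, Y):
    # Cross-entropy cost from the question, written as a function of W and b.
    m = X.shape[1]
    A = sigmoid(np.dot(W.T, X) + b)
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

def gradients(W, b, X, Y):
    # Analytic gradients, using np.sum for db as in the corrected code.
    m = X.shape[1]
    dZ = sigmoid(np.dot(W.T, X) + b) - Y
    return np.dot(X, dZ.T) / m, np.sum(dZ) / m

def numerical_db(W, b, X, Y, eps=1e-6):
    # Central finite difference of the cost with respect to the bias.
    return (cost_fn(W, b + eps, X, Y) - cost_fn(W, b - eps, X, Y)) / (2 * eps)

rng = np.random.default_rng(0)
X = rng.random((3, 5))                  # 3 features, 5 samples
Y = rng.integers(0, 2, size=(1, 5))     # random 0/1 labels
W = rng.normal(size=(3, 1))
b = 0.1

dW, db = gradients(W, b, X, Y)
assert dW.shape == W.shape              # dW must match the weight shape
assert np.ndim(db) == 0                 # db must be a scalar
assert abs(db - numerical_db(W, b, X, Y)) < 1e-6
```

With the built-in `sum` in place of `np.sum`, the `np.ndim(db) == 0` check would fail immediately.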