How to add L1 regularization to a single-layer perceptron network?

Problem description

I am struggling to understand how to implement L1 regularization in my single-layer perceptron network, and how L1 affects the weight update when it is used together with an MSE loss. The weight update is given as:

[image: the weight-update rule for MSE with an L1 penalty]

But I don't understand how to derive the above function... Below is the code for my network; any help is greatly appreciated!
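For reference, with an MSE cost plus an L1 penalty of strength λ (my notation; the screenshot may use different symbols), the update can be derived as follows. For output neuron k with activation y_k = f(h_k) and h_k = Σ_l w_{kl} x_l + b_k:

$$E = \frac{1}{2}\sum_k (d_k - y_k)^2 + \lambda \sum_{k,l} \lvert w_{kl} \rvert$$

$$\frac{\partial E}{\partial w_{kl}} = -(d_k - y_k)\, f'(h_k)\, x_l + \lambda \operatorname{sign}(w_{kl})$$

$$\Delta w_{kl} = -\eta\, \frac{\partial E}{\partial w_{kl}} = \eta \left( \delta_k x_l - \lambda \operatorname{sign}(w_{kl}) \right), \qquad \delta_k = (d_k - y_k)\, f'(h_k)$$

Here δ_k corresponds to delta1 in the code below, and sign(w) is the subgradient of |w| (taken as 0 at w = 0), so the only effect of the L1 term is an extra −ηλ sign(w) pulling every weight toward zero.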

# Train a single layer perceptron: input -> output
import numpy as np

# Assumes W1, bias_W1, x_train, y_train, errors, eta, n_epoch, n_batches,
# batch_size, n_samples, n_input_layer, n_output_layer, relu and grad_relu
# are defined earlier in the script.

# Decay rate and storage for an exponential moving average of the weight updates
tau = 0.01
a_n = np.zeros((n_epoch, n_output_layer, n_input_layer))

for i in range(0, n_epoch):
    
    # Shuffle the order of samples each epoch
    shuffled_idxs = np.random.permutation(n_samples)
    
    for batch in range(0, n_batches):
        # Initialise the gradient accumulator for this batch
        dW1 = np.zeros(W1.shape)
        # Initialise the bias gradient accumulator
        dbias_W1 = np.zeros(bias_W1.shape)
        
        # Loop over each sample in the batch
        for j in range(0, batch_size):
            # Input (random element from the dataset)
            idx = shuffled_idxs[batch*batch_size + j]
            x0 = x_train[idx]

            # Form the desired output: the correct neuron should be 1, the rest 0
            desired_output = y_train[idx]

            # Neural activation: input layer -> output layer
            h1 = np.dot(W1, x0) + bias_W1

            # Apply the ReLU activation
            x1 = relu(h1)
            
            # Compute the error signal
            e_n = desired_output - x1

            # Backpropagation: output layer -> input layer
            # (for ReLU, grad_relu(x1) and grad_relu(h1) give the same mask,
            # since x1 > 0 exactly where h1 > 0)
            delta1 = grad_relu(x1) * e_n
            
            # Compute the change in weight and bias
            dW1 += np.outer(delta1, x0)
            dbias_W1 += delta1

            # Store the error per epoch
            errors[i] = errors[i] + 0.5 * np.sum(np.square(e_n))/n_samples
            
        # After each batch update the weights using accumulated gradients
        W1 += eta*dW1/batch_size
        
        dW = eta * dW1 / batch_size
        
        # Exponential moving average of the batch updates, to show convergence
        if i == 0:
            a_n[i] = dW
        else:
            a_n[i] = a_n[i-1] * (1 - tau) + (tau * dW)
            
        # Update the bias
        bias_W1 += eta*dbias_W1/batch_size
        
    print( "Epoch ", i+1, ": error = ", errors[i])

Tags: neural-network, perceptron, regularized

Solution


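A minimal sketch of how that update can be folded into the batch step of the posted code, reusing its variable names (W1, dW1, bias_W1, dbias_W1, eta, batch_size) and introducing a hypothetical hyperparameter lambda_l1; the stand-in initialisations below just make the snippet runnable on its own:

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(10, 784)) * 0.01    # stand-in for the question's weight matrix
dW1 = rng.normal(size=W1.shape)           # stand-in for the accumulated batch gradient
bias_W1 = np.zeros(10)                    # stand-in bias vector
dbias_W1 = rng.normal(size=bias_W1.shape)
eta = 0.1
batch_size = 32
lambda_l1 = 1e-4                          # L1 strength (hypothetical value; tune it)

# L1-regularized batch update: the penalty lambda * sum|w| contributes
# lambda * sign(w) to the gradient, so the step subtracts it from dW1/batch_size.
W1 += eta * (dW1 / batch_size - lambda_l1 * np.sign(W1))

# Biases are conventionally left out of the L1 penalty:
bias_W1 += eta * dbias_W1 / batch_size

Note that np.sign(0.0) == 0, which matches the usual subgradient choice at w = 0, and that applying the penalty once per batch update is equivalent to adding lambda_l1 * Σ|w| to the per-batch cost.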