L1-regularized neural network in PyTorch does not yield a sparse solution

Problem description

I'm implementing a neural network with L1 regularization in PyTorch by adding the L1-norm penalty directly to the loss function. The setup is basically the same as in Lack of Sparse Solution with L1 Regularization in Pytorch, but the solution is not sparse no matter how I adjust the tuning parameter. How do I make the solution sparse?

My code is pasted below.

from typing import List

import torch
import torch.nn as nn


class NeuralNet(nn.Module):
    """
    neural network class, with nn api
    """
    def __init__(self, input_size: int, hidden_size: List[int], output_size: int):
        """
        initialization function
        @param input_size: input data dimension
        @param hidden_size: list of hidden layer sizes, arbitrary length
        @param output_size: output data dimension
        """
        super().__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        """layers"""
        self.input = nn.Linear(self.input_size, self.hidden_size[0], bias=False)
        self.hiddens = nn.ModuleList([
            nn.Linear(self.hidden_size[h], self.hidden_size[h + 1]) for h in range(len(self.hidden_size) - 1)])
        self.output = nn.Linear(hidden_size[-1], output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        forward propagation process, required by the nn.Module class
        @param x: the input data
        @return: the output from neural network
        """
        x = self.input(x)
        x = self.relu(x)
        for hidden in self.hiddens:
            x = hidden(x)
            x = self.relu(x)
        x = self.output(x)
        x = self.softmax(x)
        return x

def estimate(self, x, y, l1: bool = False, lam: float = None, learning_rate: float = 0.1,
             batch_size: int = 32, epochs: int = 50):
    """
    estimates the neural network model
    @param x: training data
    @param y: training label
    @param l1: whether to use l1 norm regularization
    @param lam: tuning parameter
    @param learning_rate: learning rate
    @param batch_size: batch size
    @param epochs: number of epochs
    @return: null
    """
    input_size = x.shape[1]
    hidden_size = [50, 30, 10]
    output_size = 2
    model = NeuralNet(input_size, hidden_size, output_size)
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    trainset = []
    for i in range(x.shape[0]):
        trainset.append([x[i, :], y[i]])
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
    for e in range(epochs):
        running_loss = 0
        for data, label in trainloader:
            input_0 = data.view(data.shape[0], -1)
            optimizer.zero_grad()
            output = model(input_0.float())
            loss = torch.nn.CrossEntropyLoss()(output, label.squeeze(1))
            if l1:
                if lam is None:
                    raise ValueError("lam needs to be specified when l1 is True.")
                else:
                    for w in model.parameters():
                        if w.dim() > 1:
                            loss = loss + lam * w.norm(1)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

Doesn't PyTorch support sparse solutions?

Tags: neural-network, pytorch

Solution


Adding an L1 penalty to the loss in PyTorch does not, by itself, drive parameters exactly to zero: the optimizer only follows (sub)gradients of the penalty, so weights are shrunk toward zero but almost never land exactly on it. To obtain a genuinely sparse solution you have to apply a soft-thresholding (proximal) operator,

$$s_{\lambda}(z) = \operatorname{sign}(z)\,(|z| - \lambda)_{+},$$

to the penalized parameters after each optimizer step, which sets every weight whose magnitude falls below the threshold exactly to zero, and keep doing so until convergence.
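
Here is a minimal sketch of what that proximal step could look like inside the training loop from the question, assuming the same model and estimate code above. The helper name soft_threshold and the choice to scale the threshold by the learning rate (as in proximal gradient / ISTA) are my own additions; with Adam that scaling is only a heuristic, so you may need to tune it.

import torch

def soft_threshold(w: torch.Tensor, lam: float) -> torch.Tensor:
    """Proximal operator of the L1 norm: sign(w) * max(|w| - lam, 0)."""
    return torch.sign(w) * torch.clamp(w.abs() - lam, min=0.0)

# inside estimate(), immediately after optimizer.step():
with torch.no_grad():
    for w in model.parameters():
        if w.dim() > 1:  # threshold the same weight matrices that carry the L1 penalty
            # lam * learning_rate as the threshold is an assumption borrowed from ISTA
            w.copy_(soft_threshold(w, lam * learning_rate))

After training you can check the effect directly, e.g. (model.input.weight == 0).float().mean() gives the fraction of exactly-zero weights in the penalized input layer; without the thresholding step that fraction typically stays at 0 no matter how large lam is.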

