(Original) Displaying the details of an nn.Sequential() network in Torch

darkknightzh 2016-11-15 14:17

Please credit the source when reposting:

http://www.cnblogs.com/darkknightzh/p/6065526.html

With a little experimentation it becomes clear how each layer can be accessed.

step1. The network is defined as follows:

require "dpnn"
local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3))
net:add(nn.SpatialBatchNormalization(64))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(3, 3, 2, 2, 1, 1))
net:add(nn.Inception{
     inputSize = 64,
     kernelSize = {3, 5},
     kernelStride = {1, 1},
     outputSize = {128, 32},
     reduceSize = {96, 16, 32, 64},
     pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
     batchNorm = true
   })
net:evaluate()

The network above consists of conv + BatchNorm + ReLU + MaxPool + Inception layers.

step2. Calling print(net) directly gives the network structure:

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
  (1): nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
  (2): nn.SpatialBatchNormalization
  (3): nn.ReLU
  (4): nn.SpatialMaxPooling(3x3, 2,2, 1,1)
  (5): nn.Inception @ nn.DepthConcat {
    input
      |`-> (1): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
      |      (1): nn.SpatialConvolution(64 -> 96, 1x1)
      |      (2): nn.SpatialBatchNormalization
      |      (3): nn.ReLU
      |      (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
      |      (5): nn.SpatialBatchNormalization
      |      (6): nn.ReLU
      |    }
      |`-> (2): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
      |      (1): nn.SpatialConvolution(64 -> 16, 1x1)
      |      (2): nn.SpatialBatchNormalization
      |      (3): nn.ReLU
      |      (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
      |      (5): nn.SpatialBatchNormalization
      |      (6): nn.ReLU
      |    }
      |`-> (3): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> output]
      |      (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
      |      (2): nn.SpatialConvolution(64 -> 32, 1x1)
      |      (3): nn.SpatialBatchNormalization
      |      (4): nn.ReLU
      |    }
      |`-> (4): nn.Sequential {
             [input -> (1) -> (2) -> (3) -> output]
             (1): nn.SpatialConvolution(64 -> 64, 1x1)
             (2): nn.SpatialBatchNormalization
             (3): nn.ReLU
           }
       ... -> output
  }
}

In practice, however, the network object also carries fields such as output and gradInput, which print(net) does not show.

step3. The following code prints the network's fields in more detail:

for k,curLayer in pairs(net) do
    print(k,curLayer)
end

step4. Output:

_type	torch.DoubleTensor	
output	[torch.DoubleTensor with no dimension]

gradInput	[torch.DoubleTensor with no dimension]

modules	{
  1 : 
    {
      dH : 2
      dW : 2
      nInputPlane : 3
      output : DoubleTensor - empty
      kH : 7
      train : false
      gradBias : DoubleTensor - size: 64
      padH : 3
      bias : DoubleTensor - size: 64
      weight : DoubleTensor - size: 64x3x7x7
      _type : "torch.DoubleTensor"
      gradWeight : DoubleTensor - size: 64x3x7x7
      padW : 3
      nOutputPlane : 64
      kW : 7
      gradInput : DoubleTensor - empty
    }
  2 : 
    {
      gradBias : DoubleTensor - size: 64
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      running_var : DoubleTensor - size: 64
      momentum : 0.1
      gradWeight : DoubleTensor - size: 64
      eps : 1e-05
      _type : "torch.DoubleTensor"
      affine : true
      running_mean : DoubleTensor - size: 64
      bias : DoubleTensor - size: 64
      weight : DoubleTensor - size: 64
      train : false
    }
  3 : 
    {
      inplace : false
      threshold : 0
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      train : false
      val : 0
    }
  4 : 
    {
      dH : 2
      dW : 2
      kW : 3
      gradInput : DoubleTensor - empty
      indices : DoubleTensor - empty
      train : false
      _type : "torch.DoubleTensor"
      padH : 1
      ceil_mode : false
      output : DoubleTensor - empty
      kH : 3
      padW : 1
    }
  5 : 
    {
      outputSize : 
        {
          1 : 128
          2 : 32
        }
      inputSize : 64
      gradInput : DoubleTensor - empty
      modules : 
        {
          1 : 
            {
              train : false
              _type : "torch.DoubleTensor"
              output : DoubleTensor - empty
              gradInput : DoubleTensor - empty
              modules : 
                {
                  1 : {...}
                  2 : {...}
                  3 : {...}
                  4 : {...}
                }
              dimension : 2
              size : LongStorage - size: 0
            }
        }
      kernelStride : 
        {
          1 : 1
          2 : 1
        }
      _type : "torch.DoubleTensor"
      module : 
        {
          train : false
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          modules : 
            {
              1 : 
                {
                  _type : "torch.DoubleTensor"
                  output : DoubleTensor - empty
                  gradInput : DoubleTensor - empty
                  modules : {...}
                  train : false
                }
              2 : 
                {
                  _type : "torch.DoubleTensor"
                  output : DoubleTensor - empty
                  gradInput : DoubleTensor - empty
                  modules : {...}
                  train : false
                }
              3 : 
                {
                  _type : "torch.DoubleTensor"
                  output : DoubleTensor - empty
                  gradInput : DoubleTensor - empty
                  modules : {...}
                  train : false
                }
              4 : 
                {
                  _type : "torch.DoubleTensor"
                  output : DoubleTensor - empty
                  gradInput : DoubleTensor - empty
                  modules : {...}
                  train : false
                }
            }
          dimension : 2
          size : LongStorage - size: 0
        }
      poolStride : 1
      padding : true
      reduceStride : {...}
      transfer : 
        {
          inplace : false
          threshold : 0
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          val : 0
        }
      batchNorm : true
      train : false
      pool : 
        {
          dH : 1
          dW : 1
          kW : 3
          gradInput : DoubleTensor - empty
          indices : DoubleTensor - empty
          train : false
          _type : "torch.DoubleTensor"
          padH : 1
          ceil_mode : false
          output : DoubleTensor - empty
          kH : 3
          padW : 1
        }
      poolSize : 3
      reduceSize : 
        {
          1 : 96
          2 : 16
          3 : 32
          4 : 64
        }
      kernelSize : 
        {
          1 : 3
          2 : 5
        }
      output : DoubleTensor - empty
    }
}
train	false	

The entries of modules above hold the parameters of the conv, BatchNorm, ReLU, MaxPool and Inception layers, in that order.
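
When the full dump is too long to read, listing only the field names and their types can be enough. The following is a minimal sketch, assuming only the nn package is installed; the two-layer network is a stand-in chosen for illustration, not the network from step1.

```lua
require 'nn'

-- Small stand-in network (an assumption; the article's net also needs dpnn).
local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 8, 3, 3))
net:add(nn.ReLU())
net:evaluate()

-- Collect and sort the field names so the listing is stable across runs.
local keys = {}
for k in pairs(net) do keys[#keys + 1] = tostring(k) end
table.sort(keys)

for _, k in ipairs(keys) do
    -- torch.type returns the torch class name for torch objects
    -- and falls back to the plain Lua type for everything else.
    print(k, torch.type(net[k]))
end
```

Run on the network from step1, the same loop should list the top-level fields seen in step4 (_type, output, gradInput, modules, train) without expanding their contents.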

step5. net.modules[1] indexes the nn.SpatialConvolution layer. For example, print(net.modules[1]) gives:

nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
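
Containers in nn also expose get(index) and size(), which reach the same entries without touching the modules table directly. A short sketch, again on a stand-in network that is assumed here only for illustration:

```lua
require 'nn'

-- Stand-in network; the first layer matches the one in step1.
local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3))
net:add(nn.ReLU())

-- size() is the number of direct children of the container.
print(net:size())

-- get(1) returns the same object that net.modules[1] refers to.
print(net:get(1))
print(rawequal(net:get(1), net.modules[1]))
```

Either form works for the indexing done in the later steps; net.modules[i] is kept in this article because it mirrors the dump in step4.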

step6. To go one step further and print the fields of that layer, use the following code (step4 has in fact already printed them):

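-- Print every field of the first layer; tensors are reported by size only.
for k,curLayer in pairs(net.modules[1]) do
    if type(curLayer) ~= 'userdata' then
        print(k,curLayer)
    else
        -- Tensors are userdata: build a string of their dimension sizes.
        local strval = ' '
        for i = 1, curLayer:dim() do
            strval = strval .. curLayer:size(i) .. " "
        end
        -- \27[31m switches the terminal to red; \27[0m resets it afterwards.
        print(k .. " " .. type(curLayer) .. " " .. string.format("\27[31m size: %s\27[0m", strval))
    end
end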

step7. The result is:

dH	2	
dW	2	
nInputPlane	3	
output userdata  size:  	
kH	7	
train	false	
gradBias userdata  size:  64 	
padH	3	
bias userdata  size:  64 	
weight userdata  size:  64 3 7 7 	
_type	torch.DoubleTensor	
gradWeight userdata  size:  64 3 7 7 	
padW	3	
nOutputPlane	64	
kW	7	
gradInput userdata  size:
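
The loop in step6 labels every tensor as "userdata". As a possible variant, torch.isTensor and torch.type can be used to report the tensor class instead. This is a sketch only: describe is a hypothetical helper name introduced here, and the layer is built standalone so that the block does not depend on dpnn.

```lua
require 'nn'

-- Hypothetical helper: one-line description of a field value.
local function describe(v)
    if torch.isTensor(v) then
        -- A tensor with no dimensions has not been filled yet.
        if v:dim() == 0 then return torch.type(v) .. ' (empty)' end
        local dims = {}
        for i = 1, v:dim() do dims[i] = v:size(i) end
        return torch.type(v) .. ' ' .. table.concat(dims, 'x')
    end
    return tostring(v)
end

-- Same configuration as the first layer in step1.
local conv = nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3)
for k, v in pairs(conv) do
    print(k, describe(v))
end
```

Nested tables (such as outputSize in the Inception layer) are printed only as a table address by this helper; it is meant for leaf layers.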

step8. The Inception layer was not fully displayed in step4. Following step5, net.modules[5] gives the Inception layer. Changing the loop in step6 to iterate over net.modules[5] prints:

outputSize	{
  1 : 128
  2 : 32
}
inputSize	64	
gradInput userdata  size:  	
modules	{
  1 : 
    {
      train : false
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      modules : 
        {
          1 : 
            {
              _type : "torch.DoubleTensor"
              output : DoubleTensor - empty
              gradInput : DoubleTensor - empty
              modules : 
                {
                  1 : {...}
                  2 : {...}
                  3 : {...}
                  4 : {...}
                  5 : {...}
                  6 : {...}
                }
              train : false
            }
          2 : 
            {
              _type : "torch.DoubleTensor"
              output : DoubleTensor - empty
              gradInput : DoubleTensor - empty
              modules : 
                {
                  1 : {...}
                  2 : {...}
                  3 : {...}
                  4 : {...}
                  5 : {...}
                  6 : {...}
                }
              train : false
            }
          3 : 
            {
              _type : "torch.DoubleTensor"
              output : DoubleTensor - empty
              gradInput : DoubleTensor - empty
              modules : 
                {
                  1 : {...}
                  2 : {...}
                  3 : {...}
                  4 : {...}
                }
              train : false
            }
          4 : 
            {
              _type : "torch.DoubleTensor"
              output : DoubleTensor - empty
              gradInput : DoubleTensor - empty
              modules : 
                {
                  1 : {...}
                  2 : {...}
                  3 : {...}
                }
              train : false
            }
        }
      dimension : 2
      size : LongStorage - size: 0
    }
}
kernelStride	{
  1 : 1
  2 : 1
}
_type	torch.DoubleTensor	
module	nn.DepthConcat {
  input
    |`-> (1): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
    |      (1): nn.SpatialConvolution(64 -> 96, 1x1)
    |      (2): nn.SpatialBatchNormalization
    |      (3): nn.ReLU
    |      (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
    |      (5): nn.SpatialBatchNormalization
    |      (6): nn.ReLU
    |    }
    |`-> (2): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
    |      (1): nn.SpatialConvolution(64 -> 16, 1x1)
    |      (2): nn.SpatialBatchNormalization
    |      (3): nn.ReLU
    |      (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
    |      (5): nn.SpatialBatchNormalization
    |      (6): nn.ReLU
    |    }
    |`-> (3): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> output]
    |      (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
    |      (2): nn.SpatialConvolution(64 -> 32, 1x1)
    |      (3): nn.SpatialBatchNormalization
    |      (4): nn.ReLU
    |    }
    |`-> (4): nn.Sequential {
           [input -> (1) -> (2) -> (3) -> output]
           (1): nn.SpatialConvolution(64 -> 64, 1x1)
           (2): nn.SpatialBatchNormalization
           (3): nn.ReLU
         }
     ... -> output
}
poolStride	1	
padding	true	
reduceStride	{}
transfer	nn.ReLU
batchNorm	true	
train	false	
pool	nn.SpatialMaxPooling(3x3, 1,1, 1,1)
poolSize	3	
reduceSize	{
  1 : 96
  2 : 16
  3 : 32
  4 : 64
}
kernelSize	{
  1 : 3
  2 : 5
}
output userdata  size: 

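step9. In step8, modules holds the branches of the Inception layer (3*3 convolution, 5*5 convolution, pooling, 1*1 reduce). These branches can be reached through net.modules[5].module. That object also has fields such as train, output, gradInput and modules, and can be printed with print(net.modules[5].module).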

step10. Following the idea in step5, net.modules[5].module.modules[1] gives the details of the 3*3 convolution branch:

_type	torch.DoubleTensor	
output userdata  size:  	
gradInput userdata  size:  	
modules	{
  1 : 
    {
      dH : 1
      dW : 1
      nInputPlane : 64
      output : DoubleTensor - empty
      kH : 1
      train : false
      gradBias : DoubleTensor - size: 96
      padH : 0
      bias : DoubleTensor - size: 96
      weight : DoubleTensor - size: 96x64x1x1
      _type : "torch.DoubleTensor"
      gradWeight : DoubleTensor - size: 96x64x1x1
      padW : 0
      nOutputPlane : 96
      kW : 1
      gradInput : DoubleTensor - empty
    }
  2 : 
    {
      gradBias : DoubleTensor - size: 96
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      running_var : DoubleTensor - size: 96
      momentum : 0.1
      gradWeight : DoubleTensor - size: 96
      eps : 1e-05
      _type : "torch.DoubleTensor"
      affine : true
      running_mean : DoubleTensor - size: 96
      bias : DoubleTensor - size: 96
      weight : DoubleTensor - size: 96
      train : false
    }
  3 : 
    {
      inplace : false
      threshold : 0
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      train : false
      val : 0
    }
  4 : 
    {
      dH : 1
      dW : 1
      nInputPlane : 96
      output : DoubleTensor - empty
      kH : 3
      train : false
      gradBias : DoubleTensor - size: 128
      padH : 1
      bias : DoubleTensor - size: 128
      weight : DoubleTensor - size: 128x96x3x3
      _type : "torch.DoubleTensor"
      gradWeight : DoubleTensor - size: 128x96x3x3
      padW : 1
      nOutputPlane : 128
      kW : 3
      gradInput : DoubleTensor - empty
    }
  5 : 
    {
      gradBias : DoubleTensor - size: 128
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      running_var : DoubleTensor - size: 128
      momentum : 0.1
      gradWeight : DoubleTensor - size: 128
      eps : 1e-05
      _type : "torch.DoubleTensor"
      affine : true
      running_mean : DoubleTensor - size: 128
      bias : DoubleTensor - size: 128
      weight : DoubleTensor - size: 128
      train : false
    }
  6 : 
    {
      inplace : false
      threshold : 0
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      train : false
      val : 0
    }
}
train	false	

Note: there is both a module field and a modules field here; I do not fully understand why.
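
One way to probe this is to compare the two fields directly. In the step8 dump, modules has a single entry whose contents (dimension 2, four sub-branches) look the same as the module field, and print(net) in step2 labels the layer "nn.Inception @ nn.DepthConcat", which suggests that the Inception layer wraps one DepthConcat and exposes it under both names. That reading is an assumption, not something confirmed here; the sketch below only checks it. It needs dpnn, and in other dpnn versions the module field may be absent, in which case the first print shows nil.

```lua
require 'dpnn'

-- Same Inception configuration as in step1.
local inception = nn.Inception{
    inputSize = 64,
    kernelSize = {3, 5},
    kernelStride = {1, 1},
    outputSize = {128, 32},
    reduceSize = {96, 16, 32, 64},
    pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
    batchNorm = true
}

-- Class of the object behind each field.
print(torch.type(inception.module))
print(torch.type(inception.modules[1]))

-- true would mean both fields refer to one and the same object.
print(rawequal(inception.module, inception.modules[1]))

-- Number of direct children of the Inception layer.
print(#inception.modules)
```

If the identity check prints true, net.modules[5].module.modules[1] and net.modules[5].modules[1].modules[1] are two paths to the same branch.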

step11. net.modules[5].module.modules[1].modules shows the contents of this branch in more detail:

1	nn.SpatialConvolution(64 -> 96, 1x1)
2	nn.SpatialBatchNormalization
3	nn.ReLU
4	nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
5	nn.SpatialBatchNormalization
6	nn.ReLU

As shown, this branch consists of 1*1 conv, BatchNorm, ReLU, 3*3 conv, BatchNorm and ReLU.

step12. To inspect the 3*3 convolution layer from step11, use the following index:

net.modules[5].module.modules[1].modules[4]

The result is:

dH	1	
dW	1	
nInputPlane	96	
output userdata  size:  	
kH	3	
train	false	
gradBias userdata  size:  128 	
padH	1	
bias userdata  size:  128 	
weight userdata  size:  128 96 3 3 	
_type	torch.DoubleTensor	
gradWeight userdata  size:  128 96 3 3 	
padW	1	
nOutputPlane	128	
kW	3	
gradInput userdata  size: 
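
Rather than discovering each index by hand, the whole tree can be walked recursively, printing the index expression that reaches every layer. A minimal sketch under stated assumptions: walk is a hypothetical helper, it follows only the modules field (not module), and the nested network is a small stand-in that needs nn alone.

```lua
require 'nn'

-- Hypothetical helper: visit every module under `m`, recording and printing
-- the index expression that reaches it from the root.
local function walk(m, path, out)
    out[#out + 1] = path
    print(path, torch.type(m))
    -- Containers keep their children in `modules`; leaf layers have none.
    if type(m.modules) == 'table' then
        for i, child in ipairs(m.modules) do
            walk(child, path .. '.modules[' .. i .. ']', out)
        end
    end
    return out
end

-- Stand-in: a Sequential nested inside another Sequential.
local inner = nn.Sequential()
inner:add(nn.SpatialConvolution(64, 96, 1, 1))
inner:add(nn.ReLU())

local net = nn.Sequential()
net:add(nn.SpatialMaxPooling(3, 3, 2, 2, 1, 1))
net:add(inner)

local out = walk(net, 'net', {})
```

On the network from step1 the walk would be expected to reach the Inception branches through net.modules[5].modules[1] rather than net.modules[5].module, since it only follows modules.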

step13. With step12 we have indexed down to the deepest layer of the network from step1. Every layer in the network has fields such as output and gradInput.

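step14. For the Inception layer net.modules[5], net.modules[5].output and net.modules[5].module.output give the same result, as shown below (only a small slice is printed for readability; printing all of net.modules[5].output may show many entries that are all 0):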

local imgBatch = torch.rand(1,3,128,128)
local infer = net:forward(imgBatch)

print(net.modules[5].output[1][2][3])
print(net.modules[5].module.output[1][2][3])

The result is:

0.01 *
 2.7396
 2.9070
 3.1895
 1.5040
 1.9784
 4.0125
 3.2874
 3.3137
 2.1326
 2.3930
 2.8170
 3.5226
 2.3162
 2.7308
 2.8511
 2.5278
 3.3325
 3.0819
 3.2826
 3.5363
 2.5749
 2.8816
 2.2393
 2.4765
 2.4803
 3.2553
 3.0837
 3.1197
 2.4632
 1.5145
 3.7101
 2.1888
[torch.DoubleTensor of size 32]

0.01 *
 2.7396
 2.9070
 3.1895
 1.5040
 1.9784
 4.0125
 3.2874
 3.3137
 2.1326
 2.3930
 2.8170
 3.5226
 2.3162
 2.7308
 2.8511
 2.5278
 3.3325
 3.0819
 3.2826
 3.5363
 2.5749
 2.8816
 2.2393
 2.4765
 2.4803
 3.2553
 3.0837
 3.1197
 2.4632
 1.5145
 3.7101
 2.1888
[torch.DoubleTensor of size 32]
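
Comparing two printed slices by eye only covers a small part of the tensor. The equal method compares sizes and every element at once. The sketch below uses a stand-in (a Sequential wrapping another Sequential) instead of the Inception layer so that it runs with nn alone; sameOutput is a hypothetical helper name.

```lua
require 'nn'

-- Hypothetical helper: true when two modules hold element-wise equal outputs.
local function sameOutput(a, b)
    return a.output:equal(b.output)
end

-- Stand-in for the wrapper/wrapped pair: an outer container and its child.
local inner = nn.Sequential()
inner:add(nn.SpatialConvolution(3, 4, 3, 3))
inner:add(nn.ReLU())

local net = nn.Sequential()
net:add(inner)
net:evaluate()

-- One random 3-channel 8x8 image in a batch of size 1.
net:forward(torch.rand(1, 3, 8, 8))

-- The outer container, the inner container and the last layer
-- should all report the same output after the forward pass.
print(sameOutput(net, inner))
print(sameOutput(inner, inner.modules[2]))
```

For the network in this article, the corresponding check would be net.modules[5].output:equal(net.modules[5].module.output) after the forward call shown above.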

 
