Weight matrix of the final fully-connected layer

Problem description

My question is, I think, a very simple one, but it is giving me a headache. I believe I'm missing something conceptual about neural networks, or TensorFlow is returning the wrong layer.

I have a network whose last layer outputs 4800 units. The second-to-last layer has 2000 units. I expected the weight matrix of the last layer to have shape (4800, 2000), but when I print the shape in TensorFlow I see (2000, 4800). Could someone please confirm which shape the last layer's weight matrix should have? Based on the answer, I can debug the problem further. Thanks.

Tags: python-3.x, tensorflow, conv-neural-network

Solution


Conceptually, a neural network layer is often written as y = W*x, where * is matrix multiplication, x is an input vector, and y is an output vector. If x has 2000 units and y has 4800, then W should indeed have shape (4800, 2000), i.e. 4800 rows and 2000 columns.

However, implementations usually operate on a batch of inputs X. Say X is (b, 2000), where b is the batch size. We don't want to transform each row of X individually by computing W*x as above, since that would be inefficient. Instead, we transform all inputs at once via Y = X*W.T, where W.T is the transpose of W. You can work out that this applies W*x to each row of X (i.e. to each input). Y is then a (b, 4800) matrix containing all the transformed inputs.
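The shapes above can be sketched in NumPy (the sizes match the question; the batch size `b` is an illustrative choice, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.standard_normal((4800, 2000))  # conceptual weight matrix: (out, in)
x = rng.standard_normal(2000)          # a single input vector

y = W @ x                              # single-input form: y = W*x
print(y.shape)                         # (4800,)

b = 3                                  # batch size (illustrative)
X = rng.standard_normal((b, 2000))     # a batch of inputs, one per row

Y = X @ W.T                            # batched form: Y = X*W.T
print(Y.shape)                         # (3, 4800)

# Row i of Y equals W @ X[i], i.e. W*x applied to each input:
print(np.allclose(Y[0], W @ X[0]))     # True
```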

In TensorFlow, the weight matrix is simply stored in this transposed form, since that is usually the form needed anyway. Hence you see a matrix of shape (2000, 4800) (the shape of W.T).
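A minimal NumPy sketch of this storage convention (mirroring how a dense layer's kernel is laid out as (in_features, out_features); the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# The kernel is stored as (in_features, out_features) = (2000, 4800),
# i.e. already transposed relative to the conceptual W of shape (4800, 2000).
W_stored = rng.standard_normal((2000, 4800))  # what you see when printing the shape
bias = np.zeros(4800)

X = rng.standard_normal((5, 2000))            # batch of 5 inputs

Y = X @ W_stored + bias                       # forward pass, no transpose needed
print(Y.shape)                                # (5, 4800)
```

Storing the transposed form avoids transposing on every forward pass, which is why frameworks print the shape as (in, out) rather than (out, in).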

