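python - PyTorch: same input, different output (not random)
Problem description
I have a trained model that I am testing by running it in .eval() mode.
Here are the exact lines I executed in the debugger, in order: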
(Pdb) p feature
tensor([[[ -4.0563, -3.8415, -3.9542, ..., -14.8424, -14.9201, -14.8960],
[ -5.8481, -2.0405, -2.4438, ..., -19.6938, -19.4901, -19.9180],
[ -5.2424, -1.2804, -1.5109, ..., -19.3892, -19.4397, -19.5012],
...,
[ -6.4756, -2.0376, -2.0894, ..., -20.0942, -19.9635, -19.8762],
[ -6.5087, -2.0452, -1.9018, ..., -19.7127, -19.8574, -20.0103],
[ -7.0725, -4.2817, -3.3231, ..., -16.7170, -16.9004, -17.0333]]])
(Pdb) p feature2
tensor([[[ -4.0563, -3.8415, -3.9542, ..., -14.8424, -14.9201, -14.8960],
[ -5.8481, -2.0405, -2.4438, ..., -19.6938, -19.4901, -19.9180],
[ -5.2424, -1.2804, -1.5109, ..., -19.3892, -19.4397, -19.5012],
...,
[ -6.4756, -2.0376, -2.0894, ..., -20.0942, -19.9635, -19.8762],
[ -6.5087, -2.0452, -1.9018, ..., -19.7127, -19.8574, -20.0103],
[ -7.0725, -4.2817, -3.3231, ..., -16.7170, -16.9004, -17.0333]]])
(Pdb) torch.all(feature == feature2)
tensor(True)
(Pdb) prediction_tag, prediction_time = model(feature)
(Pdb) prediction_tag2, prediction_time2 = model(feature2)
(Pdb) prediction_time
tensor([[[9.6584e-06, 3.9059e-05, 4.0984e-06, ..., 1.7644e-04,
1.0589e-02, 4.4167e-06],
[9.6584e-06, 3.9059e-05, 4.0984e-06, ..., 1.7644e-04,
1.0589e-02, 4.4167e-06],
[9.3993e-06, 3.7754e-05, 3.9786e-06, ..., 1.7362e-04,
1.0243e-02, 4.2382e-06],
...,
[7.8885e-06, 1.1077e-05, 3.8594e-06, ..., 1.9443e-04,
3.8032e-03, 6.6878e-06],
[8.0696e-06, 1.1217e-05, 3.9580e-06, ..., 2.0004e-04,
3.7598e-03, 6.8072e-06],
[8.0696e-06, 1.1217e-05, 3.9580e-06, ..., 2.0004e-04,
3.7598e-03, 6.8072e-06]]])
(Pdb) p prediction_time2
tensor([[[8.0289e-07, 2.0557e-05, 2.5803e-05, ..., 3.3225e-04,
4.4547e-03, 8.4192e-06],
[8.0289e-07, 2.0557e-05, 2.5803e-05, ..., 3.3225e-04,
4.4547e-03, 8.4192e-06],
[7.6509e-07, 1.9805e-05, 2.4918e-05, ..., 3.2385e-04,
4.3618e-03, 7.9963e-06],
...,
[7.3927e-07, 8.7688e-06, 1.8454e-05, ..., 1.9831e-04,
1.9305e-03, 6.2879e-06],
[7.7376e-07, 8.8673e-06, 1.8517e-05, ..., 2.0194e-04,
1.8297e-03, 6.3183e-06],
[7.7376e-07, 8.8673e-06, 1.8517e-05, ..., 2.0194e-04,
1.8297e-03, 6.3183e-06]]])
(Pdb) torch.all(prediction_time == prediction_time2)
tensor(False)
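As you can see, even though feature and feature2 appear to be identical inputs, the model's outputs do not match. It is not random either, because after executing the lines above I ran the following: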
(Pdb) prediction_tag, prediction_time = model(feature)
(Pdb) prediction_time
tensor([[[9.6584e-06, 3.9059e-05, 4.0984e-06, ..., 1.7644e-04,
1.0589e-02, 4.4167e-06],
[9.6584e-06, 3.9059e-05, 4.0984e-06, ..., 1.7644e-04,
1.0589e-02, 4.4167e-06],
[9.3993e-06, 3.7754e-05, 3.9786e-06, ..., 1.7362e-04,
1.0243e-02, 4.2382e-06],
...,
[7.8885e-06, 1.1077e-05, 3.8594e-06, ..., 1.9443e-04,
3.8032e-03, 6.6878e-06],
[8.0696e-06, 1.1217e-05, 3.9580e-06, ..., 2.0004e-04,
3.7598e-03, 6.8072e-06],
[8.0696e-06, 1.1217e-05, 3.9580e-06, ..., 2.0004e-04,
3.7598e-03, 6.8072e-06]]])
(Pdb) prediction_tag2, prediction_time2 = model(feature2)
(Pdb) prediction_time2
tensor([[[8.0289e-07, 2.0557e-05, 2.5803e-05, ..., 3.3225e-04,
4.4547e-03, 8.4192e-06],
[8.0289e-07, 2.0557e-05, 2.5803e-05, ..., 3.3225e-04,
4.4547e-03, 8.4192e-06],
[7.6509e-07, 1.9805e-05, 2.4918e-05, ..., 3.2385e-04,
4.3618e-03, 7.9963e-06],
...,
[7.3927e-07, 8.7688e-06, 1.8454e-05, ..., 1.9831e-04,
1.9305e-03, 6.2879e-06],
[7.7376e-07, 8.8673e-06, 1.8517e-05, ..., 2.0194e-04,
1.8297e-03, 6.3183e-06],
[7.7376e-07, 8.8673e-06, 1.8517e-05, ..., 2.0194e-04,
1.8297e-03, 6.3183e-06]]])
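I get the same two different outputs again. Why is this happening? I am completely confused.
Note: I have checked that both feature and feature2 have dtype torch.float32. feature is taken from a torch DataLoader that I set up, while feature2 is read directly from a file.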
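Since `==` only compares values, it can help to also compare the properties it ignores (strides, contiguity, device, `requires_grad`). The following is a minimal sketch of such a check; `describe` and `diff_metadata` are hypothetical helper names, and the two tensors are stand-ins built for illustration, not the data from the question. It does not claim that memory layout is the cause here, only that it is one difference worth ruling out.

```python
import torch


def describe(t: torch.Tensor) -> dict:
    """Collect tensor properties that an element-wise `==` does not compare."""
    return {
        "dtype": t.dtype,
        "device": str(t.device),
        "shape": tuple(t.shape),
        "stride": t.stride(),
        "contiguous": t.is_contiguous(),
        "requires_grad": t.requires_grad,
    }


def diff_metadata(a: torch.Tensor, b: torch.Tensor) -> dict:
    """Return {property: (value_in_a, value_in_b)} for every property that differs."""
    da, db = describe(a), describe(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# Stand-in tensors (hypothetical): identical values, different memory layout.
base = torch.arange(24, dtype=torch.float32).reshape(1, 4, 6)
feature = base.clone()
feature2 = base.transpose(1, 2).contiguous().transpose(1, 2)

print(torch.equal(feature, feature2))    # values and shape match
print(diff_metadata(feature, feature2))  # layout properties that differ
```

If the reported differences are only in layout, passing `feature2.contiguous()` (or `feature2.clone()`) to the model is a cheap way to test whether layout matters.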
Edit: here is how the model is built:
from pathlib import Path

import torch
import torch.nn as nn


class CRNN(nn.Module):
    def __init__(self, inputdim, outputdim, **kwargs):
        super().__init__()
        features = nn.ModuleList()
        self.features = nn.Sequential(
            Block2D(1, 32),
            nn.LPPool2d(4, (2, 4)),
            Block2D(32, 128),
            Block2D(128, 128),
            nn.LPPool2d(4, (2, 4)),
            Block2D(128, 128),
            Block2D(128, 128),
            nn.LPPool2d(4, (1, 4)),
            nn.Dropout(0.3),
        )
        with torch.no_grad():
            rnn_input_dim = self.features(torch.randn(1, 1, 500,
                                                      inputdim)).shape
            rnn_input_dim = rnn_input_dim[1] * rnn_input_dim[-1]
        self.gru = nn.GRU(rnn_input_dim,
                          128,
                          bidirectional=True,
                          batch_first=True)
        self.temp_pool = parse_poolingfunction(kwargs.get(
            'temppool', 'linear'),
                                               inputdim=256,
                                               outputdim=outputdim)
        self.outputlayer = nn.Linear(256, outputdim)
        self.features.apply(init_weights)
        self.outputlayer.apply(init_weights)

    def forward(self, x):
        batch, time, dim = x.shape
        x = x.unsqueeze(1)
        x = self.features(x)
        x = x.transpose(1, 2).contiguous().flatten(-2)
        x, _ = self.gru(x)
        decision_time = torch.sigmoid(self.outputlayer(x)).clamp(1e-7, 1.)
        decision_time = torch.nn.functional.interpolate(
            decision_time.transpose(1, 2),
            time,
            mode='linear',
            align_corners=False).transpose(1, 2)
        decision = self.temp_pool(x, decision_time).clamp(1e-7, 1.).squeeze(1)
        return decision, decision_time


def crnn(inputdim=64, outputdim=527, pretrained_file='gpv_f'):
    model = CRNN(inputdim, outputdim)
    if pretrained_file:
        state = torch.load(Path(__file__).parent / pretrained_file,
                           map_location='cpu')
        model.load_state_dict(state, strict=True)
    return model
It uses the following helpers:
class Block2D(nn.Module):
    def __init__(self, cin, cout, kernel_size=3, padding=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm2d(cin),
            nn.Conv2d(cin,
                      cout,
                      kernel_size=kernel_size,
                      padding=padding,
                      bias=False),
            nn.LeakyReLU(inplace=True, negative_slope=0.1))

    def forward(self, x):
        return self.block(x)


def init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.Conv1d)):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        if m.bias is not None:
            nn.init.constant_(m.bias, 0)
    if isinstance(m, nn.Linear):
        nn.init.kaiming_uniform_(m.weight)
        if m.bias is not None:
            nn.init.constant_(m.bias, 0)
class LinearSoftPool(nn.Module):
    """LinearSoftPool
    Linear softmax, takes logits and returns a probability, near to the actual maximum value.
    Taken from the paper:
    A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling
    https://arxiv.org/abs/1810.09050
    """
    def __init__(self, pooldim=1):
        super().__init__()
        self.pooldim = pooldim

    def forward(self, logits, time_decision):
        return (time_decision**2).sum(self.pooldim) / time_decision.sum(
            self.pooldim)


class MeanPool(nn.Module):
    def __init__(self, pooldim=1):
        super().__init__()
        self.pooldim = pooldim

    def forward(self, logits, decision):
        return torch.mean(decision, dim=self.pooldim)


def parse_poolingfunction(poolingfunction_name='mean', **kwargs):
    """parse_poolingfunction
    A helper function to parse any temporal pooling
    Pooling is done on dimension 1
    :param poolingfunction_name:
    :param **kwargs:
    """
    poolingfunction_name = poolingfunction_name.lower()
    if poolingfunction_name == 'mean':
        return MeanPool(pooldim=1)
    elif poolingfunction_name == 'linear':
        return LinearSoftPool(pooldim=1)
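With a model built from stacked layers like this, one way to narrow down where two runs start to differ is to record every leaf module's output with forward hooks and compare the recordings in execution order. The sketch below is only an illustration: `record_outputs` and `first_divergence` are hypothetical helper names, and a small toy `nn.Sequential` stands in for the CRNN, since the pretrained weights are not available here.

```python
import torch
import torch.nn as nn


def record_outputs(model: nn.Module, x: torch.Tensor) -> dict:
    """Run model(x) once and return {module_name: output} for every leaf module."""
    outputs, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            # nn.GRU returns a tuple; keep its first element (the sequence output).
            out = output[0] if isinstance(output, tuple) else output
            # Clone so a later in-place op (e.g. LeakyReLU(inplace=True))
            # cannot overwrite what was recorded.
            outputs[name] = out.detach().clone()
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(x)
    finally:
        for handle in handles:
            handle.remove()
    return outputs


def first_divergence(model: nn.Module, a: torch.Tensor, b: torch.Tensor):
    """Return the name of the first leaf module whose outputs differ, or None."""
    out_a = record_outputs(model, a)
    out_b = record_outputs(model, b)
    for name in out_a:  # dict keys keep the order in which hooks fired
        if not torch.equal(out_a[name], out_b[name]):
            return name
    return None


# Toy stand-in model (hypothetical), in eval mode like the model in the question.
toy = nn.Sequential(
    nn.BatchNorm2d(1),
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.LeakyReLU(0.1, inplace=True),
).eval()
x = torch.randn(1, 1, 8, 8)

print(first_divergence(toy, x, x.clone()))  # identical inputs
print(first_divergence(toy, x, x + 1.0))    # deliberately different inputs
```

Calling `first_divergence(model, feature, feature2)` on the real model would point at the first layer to inspect. If even the first layer differs, the inputs themselves are the more likely suspect.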
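Solution
It is hard to say without the model. In general, you should always follow PyTorch's REPRODUCIBILITY guidelines, so try calling torch.manual_seed(0) (and np.random.seed(0) if you use numpy anywhere) before each execution, and set
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
at the very beginning. See whether that changes anything.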
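The settings above can be collected in one helper that is called before each run. This is a minimal sketch: `seed_everything` is a hypothetical name, and seeding Python's own `random` module is an extra precaution that the answer does not mention.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed the common random number generators and request deterministic cuDNN."""
    random.seed(seed)        # Python's built-in RNG (an addition to the answer)
    np.random.seed(seed)     # only matters if numpy randomness is used
    torch.manual_seed(seed)  # PyTorch RNG
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding before each draw should reproduce the same random numbers.
seed_everything(0)
a = torch.rand(3)
seed_everything(0)
b = torch.rand(3)
print(torch.equal(a, b))
```

Note that nn.Dropout is already inactive in .eval() mode, so seeding mainly affects whatever other random operations remain. If the outputs still differ after these settings, randomness is unlikely to be the explanation.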