Why is TensorFlow 2.x with @tf.function twice as fast as PyTorch?

Problem description

On small models or small datasets, TF2 with @tf.function is about twice as fast as PyTorch.

Why is PyTorch slower than TensorFlow on small datasets/models? Have I set something up incorrectly? Is there a way to speed up PyTorch?

TF2 version

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import numpy as np
import time
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
dims = 600
DATA_SIZE = 10000
batchsz = 100
epochs = 10
lr = 1e-3
inputs = np.random.random([DATA_SIZE, dims]).astype(np.float32)
targets = np.random.random([DATA_SIZE, dims]).astype(np.float32)
tic_total = time.time()
model = keras.Sequential([layers.Dense(dims),
                          layers.ReLU(),
                          layers.Dense(dims),
                          layers.ReLU(),
                          layers.Dense(dims),
                          layers.ReLU(),
                          layers.Dense(dims),
                          layers.ReLU(),
                          layers.Dense(dims),
                          layers.ReLU(),
                          layers.Dense(dims)])


train_db = tf.data.Dataset.from_tensor_slices((inputs, targets))  # dataset of (input, target) row pairs
train_db = train_db.shuffle(DATA_SIZE).batch(batchsz)
optimizer = tf.optimizers.Adam(lr)

@tf.function
def bp(x, y):
    # Pass the batch in as arguments: a tf.function that closes over a
    # Python global tensor captures it when the graph is first traced,
    # so rebinding `data` on later steps would not be seen by the graph.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

for epoch in range(epochs):
    tic_epoch = time.time()
    for step, (x, y) in enumerate(train_db):
        bp(x, y)
    toc_epoch = time.time()
    print('time per epoch:', toc_epoch - tic_epoch)
toc_total = time.time()
print('time elapsed:', toc_total - tic_total)

Results:

time per epoch: 1.2558062076568604
time per epoch: 0.2627861499786377
time per epoch: 0.25072169303894043
time per epoch: 0.250835657119751
time per epoch: 0.2861459255218506
time per epoch: 0.26630401611328125
time per epoch: 0.24439024925231934
time per epoch: 0.23915791511535645
time per epoch: 0.25443601608276367
time per epoch: 0.2899332046508789
time elapsed: 4.2830810546875
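
A quick way to see what @tf.function changes: the Python body of the decorated function only runs while the graph is being traced; subsequent calls execute the compiled graph and bypass the Python interpreter, which is where small models spend much of their time. A minimal sketch (traced_step is an illustrative name, not part of the code above):

import tensorflow as tf

@tf.function
def traced_step(x):
    # This Python-level print fires only while tf.function traces the
    # graph; later calls run the compiled graph and skip the Python body.
    print('tracing for shape:', x.shape)
    return x * 2.0

traced_step(tf.ones([3]))  # first call traces: the print fires
traced_step(tf.ones([3]))  # same input signature: no retrace, no print
traced_step(tf.ones([4]))  # new shape: retraces, the print fires again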

PyTorch version

import numpy as np
import time
import torch
import torch.utils.data
from torch import nn

dims = 600
DATA_SIZE = 10000
batchsz = 100
epochs = 10
lr = 1e-3
inputs = np.random.random([DATA_SIZE, dims]).astype(np.float32)
targets = np.random.random([DATA_SIZE, dims]).astype(np.float32)
tic_total = time.time()
model = nn.Sequential(
    nn.Linear(dims, dims),
    nn.ReLU(),
    nn.Linear(dims, dims),
    nn.ReLU(),
    nn.Linear(dims, dims),
    nn.ReLU(),
    nn.Linear(dims, dims),
    nn.ReLU(),
    nn.Linear(dims, dims),
    nn.ReLU(),
    nn.Linear(dims, dims)
).cuda()

train_x = torch.tensor(inputs).cuda()
train_y = torch.tensor(targets).cuda()

train_db = torch.utils.data.TensorDataset(train_x, train_y)
train_db = torch.utils.data.DataLoader(train_db, batch_size=batchsz, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
model.train()
for epoch in range(epochs):
    tic_epoch = time.time()
    for step, data in enumerate(train_db):
        loss = (model(data[0]) - data[1]).square().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    toc_epoch = time.time()
    print('time per epoch:', toc_epoch - tic_epoch)
toc_total = time.time()
print('time elapsed:', toc_total - tic_total)

Results:

time per epoch: 1.1492640972137451
time per epoch: 0.4739706516265869
time per epoch: 0.4949944019317627
time per epoch: 0.48531651496887207
time per epoch: 0.4747319221496582
time per epoch: 0.4962129592895508
time per epoch: 0.6802644729614258
time per epoch: 0.5593955516815186
time per epoch: 0.4838066101074219
time per epoch: 0.4937736988067627
time elapsed: 8.071050882339478
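
One caveat before reading too much into these numbers: CUDA kernels launch asynchronously, so time.time() around a loop can attribute still-queued GPU work to the wrong epoch. Calling torch.cuda.synchronize() before reading the clock gives honest epoch boundaries; a minimal sketch, where timed_epoch and run_step are hypothetical names standing in for the loop above:

import time
import torch

def timed_epoch(run_step, n_steps):
    # Drain any GPU work queued before this epoch, then start the clock.
    torch.cuda.synchronize()
    tic = time.time()
    for _ in range(n_steps):
        run_step()  # one training step: forward, backward, optimizer
    # Block until every kernel launched during this epoch has finished.
    torch.cuda.synchronize()
    return time.time() - tic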

Tags: python, deep-learning, pytorch, tensorflow2.0

Solution
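
As for the last part of the question (speeding PyTorch up): @tf.function's edge on small models comes largely from replacing per-op Python dispatch with a compiled graph, and TorchScript plays the analogous role in PyTorch. A minimal sketch, using a smaller stand-in for the model above:

import torch
from torch import nn

dims = 600
model = nn.Sequential(
    nn.Linear(dims, dims), nn.ReLU(),
    nn.Linear(dims, dims), nn.ReLU(),
    nn.Linear(dims, dims),
).cuda()

# torch.jit.script compiles the module to TorchScript; later calls run
# through the TorchScript interpreter, which can fuse pointwise ops and
# avoids per-op Python dispatch in the forward pass.
scripted = torch.jit.script(model)
out = scripted(torch.randn(100, dims, device='cuda'))

The backward pass and the optimizer step still run from Python, so for a model this small the gap may not close entirely.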

