python - Torch: why is this collate function so much faster than the other one?
Question
I have written two collate functions to read data from h5py files (I tried to create some synthetic data for an MWE here, but it didn't pan out).
The difference between the two on my data is roughly 10x — a very large gap, and I'm not sure why. I'd be curious for any insight I can apply to future collate functions.
import torch

def slow(batch):
    '''
    This function retrieves the data emitted from the H5 torch data set.
    It alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features], to:
    [layers, batch_sz, tokens, features]
    '''
    embeddings = []
    start_ids = []
    end_ids = []
    idxs = []
    for i in range(len(batch)):
        embeddings.append(batch[i]['embeddings'])
        start_ids.append(batch[i]['start_ids'])
        end_ids.append(batch[i]['end_ids'])
        idxs.append(batch[i]['idx'])
    # package data; swap to expected [layers, batch_sz, tokens, features]
    sample = {'embeddings': torch.as_tensor(embeddings).permute(1, 0, 2, 3),
              'start_ids': torch.as_tensor(start_ids),
              'end_ids': torch.as_tensor(end_ids),
              'idx': torch.as_tensor(idxs)}
    return sample
I assumed the version above, with its explicit loop, would be the faster one, but that is not the case.
def fast(batch):
    ''' This function alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features]
    to: [layers, batch_sz, tokens, features] for the embeddings
    '''
    # turn data to tensors
    embeddings = torch.stack([torch.as_tensor(item['embeddings']) for item in batch])
    # swap to expected [layers, batch_sz, tokens, features]
    embeddings = embeddings.permute(1, 0, 2, 3)
    # get start ids
    start_ids = torch.stack([torch.as_tensor(item['start_ids']) for item in batch])
    # get end ids
    end_ids = torch.stack([torch.as_tensor(item['end_ids']) for item in batch])
    # get idxs
    idxs = torch.stack([torch.as_tensor(item['idx']) for item in batch])
    # repackage (note: the original omitted 'idx' from the returned dict despite computing it)
    sample = {'embeddings': embeddings,
              'start_ids': start_ids,
              'end_ids': end_ids,
              'idx': idxs}
    return sample
Edit: I tried swapping to the version below; it is still roughly 10x slower than "fast".
def slow(batch):
    '''
    This function retrieves the data emitted from the H5 torch data set.
    It alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features], to:
    [layers, batch_sz, tokens, features]
    '''
    embeddings = []
    start_ids = []
    end_ids = []
    idxs = []
    for item in batch:
        embeddings.append(item['embeddings'])
        start_ids.append(item['start_ids'])
        end_ids.append(item['end_ids'])
        idxs.append(item['idx'])
    # package data; swap to expected [layers, batch_sz, tokens, features]
    sample = {'embeddings': torch.as_tensor(embeddings).permute(1, 0, 2, 3),
              'start_ids': torch.as_tensor(start_ids),
              'end_ids': torch.as_tensor(end_ids),
              'idx': torch.as_tensor(idxs)}
    return sample
Solution
See this answer (and give it an upvote): https://stackoverflow.com/a/30245465/10475762
In particular this line: "In other words and in general, list comprehensions perform faster because suspending and resuming a function's frame, or multiple functions in other cases, is slower than creating a list on demand."
So in your case, you call append several times in every collate, and the collate itself is called many times over your train/test/eval steps, all of which adds up. IMO, avoid loops whenever possible, as they seem to invariably lead to slowdowns.
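As a rough sanity check, the gap can be reproduced on synthetic data. The sketch below is illustrative only — the batch size and array shapes are invented, and absolute timings will vary by machine:

```python
import timeit
import numpy as np
import torch

# synthetic batch: 8 items, each with a [layers, tokens, features] array
batch = [{'embeddings': np.random.rand(4, 16, 32).astype(np.float32),
          'start_ids': 0, 'end_ids': 15, 'idx': i} for i in range(8)]

def slow(batch):
    # append-loop version: builds Python lists, then converts in one go
    embeddings, start_ids, end_ids, idxs = [], [], [], []
    for item in batch:
        embeddings.append(item['embeddings'])
        start_ids.append(item['start_ids'])
        end_ids.append(item['end_ids'])
        idxs.append(item['idx'])
    return {'embeddings': torch.as_tensor(embeddings).permute(1, 0, 2, 3),
            'start_ids': torch.as_tensor(start_ids),
            'end_ids': torch.as_tensor(end_ids),
            'idx': torch.as_tensor(idxs)}

def fast(batch):
    # comprehension version: converts each array to a tensor, then stacks
    emb = torch.stack([torch.as_tensor(b['embeddings']) for b in batch])
    return {'embeddings': emb.permute(1, 0, 2, 3),
            'start_ids': torch.as_tensor([b['start_ids'] for b in batch]),
            'end_ids': torch.as_tensor([b['end_ids'] for b in batch]),
            'idx': torch.as_tensor([b['idx'] for b in batch])}

# both produce identical tensors
assert torch.equal(slow(batch)['embeddings'], fast(batch)['embeddings'])

t_slow = timeit.timeit(lambda: slow(batch), number=50)
t_fast = timeit.timeit(lambda: fast(batch), number=50)
print(f'slow: {t_slow:.4f}s  fast: {t_fast:.4f}s')
```

One caveat worth noting: beyond the repeated append calls, torch.as_tensor applied to a plain Python list of NumPy arrays is itself known to be slow (recent PyTorch versions emit a UserWarning about exactly this), which likely contributes to the gap alongside the loop overhead.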