Fastest way to convert a list to a dictionary, with list values as keys and list indices as values

Question

We can convert a list to a dictionary in which the list values become the keys and the list indices become the values:

classes = ['car', 'bus', 'van']

reverse_classes = dict.fromkeys(classes)
for i, key in enumerate(reverse_classes):
    reverse_classes[key] = i

print(reverse_classes)

{'car': 0, 'bus': 1, 'van': 2}

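The question: is this the fastest way?

Context:

This is used in the __getitem__ method of a custom torch.utils.data.Dataset implementation to quickly look up the index of a class during training: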

import os

import natsort
from PIL import Image
from torch.utils.data import Dataset

## Load the VERI-Wild data (https://github.com/PKU-IMRE/VERI-Wild): use all but the last image of each class for training, and keep the last one for testing
class VeRIWild(Dataset):
    def __init__(self, main_dir, transform, train=True, debug=False):
        self.root_dir = main_dir
        self.transform = transform
        self.train = train
        self.classes = natsort.natsorted(os.listdir(os.path.join(self.root_dir, 'images')))
        self.total_imgs = []
        for car in self.classes:
            imgs = natsort.natsorted(os.listdir(os.path.join(self.root_dir, 'images', car)))
            if train:
                for im in imgs[:-1]: # keep the last image for test
                    self.total_imgs.append(os.path.join(car, im))
            else:
                self.total_imgs.append(os.path.join(car, imgs[-1]))

        self.reverse_classes = dict.fromkeys(self.classes)
        for i, key in enumerate(self.reverse_classes):
            self.reverse_classes[key] = i
            

    def __len__(self):
        return len(self.total_imgs)

    ## Returns: Tuple (image, target) where target is the index of the target category.
    def __getitem__(self, idx):
        img_loc = os.path.join(self.root_dir, 'images', self.total_imgs[idx])
        image = Image.open(img_loc).convert("RGB")
        tensor_image = self.transform(image)
        car_name = os.path.dirname(self.total_imgs[idx])
        return (tensor_image, self.reverse_classes[car_name])

Tags: python

Answer


You could try:

out = dict(zip(data, range(len(data))))
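For a list without duplicates, this one-liner should give the same mapping as the loop in the question. A minimal sketch to check that, assuming data stands for the classes list from the question (the names loop_result, zip_result and comp_result are only illustrative):

```python
classes = ['car', 'bus', 'van']

# Approach from the question: create the keys first, then fill in the indices.
loop_result = dict.fromkeys(classes)
for i, key in enumerate(loop_result):
    loop_result[key] = i

# Suggested alternative: pair each value with its index and build the dict in one call.
zip_result = dict(zip(classes, range(len(classes))))

# Dict comprehension variant, for comparison.
comp_result = {c: i for i, c in enumerate(classes)}

print(loop_result == zip_result == comp_result)
```

If the list could contain duplicates, the variants differ: the loop numbers the distinct keys consecutively in order of first appearance, while zip and the comprehension keep the list position of the last occurrence. Directory names returned by os.listdir are unique, so this should not matter for the Dataset in the question.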

A simple benchmark (try it on a larger dataset):

from timeit import timeit
from itertools import count

# classes = ['car', 'bus', 'van']
classes = set('''Est voluptatum fuga natus ea officiis eveniet facere aut. Nihil eaque quia dolor officia. Et dolorem et aut laborum impedit accusantium consequatur. Atque tempora facilis iusto. Sit neque eligendi et accusantium et. Ut veritatis in voluptatum'''.split())

def f1(data):
    reverse_classes = {c: i for i, c in enumerate(data)}
    return reverse_classes

def f2(data):
    reverse_classes = dict.fromkeys(data)
    for i, key in enumerate(reverse_classes):
        reverse_classes[key] = i
    return reverse_classes

def f3(data):
    return dict(zip(data, range(len(data))))

def f4(data):
    return dict(zip(data, count()))


t1 = timeit(lambda: f1(classes), number=1000)
t2 = timeit(lambda: f2(classes), number=1000)
t3 = timeit(lambda: f3(classes), number=1000)
t4 = timeit(lambda: f4(classes), number=1000)

print(t1)
print(t2)
print(t3)
print(t4)

Prints:

0.006092605064623058
0.007285483996383846
0.004913415992632508
0.0048480971017852426

Edit: added a version using itertools.count (thanks @HeapOverflow).
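Timings this small are sensitive to noise, so the ranking may shift between runs and machines. A rough sketch of a more stable comparison, assuming a synthetic list of 10,000 unique names (the names, the input size and the repeat counts are made up for illustration), takes the best of several repeats for each variant:

```python
from itertools import count
from timeit import repeat

# Hypothetical input: 10,000 unique synthetic class names.
data = [f"class_{i}" for i in range(10_000)]

def comprehension(items):
    return {c: i for i, c in enumerate(items)}

def fromkeys_loop(items):
    result = dict.fromkeys(items)
    for i, key in enumerate(result):
        result[key] = i
    return result

def zip_range(items):
    return dict(zip(items, range(len(items))))

def zip_count(items):
    return dict(zip(items, count()))

candidates = [comprehension, fromkeys_loop, zip_range, zip_count]

# All variants must agree before their timings are worth comparing.
expected = comprehension(data)
assert all(func(data) == expected for func in candidates)

timings = {}
for func in candidates:
    # Best of 5 repeats, 10 calls each; the minimum is the least disturbed run.
    timings[func.__name__] = min(repeat(lambda: func(data), number=10, repeat=5))

# Print the variants from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s")
```

Since the mapping is built only once in __init__, the choice between these variants is unlikely to affect training time; the lookup in __getitem__ is an ordinary dictionary access whichever one is used.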

