Fast dictionary lookup with pandas Series.map

Problem description

I have a distance matrix: A ==> direct ==> B...Z

A ==> via ALPHA ==> B...Z

B ==> direct ==> C..Z

I created a dictionary that works as follows:

import pandas as pd

# distances is populated with the distance values described above
distances = pd.DataFrame.from_dict({'From': ['A','A','A','B','B','C','C'],
                                    'via': ['d','s','d','d','d','d','s'],
                                    'To': ['B','C','D','C','D','E','F'],
                                    'Distance': [10,5,12,4,3,22,21]})
# Keys are (From, via, To) tuples; values are {'Distance': d}
distances_dict = distances.set_index(['From', 'via', 'To']).to_dict('index')
# Flatten to {(From, via, To): d}
new_distances = dict()
for key in distances_dict.keys():
    new_distances.update({key: distances_dict[key]['Distance']})
print(new_distances['A', 'd', 'B'])

I have a pandas df (1,000,000 rows) and I am trying to compute the distance for each row. To keep things simple, I will reuse the same data as above.

a = distances
a['map'] = "'"+a['From']+"'"+",'"+a['via']+"',"+"'"+a['To']+"'"
a['Check Distance'] = a['map'].map(new_distances)
#yields NaN

Is there a way to do this? I am looking at a relatively large-scale string lookup.

Tags: python, pandas, dictionary, series, lookup-tables

Solution


Can you try this?

a['Check Distance'] = a.apply(lambda x: distances_dict[(x['From'], x['via'], x['To'])]['Distance'],axis=1)
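The original attempt most likely yields NaN because the 'map' column holds plain strings such as "'A','d','B'", while the dictionary keys are tuples such as ('A', 'd', 'B'), so no key ever matches. On 1,000,000 rows a row-wise apply can also be slow. Below is a minimal sketch of two alternatives, assuming the same column names as in the question; the `queries` frame and its rows are hypothetical, made up only for illustration, and neither option is claimed to be the fastest for every data set.

```python
import pandas as pd

distances = pd.DataFrame({
    'From': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
    'via': ['d', 's', 'd', 'd', 'd', 'd', 's'],
    'To': ['B', 'C', 'D', 'C', 'D', 'E', 'F'],
    'Distance': [10, 5, 12, 4, 3, 22, 21],
})

# Tuple-keyed lookup table: {(From, via, To): Distance}
new_distances = distances.set_index(['From', 'via', 'To'])['Distance'].to_dict()

# Hypothetical rows to look up; the last one has no matching route.
queries = pd.DataFrame({
    'From': ['A', 'B', 'C'],
    'via': ['d', 'd', 'd'],
    'To': ['B', 'D', 'Z'],
})

# Option 1: build real tuple keys with zip and look them up in the dict.
# Missing keys fall back to NaN instead of raising KeyError.
keys = zip(queries['From'], queries['via'], queries['To'])
queries['Check Distance'] = [new_distances.get(k, float('nan')) for k in keys]

# Option 2: skip the dict and left-join on the three key columns.
merged = queries[['From', 'via', 'To']].merge(
    distances, on=['From', 'via', 'To'], how='left'
)
```

Option 1 keeps the existing dictionary and only changes how the keys are built. Option 2 stays entirely inside pandas, which may be more convenient when the distance table is already a DataFrame.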

