How do I find the minimum distance from each point in one DataFrame to all points in another DataFrame?

Problem description

I have two DataFrames with coordinates: one for attractions and one for exits.

import pandas as pd
import geopy
from geopy.distance import geodesic
attr = pd.DataFrame(
    {'attraction':['circuit', 'roller coaster'],
    'latitude':[53.35923, 53.35958],
    'longitude':[83.71394, 83.71256]})
exits = pd.DataFrame(
    {'exits':['exit','exit2','exit3', 'exit4'],
    'latitude':[53.35911, 53.3606, 53.35953, 53.3603],
    'longitude':[83.71503, 83.71407, 83.71154, 83.71216]})

  attraction        latitude    longitude
0 circuit           53.35923    83.71394
1 roller coaster    53.35958    83.71256

  exits     latitude    longitude
0 exit      53.35911    83.71503
1 exit2     53.36060    83.71407
2 exit3     53.35953    83.71154
3 exit4     53.36030    83.71216

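I want to add a column to the first DataFrame containing the distance to the nearest exit (the minimum distance). The result should look like this: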

  attraction        min_distance
0 circuit           73.789480
1 roller coaster    68.137324

I have some code, but it is far from ideal, and I would like to know how to simplify it:

def distance(row, name, i):
    return exits[name][[i]]
attr['d_to_e0_1'] = attr.apply(lambda r: distance(r, 'latitude', 0), axis=1)
attr['d_to_e0_2'] = attr.apply(lambda r: distance(r, 'longitude', 0), axis=1)
attr['d_to_e1_1'] = attr.apply(lambda r: distance(r, 'latitude', 1), axis=1)
attr['d_to_e1_2'] = attr.apply(lambda r: distance(r, 'longitude', 1), axis=1)
attr['d_to_e2_1'] = attr.apply(lambda r: distance(r, 'latitude', 2), axis=1)
attr['d_to_e2_2'] = attr.apply(lambda r: distance(r, 'longitude', 2), axis=1)
attr['d_to_e3_1'] = attr.apply(lambda r: distance(r, 'latitude', 3), axis=1)
attr['d_to_e3_2'] = attr.apply(lambda r: distance(r, 'longitude', 3), axis=1)


def distance(row):
    return geopy.distance.distance((row.latitude, row.longitude), (row.d_to_e0_1, row.d_to_e0_2)).m
attr['dist_to_e0'] = attr.apply(lambda r: distance(r), axis=1)

def distance(row):
    return geopy.distance.distance((row.latitude, row.longitude), (row.d_to_e1_1, row.d_to_e1_2)).m
attr['dist_to_e1'] = attr.apply(lambda r: distance(r), axis=1)

def distance(row):
    return geopy.distance.distance((row.latitude, row.longitude), (row.d_to_e2_1, row.d_to_e2_2)).m
attr['dist_to_e2'] = attr.apply(lambda r: distance(r), axis=1)

def distance(row):
    return geopy.distance.distance((row.latitude, row.longitude), (row.d_to_e3_1, row.d_to_e3_2)).m
attr['dist_to_e3'] = attr.apply(lambda r: distance(r), axis=1)

def min_distance(row):
    return min (row.dist_to_e0, row.dist_to_e1, row.dist_to_e2, row.dist_to_e3)
attr['min_distance'] = attr.apply(lambda r: min_distance(r), axis=1)

attr[['attraction', 'min_distance']].copy()

Tags: python, pandas, distance, minimum

Solution

I did the following:

for att in attr.index:
    distances = []
    for ex in exits.index:
        # geopy.distance.distance returns a Distance object; .m converts it to meters
        distances.append(geopy.distance.distance(attr.loc[att, ['latitude','longitude']], exits.loc[ex,['latitude','longitude']]).m)
    # keep only the distance to the nearest exit
    attr.loc[att, 'min_distance'] = min(distances)

attr

    attraction      latitude    longitude   min_distance
0   circuit         53.35923    83.71394    73.789480
1   roller coaster  53.35958    83.71256    68.137324
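
For larger tables, the nested loop can be replaced by a vectorized calculation. The following is a minimal sketch and not part of the original answer. It swaps geopy's geodesic distance for the haversine formula on a sphere, so its results typically differ from geopy's by a fraction of a percent. The helper name min_haversine_m and the constant EARTH_RADIUS_M are assumptions introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Mean Earth radius in meters; an assumption of the spherical (haversine) model.
EARTH_RADIUS_M = 6_371_000

def min_haversine_m(points, targets):
    """For each row of `points`, return the haversine distance in meters
    to the nearest row of `targets` (hypothetical helper)."""
    # Column vectors for points and row vectors for targets, so broadcasting
    # produces a (len(points), len(targets)) matrix of pairwise values.
    lat1 = np.radians(points['latitude'].to_numpy())[:, None]
    lon1 = np.radians(points['longitude'].to_numpy())[:, None]
    lat2 = np.radians(targets['latitude'].to_numpy())[None, :]
    lon2 = np.radians(targets['longitude'].to_numpy())[None, :]

    # Haversine formula applied to every (point, target) pair at once.
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    pairwise = 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

    # Smallest distance in each row, i.e. the nearest target per point.
    return pairwise.min(axis=1)

attr = pd.DataFrame(
    {'attraction': ['circuit', 'roller coaster'],
     'latitude': [53.35923, 53.35958],
     'longitude': [83.71394, 83.71256]})
exits = pd.DataFrame(
    {'exits': ['exit', 'exit2', 'exit3', 'exit4'],
     'latitude': [53.35911, 53.3606, 53.35953, 53.3603],
     'longitude': [83.71503, 83.71407, 83.71154, 83.71216]})

attr['min_distance'] = min_haversine_m(attr, exits)
```

With the data from the question, attr['min_distance'] then holds one distance in meters per attraction. If ellipsoidal accuracy matters, geopy remains the safer choice; the sketch trades a little precision for avoiding a Python-level loop over every pair.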
