Apply a function to each row of a DataFrame column

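Problem description

I want to compute the Euclidean distance for each row of a column, that is, the distance between the coordinates in each row and the coordinates in the previous row.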

def measure_distance(x,y):
    p1 = np.array([651700.453,4767830.552])
    p2 = np.array([651701.446,4767831.971])
    d=np.linalg.norm(p2-p1)

df['Desired Output/Distance between coords'] = df.apply(lambda row : add(measure_distance['A'], axis = 1)

This does not seem to work.

I want to apply the function above to df:

x           y              Desired Output/Distance between coords
651243.933  4766822.602 
651258.583  4766826.795    15.23823313
651261.454  4766827.617    2.986356476
651266.262  4766828.988    7.986005885
651269.14   4766829.809    2.992812223
651285.448  4766834.461    16.95853673
651298.459  4766838.172    13.5298796
651329.205  4766846.942    31.97232266
651334.422  4766848.43     5.425056037

Tags: python, pandas, numpy

Solution


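Convert 'x' and 'y' into one array '[x, y]' per row, then shift by one row to compute the difference between consecutive rows. Finally, apply the norm: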

import numpy as np

# Pack each row into an array [x, y], subtract the previous row, then take the norm
out = df[['x', 'y']].apply(np.array, axis=1)
df['dist'] = out.sub(out.shift()).apply(np.linalg.norm)

# OR (without intermediate variable)

df['dist'] = df[['x', 'y']].sub(df[['x', 'y']].shift()) \
                           .apply(tuple, axis=1).apply(np.linalg.norm)

Output:

>>> df
            x            y       dist
0  651243.933  4766822.602        NaN
1  651258.583  4766826.795  15.238233
2  651261.454  4766827.617   2.986356
3  651266.262  4766828.988   4.999650
4  651269.140  4766829.809   2.992812
5  651285.448  4766834.461  16.958537
6  651298.459  4766838.172  13.529880
7  651329.205  4766846.942  31.972323
8  651334.422  4766848.430   5.425056
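
For larger frames, the same distances can probably be computed without any row-wise apply. The following is a minimal sketch, not part of the original answer. It assumes the columns are named 'x' and 'y' as in the output above, and it rebuilds only the first five rows of the question's data so that it runs on its own. It swaps in DataFrame.diff and np.hypot in place of shift plus np.linalg.norm:

```python
import numpy as np
import pandas as pd

# First five coordinate pairs copied from the question's table (illustrative subset).
df = pd.DataFrame({
    "x": [651243.933, 651258.583, 651261.454, 651266.262, 651269.140],
    "y": [4766822.602, 4766826.795, 4766827.617, 4766828.988, 4766829.809],
})

# diff() subtracts the previous row from each row; the first row becomes NaN.
deltas = df[["x", "y"]].diff()

# np.hypot computes sqrt(dx**2 + dy**2) element-wise, without a Python call per row.
df["dist"] = np.hypot(deltas["x"], deltas["y"])

print(df)
```

The first row has no predecessor, so its distance stays NaN, matching the output above.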
