Computing the Euclidean distance to three means

Problem description

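I'm having trouble computing the Euclidean distance.

I then tried the function below, which gives me this error:

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I need it for a hard-coded K-means algorithm.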

def euclideanDist(df,pointIDX,mean_1,mean_2,mean_3):

    point = df.iloc[pointIDX][['Shoe_Size','Height']].values
    mean_1 = mean_1[['Shoe_Size','Height']].values
    mean_2 = mean_2[['Shoe_Size','Height']].values
    mean_3 = mean_3[['Shoe_Size','Height']].values

    dist_Total_1 = sum([a-b for a,b in zip(point,mean_1)])**2
    dist_Total_2 = sum([a-b for a,b in zip(point,mean_2)])**2
    dist_Total_3 = sum([a-b for a,b in zip(point,mean_3)])**2

    if dist_Total_1 < dist_Total_2 & dist_Total_3: 
        df.loc[pointIDX,'class'] = 1

    elif dist_Total_2 < dist_Total_3 > dist_Total_1:
        df.loc[pointIDX, "class"] = 2

    else:
        df.loc[pointIDX,'class'] = 3

    return df

Tags: python, pandas, dataframe

Solution


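There are a couple of problems with your conditions here. The & operator is a bitwise AND and binds more tightly than <, so the first test is evaluated as dist_Total_1 < (dist_Total_2 & dist_Total_3), which is what raises the TypeError. The chained comparison in the elif also never compares dist_Total_2 with dist_Total_1 directly.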

if dist_Total_1 < dist_Total_2 & dist_Total_3: 
    df.loc[pointIDX,'class'] = 1

elif dist_Total_2 < dist_Total_3 > dist_Total_1:
    df.loc[pointIDX, "class"] = 2

I believe what you actually want is:

if dist_Total_1 < dist_Total_2 and dist_Total_1 < dist_Total_3: 
    df.loc[pointIDX,'class'] = 1

elif dist_Total_2 < dist_Total_3 and dist_Total_2 < dist_Total_1:
    df.loc[pointIDX, "class"] = 2
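As a minimal sketch of why the original conditions misbehave, the snippet below reproduces the TypeError and shows how the chained comparison is evaluated. The three distances are made-up values chosen for illustration, not numbers from the question.

```python
import numpy as np

# Hypothetical distances, chosen only for illustration.
d1, d2, d3 = np.float64(1.5), np.float64(2.0), np.float64(3.0)

# `d1 < d2 & d3` is parsed as `d1 < (d2 & d3)`; bitwise AND is not
# defined for floats, so NumPy raises a TypeError.
try:
    d1 < d2 & d3
    raised = False
except TypeError:
    raised = True

# `x < y > z` is a chained comparison: it means `x < y and y > z`,
# so it never compares x with z.
chained = d2 < d3 > d1          # (d2 < d3) and (d3 > d1)
intended = d2 < d3 and d2 < d1  # d2 is the smallest of the three

print(raised, bool(chained), bool(intended))  # True True False
```

Here d1 is the smallest distance, yet the chained form still evaluates to True for the second branch, whereas the explicit form with "and" correctly evaluates to False.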

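Your distance calculation also doesn't match my understanding of Euclidean distance: it squares the sum of the differences instead of summing the squared differences and taking the square root. Perhaps this instead: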

dist_Total_1 = sum([(a-b)**2 for a,b in zip(point,mean_1)])**0.5

and likewise for dist_Total_2 and dist_Total_3.
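Putting both fixes together, the function could look roughly like the sketch below. It assumes the three means are pandas Series with Shoe_Size and Height entries, as the indexing in the question suggests. The name assign_cluster, the use of np.linalg.norm and np.argmin, and the sample data are illustrative choices, not part of the original code.

```python
import numpy as np
import pandas as pd

FEATURES = ['Shoe_Size', 'Height']

def assign_cluster(df, pointIDX, mean_1, mean_2, mean_3):
    """Label row `pointIDX` with the class (1, 2 or 3) of the nearest mean."""
    point = df.iloc[pointIDX][FEATURES].to_numpy(dtype=float)
    means = [m[FEATURES].to_numpy(dtype=float) for m in (mean_1, mean_2, mean_3)]

    # Euclidean distance: square root of the sum of squared differences.
    dists = [np.linalg.norm(point - m) for m in means]

    # argmin gives the position (0, 1 or 2) of the smallest distance.
    # df.index[pointIDX] turns the positional index into a label for .loc.
    df.loc[df.index[pointIDX], 'class'] = int(np.argmin(dists)) + 1
    return df

# Contrast the two formulas on a 3-4-5 triangle (illustrative values).
p, m = np.array([3.0, 4.0]), np.array([0.0, 0.0])
wrong = sum([a - b for a, b in zip(p, m)]) ** 2           # (3 + 4)**2 = 49.0
right = sum([(a - b) ** 2 for a, b in zip(p, m)]) ** 0.5  # sqrt(9 + 16) = 5.0

# Made-up sample data: each row lies closest to a different mean.
df = pd.DataFrame({'Shoe_Size': [38.0, 44.0, 41.0],
                   'Height': [160.0, 185.0, 172.0]})
m1 = pd.Series({'Shoe_Size': 38.0, 'Height': 161.0})
m2 = pd.Series({'Shoe_Size': 41.0, 'Height': 173.0})
m3 = pd.Series({'Shoe_Size': 44.0, 'Height': 184.0})

for i in range(len(df)):
    df = assign_cluster(df, i, m1, m2, m3)

labels = [int(c) for c in df['class']]
print(labels)
```

Because the square root is monotonic, comparing squared distances would pick the same nearest mean, so the root can be skipped if only the class assignment matters.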

