How to: square root for cosine similarity within arrays (Java)

Problem description

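I am building a book recommendation system, and I run into a problem when I take the square root of the sum of squares to determine similarity. I don't believe it is computing the square root over the entire contents of each array.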

The user is prompted with these twenty books and enters a rating from 1 to 5 depending on how much they liked each book, or -1 if they have not read it.

Some of my score outputs are NaN, so I assume the loop just stops after the first element of the array.
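For reference, here is a minimal sketch (not part of the original post) of two ways a `double` computation in Java can produce NaN in this kind of code: taking `Math.sqrt` of a negative rating such as -1, and dividing zero by zero when a norm comes out as 0. The class and method names are hypothetical and chosen only for illustration.

```java
class NaNSources {
    // Math.sqrt of a negative double returns NaN rather than throwing.
    static double sqrtOfUnread() {
        return Math.sqrt(-1.0);
    }

    // A cosine-style ratio: 0.0 / 0.0 is NaN, which happens when the
    // dot product and the product of the norms are both zero.
    static double ratio(double dot, double normProduct) {
        return dot / normProduct;
    }

    public static void main(String[] args) {
        System.out.println(sqrtOfUnread());                // NaN
        System.out.println(ratio(0.0, 0.0));               // NaN
        System.out.println(Double.isNaN(ratio(0.0, 0.0))); // true
    }
}
```

A loop that stopped early would give a wrong number, not NaN, so checking for these two cases is usually a quicker way to track the problem down.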

I have tried rearranging the loops. I think the problem lies in the loop and in how it accesses the array.

Here is the CPU ratings file.

-1 1 1 4 1 3 3 1 2 3 4 -1 4 1 2 4 5 4 2 3
3 -1 2 3 -1 2 5 -1 3 3 5 2 2 1 2 3 5 3 4 2
-1 1 -1 4 1 3 5 2 1 5 3 -1 5 2 1 3 4 5 3 2
-1 -1 3 2 -1 5 5 2 2 4 4 2 3 2 -1 3 4 4 3 1
2 1 1 5 2 2 4 2 3 4 3 -1 5 2 2 5 3 5 2 1
3 -1 3 4 -1 2 5 -1 -1 4 3 -1 3 -1 2 5 5 5 4 2
4 -1 4 2 3 -1 1 3 4 -1 1 4 4 4 -1 2 -1 1 4 4
4 3 3 3 -1 2 2 4 3 -1 2 4 3 4 2 -1 -1 2 2 3
3 -1 3 -1 3 4 -1 5 5 -1 -1 -1 1 -1 -1 1 1 2 -1 5
3 -1 3 4 3 4 -1 5 5 2 3 3 4 1 1 -1 -1 -1 -1 4
4 -1 4 4 1 3 -1 5 4 -1 1 3 4 1 -1 1 -1 1 -1 5
5 -1 3 1 4 3 -1 5 4 1 3 2 1 -1 4 2 1 -1 2 4
3 -1 5 1 4 4 2 5 5 1 2 3 1 1 -1 1 -1 1 -1 5
4 1 5 4 3 -1 1 3 4 -1 -1 3 3 -1 1 1 2 -1 3 5
-1 1 1 3 -1 3 1 3 -1 -1 3 -1 5 2 2 1 4 -1 5 -1
3 -1 2 3 1 5 4 3 3 -1 5 -1 5 2 -1 4 4 3 3 3
1 1 1 3 2 4 1 -1 -1 -1 5 -1 3 -1 -1 1 -1 2 5 2
-1 2 3 5 -1 4 3 1 1 3 3 -1 4 -1 -1 4 3 2 5 1
-1 1 3 3 -1 3 3 1 -1 -1 3 -1 5 -1 -1 3 1 2 4 -1
3 -1 2 4 1 4 3 -1 2 3 4 1 3 -1 2 -1 4 3 5 -1
-1 1 3 5 -1 4 2 1 -1 3 3 2 3 2 -1 3 1 -1 3 -1
3 2 2 3 -1 5 -1 -1 2 3 4 -1 4 1 -1 -1 -1 -1 4 2
-1 3 -1 -1 4 -1 2 -1 2 2 2 5 -1 3 4 -1 -1 2 -1 2
1 4 3 -1 3 2 1 -1 -1 -1 1 3 1 3 3 1 -1 -1 -1 3
4 3 3 -1 4 2 -1 4 -1 -1 2 4 -1 3 4 2 -1 -1 -1 4
-1 5 1 -1 4 1 -1 3 2 2 -1 4 1 3 3 1 -1 -1 -1 3
-1 4 2 1 5 -1 -1 2 1 1 -1 5 -1 5 4 1 2 2 -1 1
2 5 2 -1 3 -1 -1 1 -1 2 -1 4 2 4 3 -1 2 1 -1 -1
2 5 1 1 4 -1 2 1 -1 -1 2 4 -1 3 4 2 -1 -1 -1 4

The square-root method

    public static double sqrtSquares(double []A) {
        // check A for -1
        double sum = 0;
        for(int i = 0; i<A.length; i++) {
            if(A[i] < 0 ) {
                A[i] = 0;
            }

            A[i] = Math.sqrt(A[i]);

            // calculate the running sum
            sum += A[i] * A[i] ;
        }
        return Math.sqrt(sum);
    }


    public static double similarity(double []A, double []B) {
        double sum = 0;
        double p1 = sqrtSquares(A);
        double p2 = sqrtSquares(B);

        for (int i=0; i<A.length; i++) {
            if (A[i]> 0) {
                if (B[i]> 0) {
                    sum += A[i]*B[i];
                }
            }
        }
        return sum/(p1*p2);
    }

Here is the main similarity scoring method

        double []scores = new double[30];

        for(int i = 0; i< 30; i++) {
            scores[i] = similarity(yourrating, pplratings[i]);
        }
        for(int k = 0; k <scores.length; k++) {
            System.out.println("SCORES ["+ k + "] "+scores[k]);
        }
        return scores;
    }

At the end of the method, it prints the 30 scores computed from the two arrays. Here are the incorrect results:

SCORES [0] 0.8345932239467343
SCORES [1] 0.8930284538287845
SCORES [2] 0.8859571865530889
SCORES [3] 0.8885782312086968
SCORES [4] 0.8775173350115371
SCORES [5] 0.9443223415026459
SCORES [6] 0.8250453876017286
SCORES [7] 0.8432290780758503
SCORES [8] 0.8862288358972311
SCORES [9] 0.7131697319344704
SCORES [10] 0.8182594818515688
SCORES [11] 0.8009904274635006
SCORES [12] 0.8637068116707501
SCORES [13] 0.8507371827482269
SCORES [14] 0.8370334932826162
SCORES [15] 0.775738787468209
SCORES [16] 0.880315376993314
SCORES [17] 0.7702419338621114
SCORES [18] 0.841428935139835
SCORES [19] 0.7527243233023518
SCORES [20] 0.8474342113753683
SCORES [21] 0.815084547094269
SCORES [22] 0.7592956404693546
SCORES [23] 0.7303452808509205
SCORES [24] 0.7808981699861455
SCORES [25] 0.7676319325573738
SCORES [26] 0.7782147276497292
SCORES [27] 0.7962287074180334
SCORES [28] 0.7538710355467405
SCORES [29] 0.7795507063811014

Edit: this code works now. Thanks, everyone, for the help.

Tags: java, arrays, nan, cosine-similarity

Solution


        public static double sqrtSquares(double []A) {
            double sum = 0;
            for(int i = 0; i<A.length; i++) {
                if(A[i] < 0 ) {
                    A[i] = 0;
                }
                sum += A[i]*A[i];    // calculate the running sum of squares
            }
            return Math.sqrt(sum);
        }
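The difference from the question's version is that each rating is no longer square-rooted before it is squared. In the question's code, `Math.sqrt(A[i])` followed by `A[i] * A[i]` adds roughly `A[i]` itself to the sum, and it also overwrites the array that `similarity` reads afterwards. The sketch below compares the two norms on a small made-up input. The class name, method names, and sample ratings are hypothetical, and the first method works on a copy so the comparison does not modify the caller's array.

```java
import java.util.Arrays;

class NormComparison {
    // Mirrors the question's loop: sqrt first, then square.
    // Works on a copy so the input array is left untouched.
    static double normAsInQuestion(double[] ratings) {
        double[] a = Arrays.copyOf(ratings, ratings.length);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] < 0) {
                a[i] = 0;
            }
            a[i] = Math.sqrt(a[i]);
            sum += a[i] * a[i];   // approximately the rating itself, not its square
        }
        return Math.sqrt(sum);
    }

    // Mirrors the answer's loop: sum of squares, then one sqrt at the end.
    static double normAsInAnswer(double[] ratings) {
        double sum = 0;
        for (double r : ratings) {
            double v = r < 0 ? 0 : r;  // -1 (unread) counts as 0
            sum += v * v;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] ratings = {3, 4, -1};  // made-up sample
        System.out.println(normAsInQuestion(ratings)); // about 2.65, i.e. sqrt(3 + 4)
        System.out.println(normAsInAnswer(ratings));   // 5.0, i.e. sqrt(9 + 16)
    }
}
```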

This follows the definition of cosine similarity: https://en.wikipedia.org/wiki/Cosine_similarity
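Under that definition, the similarity is the dot product of the two rating vectors divided by the product of their Euclidean norms: cos(θ) = (A · B) / (‖A‖ ‖B‖). A minimal sketch of the whole calculation in one pass is shown below. It does not modify its inputs and treats -1 (unread) as 0, as the code above does. Returning 0.0 when either norm is zero is an assumption made here to avoid NaN, not something stated in the original post, and the class and helper names are hypothetical.

```java
class CosineSimilarity {
    // Treat "not read" (-1) as a rating of 0.
    static double rated(double r) {
        return r < 0 ? 0 : r;
    }

    static double similarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("rating arrays must have the same length");
        }
        double dot = 0, sumA = 0, sumB = 0;
        for (int i = 0; i < a.length; i++) {
            double x = rated(a[i]);
            double y = rated(b[i]);
            dot += x * y;    // numerator: A . B
            sumA += x * x;   // running sum of squares for A
            sumB += y * y;   // running sum of squares for B
        }
        double denom = Math.sqrt(sumA) * Math.sqrt(sumB);
        // Assumption: report 0.0 instead of NaN when a user has no ratings.
        return denom == 0 ? 0.0 : dot / denom;
    }

    public static void main(String[] args) {
        double[] mine = {3, 4, -1};    // made-up sample ratings
        double[] same = {3, 4, -1};
        double[] unread = {-1, -1, -1};
        System.out.println(similarity(mine, same));   // 1.0
        System.out.println(similarity(mine, unread)); // 0.0
    }
}
```

Because the inputs are left unchanged, the same `yourrating` array can be compared against every row of `pplratings` without the result depending on earlier calls.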

