首页 > 解决方案 > 为什么在这个问题中 4-pass 解决方案的性能比 1-pass 解决方案更快?

问题描述

问题指出:给定方向(U p、D own、L eft、R ight)String moves并从原点开始,最终结果会是原点吗?

一个明显的解决方案是简单地设置两个变量(用于跟踪垂直和水平移动),并查看在所有方向操作后它们是否变为零:

public boolean judgeCircle(String moves) {
    char[] chars = moves.toCharArray();
    int vertical = 0;
    int horizontal = 0;

    for(int i = 0; i < chars.length; i++){
        char c = chars[i];
        switch(c){
            case 'U':
                vertical ++;
                break;
            case 'D':
                vertical --;
                break;
            case 'L': 
                horizontal --;
                break;
            case 'R': 
                horizontal ++;
                break;
        }
    }
    return (vertical == 0) && (horizontal == 0);
}

对于 Leetcode 的测试用例,该算法在大约 8毫秒内找到了正确的解决方案。

但是,扫描所有操作 4 次的解决方案仅在~1ms内找到解决方案:

public boolean judgeCircle(String moves) {
 int x = charCount(moves, 'R') - charCount(moves, 'L');
    int y = charCount(moves, 'U') - charCount(moves, 'D');

    return x == 0 && y == 0; 
}

private int charCount(String moves, char c) {
    int count = 0;
    for(int i=0; i<moves.length(); i++) {
        if(moves.charAt(i) == c) {
            count++; 
        }
    }
    return count;   
}

我将这两种解决方案重复了大约 10 次,结果始终是一致的。但是,为什么扫描moves4 次的算法比只通过一次就可以找到答案的算法运行得更快(快 4 倍!)?

标签: javaalgorithm

解决方案


我调整了您的示例,以便所有算法都适用于相同的数据。我还在if实现中添加了另一个变体。

@State(Scope.Thread)
public class ForVsSwitch {
    private static final int MOVES_LENGTH = 1024;
    private static final char[] COMMANDS = { 'U', 'D', 'L', 'R'};

    private char[] moves;

    @Setup
    public void prepare(){
        Random random = new Random();
        moves = new char[MOVES_LENGTH];
        for(int i=0; i< MOVES_LENGTH; i++) {
            moves[i] = COMMANDS[random.nextInt(4)];
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.SampleTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Warmup(iterations = 3)
    @Measurement(iterations = 5)
    public void withSwitch() {
        judgeCircleWithSwitch(moves);
    }

    @Benchmark
    @BenchmarkMode(Mode.SampleTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Warmup(iterations = 3)
    @Measurement(iterations = 5)
    public void withFor() {
        judgeCircleWithFor(moves);
    }

    @Benchmark
    @BenchmarkMode(Mode.SampleTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Warmup(iterations = 3)
    @Measurement(iterations = 5)
    public void withIf() {
        judgeCircleWithIf(moves);
    }

    private boolean judgeCircleWithSwitch(char[] moves) {
        int vertical = 0;
        int horizontal = 0;

        for(int i = 0; i < moves.length; i++){
            char c = moves[i];
            switch(c){
                case 'U':
                    vertical ++;
                    break;
                case 'D':
                    vertical --;
                    break;
                case 'L':
                    horizontal --;
                    break;
                case 'R':
                    horizontal ++;
                    break;
            }
        }
        return (vertical == 0) && (horizontal == 0);
    }

    private boolean judgeCircleWithIf(char[] moves) {
        int vertical = 0;
        int horizontal = 0;

        for(int i = 0; i < moves.length; i++){
            char c = moves[i];
            if(c == 'U') {
                vertical++;
            } else if(c == 'D') {
                vertical--;
            } else if(c == 'L') {
                horizontal--;
            } else if(c == 'R') {
                horizontal ++;
            }
        }
        return (vertical == 0) && (horizontal == 0);
    }

    private boolean judgeCircleWithFor(char[] moves) {
        int x = charCount(moves, 'R') - charCount(moves, 'L');
        int y = charCount(moves, 'U') - charCount(moves, 'D');

        return x == 0 && y == 0;
    }

    private int charCount(char[] moves, char c) {
        int count = 0;
        for(int i=0; i<moves.length; i++) {
            if(moves[i] == c) {
                count++;
            }
        }
        return count;
    }
}

如果我正确阅读了结果,99.9% 的执行速度都快于 27 毫秒到 29 毫秒,对吧?算法之间似乎没有区别。

Benchmark                                    Mode      Cnt   Score    Error  Units
ForVsSwitch.withFor                        sample  5680658   0,003 ±  0,001  ms/op
ForVsSwitch.withFor:withFor·p0.00          sample            0,002           ms/op
ForVsSwitch.withFor:withFor·p0.50          sample            0,003           ms/op
ForVsSwitch.withFor:withFor·p0.90          sample            0,003           ms/op
ForVsSwitch.withFor:withFor·p0.95          sample            0,004           ms/op
ForVsSwitch.withFor:withFor·p0.99          sample            0,019           ms/op
ForVsSwitch.withFor:withFor·p0.999         sample            0,029           ms/op
ForVsSwitch.withFor:withFor·p0.9999        sample            0,075           ms/op
ForVsSwitch.withFor:withFor·p1.00          sample            2,912           ms/op
ForVsSwitch.withIf                         sample  8903669   0,002 ±  0,001  ms/op
ForVsSwitch.withIf:withIf·p0.00            sample            0,001           ms/op
ForVsSwitch.withIf:withIf·p0.50            sample            0,002           ms/op
ForVsSwitch.withIf:withIf·p0.90            sample            0,002           ms/op
ForVsSwitch.withIf:withIf·p0.95            sample            0,003           ms/op
ForVsSwitch.withIf:withIf·p0.99            sample            0,005           ms/op
ForVsSwitch.withIf:withIf·p0.999           sample            0,027           ms/op
ForVsSwitch.withIf:withIf·p0.9999          sample            0,051           ms/op
ForVsSwitch.withIf:withIf·p1.00            sample            5,202           ms/op
ForVsSwitch.withSwitch                     sample  8225249   0,002 ±  0,001  ms/op
ForVsSwitch.withSwitch:withSwitch·p0.00    sample            0,001           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.50    sample            0,002           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.90    sample            0,002           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.95    sample            0,003           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.99    sample            0,018           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.999   sample            0,027           ms/op
ForVsSwitch.withSwitch:withSwitch·p0.9999  sample            0,071           ms/op
ForVsSwitch.withSwitch:withSwitch·p1.00    sample           22,610           ms/op

编辑:

我不能证实你的说法成立。我简化了这个例子。我使用静态列表作为两种算法的输入。我不做热身,只测量一次执行。正如预期的那样,4-pass 比 1-pass 更昂贵。我真的不知道你的网站在测量什么。

@State(Scope.Thread)
public class ForVsSwitch {
    private char[] moves = {'U', 'D', 'L', ...};

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Warmup(iterations = 0)
    @Measurement(iterations = 1, batchSize = 1)
    @Fork(value = 1, warmups = 0)
    public void withSwitch() {
        judgeCircleWithSwitch();
    }

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Warmup(iterations = 0)
    @Measurement(iterations = 1, batchSize = 1)
    @Fork(value = 1, warmups = 0)
    public void withFor() {
        judgeCircleWithFor();
    }

    private boolean judgeCircleWithSwitch() {
        int vertical = 0;
        int horizontal = 0;

        for(int i = 0; i < moves.length; i++){
            char c = moves[i];
            switch(c){
                case 'U':
                    vertical ++;
                    break;
                case 'D':
                    vertical --;
                    break;
                case 'L':
                    horizontal --;
                    break;
                case 'R':
                    horizontal ++;
                    break;
            }
        }
        return (vertical == 0) && (horizontal == 0);
    }

    private boolean judgeCircleWithFor() {
        int x = charCount(moves, 'R') - charCount(moves, 'L');
        int y = charCount(moves, 'U') - charCount(moves, 'D');

        return x == 0 && y == 0;
    }

    private int charCount(char[] moves, char c) {
        int count = 0;
        for(int i=0; i<moves.length; i++) {
            if(moves[i] == c) {
                count++;
            }
        }
        return count;
    }
}

for 循环比 switch 更昂贵。但正如其他评论中指出的那样,运行它一次并没有可靠的性能分析。

Benchmark               Mode  Cnt  Score   Error  Units
ForVsSwitch.withFor       ss       0,577          ms/op
ForVsSwitch.withSwitch    ss       0,241          ms/op

推荐阅读