首页 > 解决方案 > 关闭编译器优化?我用于评估算法顺序的 C# 代码返回 logN 或 N^3 而不是 N 用于简单循环

问题描述

作为学习练习,我正在编写一个库来评估算法的复杂性。我通过查看算法对给定输入 N 花费多长时间来做到这一点,然后进行多项式回归以查看算法是否最适合 N、N^2、log(N) 等。我编写了单元测试用例,它们似乎适用于 N^2 和 LogN。这是最简单的情况,N 让我很伤心。对于 N 阶算法,我使用以下内容:

uint LinearAlgorithm2(uint n)
    {
        uint returnValue = 7;
        for (uint i = 0; i < n; i++)
        {
            //Thread.Sleep(2);
            double y = _randomNumber.NextDouble(); // dummy calculation
            if (y < 0.0005)
            {
                returnValue = 1;
                //Console.WriteLine("y " + y + i);
            }
            else if (y < .05)
            {
                returnValue = 2;
            }
            else if (y < .5)
            {
                returnValue = 3;
            }
            else
            {
                returnValue = 7;
            }

        }
        return returnValue;
    }

我在里面放了那么多无意义的代码,只是因为我担心编译器可能一直在优化我的循环。无论如何,我认为循环只是一个从 0 到 n 的简单循环,因此这是一个算法或 N 阶。

我的单元测试代码是:

public void TestLinearAlgorithm2()
    {
        Evaluator evaluator = new Evaluator();
        var result = evaluator.Evaluate(LinearAlgorithm2, new List<double>() { 1000,1021, 1065, 1300, 1423, 1599,
            1683, 1691, 1692, 1696, 1699, 1705,1709, 1712, 1713, 1717, 1720,
            1722, 1822, 2000, 2050, 2090, 2500, 2666, 2700,2701, 2767, 2799, 2822, 2877,
            3000, 3100, 3109, 3112, 3117, 3200, 3211, 3216, 3219, 3232, 3500, 3666, 3777,
            4000, 4022, 4089, 4122, 4199, 4202, 4222, 5000 });
        var minKey = result.Aggregate((l, r) => l.Value < r.Value ? l : r).Key;
        Assert.IsTrue(minKey.ToString() == FunctionEnum.N.ToString());
    }

我把类 Evaluator 放在下面。也许在凝视之前,虽然我会问

1)你是否同意一个简单的循环 0 到 N 应该是 N 阶的复杂性?即完成算法的时间增加为 n(不是 nLogn 或 n^3 等)

2) 是否已经编写了一些库代码来评估算法复杂性?

3)我非常怀疑问题是优化之一。在 Visual Studio 中的 ProjectSettings->Build 下,我已取消选中“优化代码”。我还应该做什么?我怀疑编译器会歪曲结果的一个原因是我打印了 n 的各种输入值的时间。对于 1000(第一个条目),它是 2533,但对于 1021,它只有 415!我已将所有结果放在 Evaluator 类下面。

感谢您的任何想法,让我知道是否可以提供更多信息(Github 链接?)-Dave

Evaluator.cs 的代码

using MathNet.Numerics.LinearAlgebra;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

/// <summary>
/// Library to evaluate complexity of algorithm.
/// Pass in method and necessary data
/// There are methods to set the size of the test data
/// 
/// Evaluate for LogN, N, NLogN, N^2, N^3, 2^N
/// 
/// Should be able use ideas from
/// https://en.wikipedia.org/wiki/Polynomial_regression
/// to finish problem.  Next need matrix multiplication.
/// Or possibly use this:
/// https://www.codeproject.com/Articles/19032/C-Matrix-Library
/// or similar
/// </summary>
namespace BigOEstimator
{
    public enum FunctionEnum
    {
        Constant = 0,
        LogN = 1,
        N,
        NLogN,
        NSquared,
        NCubed,
        TwoToTheN
    }
    public class Evaluator
    {
        //private List<uint> _suggestedList = new List<uint>();
        private Dictionary<FunctionEnum, double> _results = new Dictionary<FunctionEnum, double>(); 
        public Evaluator()
        {

        }

        public Dictionary<FunctionEnum, double> Evaluate(Func<uint,uint> algorithm, IList<double> suggestedList)
        {
            Dictionary<FunctionEnum, double> results = new Dictionary<FunctionEnum, double>();
            Vector<double> answer = Vector<double>.Build.Dense(suggestedList.Count(), 0.0);
            for (int i = 0; i < suggestedList.Count(); i++)
            {
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Start();
                var result = algorithm((uint) suggestedList[i]);
                stopwatch.Stop();
                answer[i] = stopwatch.ElapsedTicks;

                Console.WriteLine($"Answer for index {suggestedList[i]} is {answer[i]}");
            }

            // linear case - N
            results[FunctionEnum.N] = CalculateResidual(Vector<double>.Build.DenseOfEnumerable(suggestedList), answer, d => d);
            // quadratic case - NSquared
            results[FunctionEnum.NSquared] = CalculateResidual(Vector<double>.Build.DenseOfEnumerable(suggestedList), answer, d => (d*d));
            // cubic case - NCubed
            results[FunctionEnum.NCubed] = CalculateResidual(Vector<double>.Build.DenseOfEnumerable(suggestedList), answer, d => (d * d * d));
            // NLogN case - NLogN
            results[FunctionEnum.NLogN] = CalculateResidual(Vector<double>.Build.DenseOfEnumerable(suggestedList), answer, d => (d * Math.Log(d)));
            // LogN case - LogN
            results[FunctionEnum.LogN] = CalculateResidual(Vector<double>.Build.DenseOfEnumerable(suggestedList), answer, d => ( Math.Log(d)));

            // following few lines are useful for unit tests. You get this by hitting 'Output' on test!
            var minKey = results.Aggregate((l, r) => l.Value < r.Value ? l : r).Key;
            Console.WriteLine("Minimum Value: Key: " + minKey.ToString() + ", Value: " + results[minKey]);
            foreach (var item in results)
            {
                Console.WriteLine("Test: " + item.Key + ", result: " + item.Value);
            }
            return results;
        }

        private double CalculateResidual(Vector<double> actualXs, Vector<double> actualYs, Func<double, double> transform)
        {

            Matrix<double> m = Matrix<double>.Build.Dense(actualXs.Count, 2, 0.0);
            for (int i = 0; i < m.RowCount; i++)
            {
                m[i, 0] = 1.0;
                m[i, 1] = transform((double)actualXs[i]);
            }
            Vector<double> betas = CalculateBetas(m, actualYs);
            Vector<double> estimatedYs = CalculateEstimatedYs(m, betas);
            return CalculatateSumOfResidualsSquared(actualYs, estimatedYs);

        }
        private double CalculateLinearResidual(Vector<double> actualXs, Vector<double> actualYs)
        {
            Matrix<double> m = Matrix<double>.Build.Dense(actualXs.Count, 2, 0.0);
            for (int i = 0; i < m.RowCount; i++)
            {
                m[i, 0] = 1.0;
                m[i, 1] = (double)actualXs[i];
            }
            Vector<double> betas = CalculateBetas(m, actualYs);
            Vector<double> estimatedYs = CalculateEstimatedYs(m, betas);
            return CalculatateSumOfResidualsSquared(actualYs, estimatedYs);
        }
        private Vector<double> CalculateBetas(Matrix<double> m, Vector<double> y)
        {
            return (m.Transpose() * m).Inverse() * m.Transpose() * y;
        }

        private Vector<double> CalculateEstimatedYs(Matrix<double> x, Vector<double> beta)
        {
            return x * beta;
        }

        private double CalculatateSumOfResidualsSquared(Vector<double> yReal, Vector<double> yEstimated)
        {
            return ((yReal - yEstimated).PointwisePower(2)).Sum();
        }


    }
}

一次单元测试的结果(注意差异,例如第一个!):

 Answer for index 1000 is 2533
Answer for index 1021 is 415
Answer for index 1065 is 375
Answer for index 1300 is 450
Answer for index 1423 is 494
Answer for index 1599 is 566
Answer for index 1683 is 427
Answer for index 1691 is 419
Answer for index 1692 is 413
Answer for index 1696 is 420
Answer for index 1699 is 420
Answer for index 1705 is 438
Answer for index 1709 is 595
Answer for index 1712 is 588
Answer for index 1713 is 426
Answer for index 1717 is 433
Answer for index 1720 is 421
Answer for index 1722 is 428
Answer for index 1822 is 453
Answer for index 2000 is 497
Answer for index 2050 is 518
Answer for index 2090 is 509
Answer for index 2500 is 617
Answer for index 2666 is 653
Answer for index 2700 is 673
Answer for index 2701 is 671
Answer for index 2767 is 690
Answer for index 2799 is 685
Answer for index 2822 is 723
Answer for index 2877 is 714
Answer for index 3000 is 746
Answer for index 3100 is 753
Answer for index 3109 is 754
Answer for index 3112 is 763
Answer for index 3117 is 2024
Answer for index 3200 is 772
Answer for index 3211 is 821
Answer for index 3216 is 802
Answer for index 3219 is 788
Answer for index 3232 is 775
Answer for index 3500 is 848
Answer for index 3666 is 896
Answer for index 3777 is 917
Answer for index 4000 is 976
Answer for index 4022 is 972
Answer for index 4089 is 1145
Answer for index 4122 is 1047
Answer for index 4199 is 1031
Answer for index 4202 is 1033
Answer for index 4222 is 1151
Answer for index 5000 is 1588
Minimum Value: Key: NCubed, Value: 5895501.06936747
Test: N, result: 6386524.27502984
Test: NSquared, result: 6024667.62732316
Test: NCubed, result: 5895501.06936747
Test: NLogN, result: 6332154.89282043
Test: LogN, result: 6969133.89207915

标签: c#algorithmperformancetime-complexitybig-o

解决方案


我怀疑您的根本问题是每个单独迭代的运行时间是如此之低,以至于您无法控制的其他因素(线程调度、缓存未命中等)正在导致显着的每次操作差异并主导执行时间。对于真正的 N^3 算法,相对较少的 N 仍然可以产生相当多的“循环”,这意味着操作成本的变化有机会平均下来。对于直接 O(N) 甚至 O(log(N)) 的事物,个体操作方差成为较小 N 的问题。

为了解决这个问题,您需要运行有效的算法以进行更多迭代,以使这些其他效果有时间平均化。这可能意味着必须在低 N 时评估您的初始结果,如果您发现它没有花费足够的时间变得有意义,则必须以不同的速率对其进行缩放。您可能希望扩大到需要整整几秒钟才能获得良好平均的范围,但您必须进行试验以确定仍然存在多少差异。


推荐阅读