algorithm - 给定运行时数据,如何知道排序程序是使用冒泡排序还是插入排序?
问题描述
我测量了一个排序程序/算法,并根据运行时数据,我将其缩小为两种排序算法——冒泡排序和插入排序。
有没有办法确定它是哪一个?当然不知道代码。
它们都具有相同的时间复杂度,我没有想法。
时间复杂度数据:
- 排序:O(n) 1000 个数字所用的时间 = 0.0047 秒
- 随机未排序:O(n^2) 1000 个数字所用的时间 = 0.021s
- 降序:O(n^2) 1000 个数字所用的时间 = 0.04s
提前致谢!
解决方案
您的 1000 个排序元素太少
测量的时间太短而不能代表有效的测量(因为大多数时间可能不会被排序本身使用,而是初始化窗口、打开文件等......)。
您需要至少 100 毫秒或更长的时间(1 秒是理想的)。
如果您有权访问正在排序的数据
您可以引入对每种类型的排序都具有挑战性的数据集(并且从时间推断使用的算法)......例如,对于以相反顺序排序的数组来说,冒泡排序是最慢的......所以通过升序和降序和随机的排序数据并比较时间。让我们打电话给时间
tasc,tdes,trnd
并假设升序排序,那么如果涉及冒泡排序,它应该是:tasc O(n) < trnd < tdes O(n^2)
所以:
tasc*n == tdes + margin_of error
所以只是测试
tdes/tasc
接近n
......有一些错误余地......所以你只需要创建一个样本数据,它对于特定类型的排序而不是其他类型的排序很难......并且从时间开始检测是否是这种情况,直到你发现使用了算法。
这里有一些数据(所有时间都在
[ms]
)我在我的冒泡排序和 asc 排序数据上测试过:n tasc tdesc tasc*n 1000 0.00321 2.96147 3.205750 2000 0.00609 11.76799 12.181855 4000 0.01186 45.58834 47.445111
如果我们有复杂性的运行时更清楚
O(n)
t(O(n)) = c*n
转换为具有复杂性的运行时
O(n^2)
(假设相同的常量 timec
):t(O(n^2)) = c*n*n = t(O(n)) * n
通过这种方式,您可以比较具有不同复杂性的时间,您只需将所有测量的时间转换为单个常见的复杂性......
如果您可以选择排序的数据大小
然后正如评论中提到的那样,您可以推断时间的增长率随着增加
n
(加倍)的增加,您可以估计复杂性并从中可以看出使用了哪种算法。因此,让我们假设从#2开始测量的时间,那么对于 tasc ( )
O(n)
,恒定时间c
应该是相同的O(n)
:n tasc c=tasc/n 1000 0.00321 0.000003210 2000 0.00609 0.000003045 4000 0.01186 0.000002965
对于 tdesc (
O(n^2)
):n tdesc tdesc/n^2 1000 2.96147 0.00000296147000 2000 11.76799 0.00000294199750 4000 45.58834 0.00000284927125
正如您所看到的
c
,这两次都或多或少相同,tasc,tdesc
这意味着它们符合估计的复杂性O(n),O(n^2)
然而,如果不知道被测试的应用程序做了什么,就很难确定排序之前可能会进行处理......例如,可能会扫描数据以检测表格(排序的、随机的、几乎排序的......),这是可行O(n)
的结果以及数据大小可能会选择使用哪种排序算法...因此您的测量可能会测量不同的例程,从而使结果无效...
[edit1] 我有一个疯狂的想法,即自动检测复杂性
只需通过测试常数时间常数在所有测量时间与它们相应的时间之间是否或多或少相同n
......这里简单的C++/VCL代码:
//$$---- Form CPP ----
//---------------------------------------------------------------------------
#include <vcl.h>
#include <math.h>
#pragma hdrstop
#include "Unit1.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
double factorial[]= // n[-],t[ms]
{
11,0.008,
12,0.012,
13,0.013,
14,0.014,
15,0.016,
16,0.014,
17,0.015,
18,0.017,
19,0.019,
20,0.016,
21,0.017,
22,0.019,
23,0.021,
24,0.023,
25,0.025,
26,0.027,
27,0.029,
28,0.032,
29,0.034,
30,0.037,
31,0.039,
32,0.034,
33,0.037,
34,0.039,
35,0.041,
36,0.039,
37,0.041,
38,0.044,
39,0.046,
40,0.041,
41,0.044,
42,0.046,
43,0.049,
44,0.048,
45,0.050,
46,0.054,
47,0.056,
48,0.056,
49,0.060,
50,0.063,
51,0.066,
52,0.065,
53,0.069,
54,0.072,
55,0.076,
56,0.077,
57,0.162,
58,0.095,
59,0.093,
60,0.089,
61,0.093,
62,0.098,
63,0.096,
64,0.090,
65,0.100,
66,0.104,
67,0.111,
68,0.100,
69,0.121,
70,0.109,
71,0.119,
72,0.104,
73,0.124,
74,0.113,
75,0.118,
76,0.118,
77,0.123,
78,0.129,
79,0.133,
80,0.121,
81,0.119,
82,0.131,
83,0.150,
84,0.141,
85,0.148,
86,0.154,
87,0.163,
88,0.211,
89,0.151,
90,0.157,
91,0.166,
92,0.161,
93,0.169,
94,0.173,
95,0.188,
96,0.181,
97,0.187,
98,0.194,
99,0.201,
100,0.185,
101,0.191,
102,0.202,
103,0.207,
104,0.242,
105,0.210,
106,0.215,
107,0.221,
108,0.217,
109,0.226,
110,0.232,
111,0.240,
112,0.213,
113,0.231,
114,0.240,
115,0.252,
116,0.248,
117,0.598,
118,0.259,
119,0.261,
120,0.254,
121,0.263,
122,0.270,
123,0.281,
124,0.290,
125,0.322,
126,0.303,
127,0.313,
128,0.307,
0,0.000
};
//---------------------------------------------------------------------------
double sort_asc[]=
{
1000,0.00321,
2000,0.00609,
4000,0.01186,
0,0.000
};
//---------------------------------------------------------------------------
double sort_desc[]=
{
1000, 2.96147,
2000,11.76799,
4000,45.58834,
0,0.000
};
//---------------------------------------------------------------------------
double sort_rand[]=
{
1000, 3.205750,
2000,12.181855,
4000,47.445111,
0,0.000
};
//---------------------------------------------------------------------------
double div(double a,double b){ return (fabs(b)>1e-10)?a/b:0.0; }
//---------------------------------------------------------------------------
AnsiString get_complexity(double *dat) // expect dat[] = { n0,t(n0), n1,t(n1), ... , 0,0 }
{
AnsiString O="O(?)";
int i,e;
double t,n,c,c0,c1,a,dc=1e+10;
#define testbeg for (e=1,i=0;dat[i]>0.5;){ n=dat[i]; i++; t=dat[i]; i++;
#define testend(s) if ((c<=0.0)||(n<2.0)) continue; if (e){ e=0; c0=c; c1=c; } if (c0>c) c0=c; if (c1<c) c1=c; } a=fabs(1.0-div(c0,c1)); if (dc>=a){ dc=a; O=s; }
testbeg; c=div(t,n); testend("O(n)");
testbeg; c=div(t,n*n); testend("O(n^2)");
testbeg; c=div(t,n*n*n); testend("O(n^3)");
testbeg; c=div(t,n*n*n*n); testend("O(n^4)");
testbeg; a=log(n); c=div(t,a); testend("O(log(n))");
testbeg; a=log(n); c=div(t,a*a); testend("O(log^2(n))");
testbeg; a=log(n); c=div(t,a*a*a); testend("O(log^3(n))");
testbeg; a=log(n); c=div(t,a*a*a*a); testend("O(log^4(n))");
testbeg; a=log(n); c=div(t,n*a); testend("O(n.log(n))");
testbeg; a=log(n); c=div(t,n*n*a); testend("O(n^2.log(n))");
testbeg; a=log(n); c=div(t,n*n*n*a); testend("O(n^3.log(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a); testend("O(n^4.log(n))");
testbeg; a=log(n); c=div(t,n*a*a); testend("O(n.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*a*a); testend("O(n^2.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a); testend("O(n^3.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a); testend("O(n^4.log^2(n))");
testbeg; a=log(n); c=div(t,n*a*a*a); testend("O(n.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*a*a*a); testend("O(n^2.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a*a); testend("O(n^3.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a*a); testend("O(n^4.log^3(n))");
testbeg; a=log(n); c=div(t,n*a*a*a*a); testend("O(n.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*a*a*a*a); testend("O(n^2.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a*a*a); testend("O(n^3.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a*a*a); testend("O(n^4.log^4(n))");
#undef testend
#undef testbeg
return O+AnsiString().sprintf(" error = %.6lf",dc);
}
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner):TForm(Owner)
{
mm_log->Lines->Clear();
mm_log->Lines->Add("factorial "+get_complexity(factorial));
mm_log->Lines->Add("sort asc "+get_complexity(sort_asc));
mm_log->Lines->Add("sort desc "+get_complexity(sort_desc));
mm_log->Lines->Add("sort rand "+get_complexity(sort_rand));
}
//-------------------------------------------------------------------------
使用我的快速精确 bigint 阶乘的相关时间测量,我只使用了 8ms 以上的较大时间,以及上面输出的排序测量:
factorial O(n.log^2(n)) error = 0.665782
sort asc O(n) error = 0.076324
sort desc O(n^2) error = 0.037886
sort rand O(n^2) error = 0.075000
该代码仅测试少数支持的复杂性并输出具有最低错误的代码(c
不同之间的恒定时间变化n
)...
只需忽略 VCL 的东西并将 AnsiString 转换为您想要的任何字符串或输出...
推荐阅读
- python - 在 Gensim 中评估 TF-IDF 的有效性。为什么结果列表不完整?
- angular - 如何在 Nest.js 中获取用户信息?
- c# - 如何在插入语句中使用选择子查询?
- c# - 找不到源类型“TemplateBodyResponse”的查询模式的实现。未找到“选择”
- python - 如何解析没有明确分隔符的 txt.file
- java - 在没有 Main 的情况下调试或修补 .jar 文件
- python - 如何修复错误“_core.QgsProcessingException:执行算法时出错。”
- regex - 在 Oracle 中替换子字符串时如何替换转义特殊字符?
- python - 如何在谷歌的广度和深度模型中实现跨产品转换
- pandas - 如何使我的 x 标签在几个月内增加?