c++ - Performance difference of std::normal_distribution between g++ and clang++
Problem description
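#include <random>
#include <vector>

int main() {
    // Reserve up front so reallocation does not distort the timing.
    std::vector<double> norms;
    norms.reserve(1000000);

    std::mt19937_64 mtEngine(42);   // fixed seed for reproducible runs
    std::normal_distribution<> nd;  // defaults: mean 0, standard deviation 1
    for (int i = 0; i != 1000000; ++i) {
        norms.push_back(nd(mtEngine));
    }
}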
The binaries produced from this code by g++ -std=c++17 -O3 (version 10.2.0) and clang++ -std=c++17 -O3 (version 11.0.0) differ significantly in performance.
$ time ./random_clang
./random_clang 0.11s user 0.00s system 99% cpu 0.113 total
$ time ./random_gcc
./random_gcc 0.03s user 0.00s system 99% cpu 0.032 total
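The `time` figures above cover the whole process, including startup and dynamic linking. As a rough cross-check, the loop alone can be timed in-process with `std::chrono::steady_clock`. This is only a minimal sketch: the helper name `time_normal_fill` is hypothetical and not part of the original program.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the original post): fills `out` with `count`
// standard-normal samples and returns the wall-clock time of the loop alone,
// in milliseconds.
double time_normal_fill(std::size_t count, std::vector<double>& out) {
    assert(count > 0);
    out.clear();
    out.reserve(count);  // keep reallocation out of the measurement

    std::mt19937_64 engine(42);       // same seed as the original program
    std::normal_distribution<> dist;  // mean 0, standard deviation 1

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i != count; ++i) {
        out.push_back(dist(engine));
    }
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_normal_fill(1000000, norms)` from `main` and printing the result for each build should show whether the gap comes from the sampling loop itself. Returning the samples through `out` keeps the compiler from discarding the loop.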
Below is what I found with Compiler Explorer and valgrind --tool=callgrind.
./random_clang
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
278,231,181 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
135,606,558 ???:double std::generate_canonical<double, 53ul, std::mersenne_twister_engine<unsigned long, 64ul, 312ul, 156ul, 31ul, 13043109905998158313ul, 29ul, 6148914691236517205ul, 17ul, 8202884508482404352ul, 37ul, 18444473444759240704ul, 43ul, 6364136223846793005ul> >(std::mersenne_twister_engine<unsigned long, 64ul, 312ul, 156ul, 31ul, 13043109905998158313ul, 29ul, 6148914691236517205ul, 17ul, 8202884508482404352ul, 37ul, 18444473444759240704ul, 43ul, 6364136223846793005ul>&) [/home/xxx/EffectiveCpp/test/random_clang]
53,449,536 /build/glibc-eX1tMB/glibc-2.31/math/../sysdeps/x86_64/fpu/e_logl.S:__ieee754_logl [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
32,096,514 ???:main [/home/xxx/EffectiveCpp/test/random_clang]
27,997,376 /build/glibc-eX1tMB/glibc-2.31/math/w_logl_compat.c:logl [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
22,905,902 /build/glibc-eX1tMB/glibc-2.31/math/../sysdeps/ieee754/dbl-64/e_log.c:__ieee754_log_fma [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
2,500,000 /build/glibc-eX1tMB/glibc-2.31/math/./w_log_template.c:log@@GLIBC_2.29 [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
1,000,000 ???:0x0000000004a322f0 [???]
./random_gcc
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
125,607,194 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
75,746,682 ???:main [/home/xxx/EffectiveCpp/test/random_gcc]
22,905,902 /build/glibc-eX1tMB/glibc-2.31/math/../sysdeps/ieee754/dbl-64/e_log.c:__ieee754_log_fma [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
19,769,747 ???:std::mersenne_twister_engine<unsigned long, 64ul, 312ul, 156ul, 31ul, 13043109905998158313ul, 29ul, 6148914691236517205ul, 17ul, 8202884508482404352ul, 37ul, 18444473444759240704ul, 43ul, 6364136223846793005ul>::_M_gen_rand() [/home/xxx/EffectiveCpp/test/random_gcc]
2,500,000 /build/glibc-eX1tMB/glibc-2.31/math/./w_log_template.c:log@@GLIBC_2.29 [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
1,000,000 ???:0x00000000001090f0 [???]
1,000,000 ???:0x0000000004a322f0 [???]
916,425 /build/glibc-eX1tMB/glibc-2.31/elf/dl-lookup.c:_dl_lookup_symbol_x [/usr/lib/x86_64-linux-gnu/ld-2.31.so]
544,815 /build/glibc-eX1tMB/glibc-2.31/elf/dl-lookup.c:do_lookup_x [/usr/lib/x86_64-linux-gnu/ld-2.31.so]
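The clang++ listing attributes about half of its instructions to a non-inlined std::generate_canonical, while the g++ listing has no separate entry for it. One way to look at that cost in isolation is to time raw engine calls against std::generate_canonical<double, 53> on the same engine type. The sketch below assumes nothing beyond the standard library; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical result type: timings in milliseconds plus checksums that keep
// the optimizer from removing the loops.
struct CanonicalTimings {
    double engine_ms;
    double canonical_ms;
    std::uint64_t engine_sum;
    double canonical_sum;
};

// Times `count` raw std::mt19937_64 draws, then `count` calls to
// std::generate_canonical<double, 53> on a freshly seeded engine.
CanonicalTimings time_generate_canonical(std::size_t count) {
    using clock = std::chrono::steady_clock;
    using millis = std::chrono::duration<double, std::milli>;
    assert(count > 0);

    CanonicalTimings t{};

    std::mt19937_64 raw_engine(42);
    auto start = clock::now();
    for (std::size_t i = 0; i != count; ++i) {
        t.engine_sum += raw_engine();  // unsigned wrap-around is fine here
    }
    t.engine_ms = millis(clock::now() - start).count();

    std::mt19937_64 canon_engine(42);
    start = clock::now();
    for (std::size_t i = 0; i != count; ++i) {
        t.canonical_sum += std::generate_canonical<double, 53>(canon_engine);
    }
    t.canonical_ms = millis(clock::now() - start).count();

    return t;
}
```

If `canonical_ms` is far larger than `engine_ms` in one build but not in the other, the conversion from engine output to a floating-point value is the likely hot spot rather than the Mersenne Twister itself.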
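Why does the clang++ version spend so much time in calls to std::generate_canonical? I have seen claims that g++ inlines more aggressively, but in my case changing the clang++ options (-mllvm -inline-threshold=10000) did not really help.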
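Is this a bug, or am I missing some other important compiler option? I know there are faster ways to generate normally distributed random variates, but I don't think this kind of speed inconsistency in a commonly used standard library function is normal.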
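For context on where the uniform draws and the logarithm come from, here is a sketch of the Marsaglia polar method, one well-known way to turn uniform variates into normal ones. It is an illustration only: the assumption that a standard library's std::normal_distribution follows a scheme of this general shape is mine, and the function name `polar_normal_pair` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <utility>

// Illustrative Marsaglia polar method (not code from libstdc++ or libc++).
// Returns two independent standard-normal samples per call.
template <class Engine>
std::pair<double, double> polar_normal_pair(Engine& engine) {
    double x = 0.0;
    double y = 0.0;
    double r2 = 0.0;
    do {
        // Two uniform draws mapped onto the square [-1, 1) x [-1, 1).
        x = 2.0 * std::generate_canonical<double, 53>(engine) - 1.0;
        y = 2.0 * std::generate_canonical<double, 53>(engine) - 1.0;
        r2 = x * x + y * y;
    } while (r2 >= 1.0 || r2 == 0.0);  // reject points outside the unit disc

    // One logarithm and one square root yield a pair of samples.
    const double scale = std::sqrt(-2.0 * std::log(r2) / r2);
    return {x * scale, y * scale};
}
```

In this scheme every accepted pair costs at least two uniform conversions and one call to std::log, so a slow uniform conversion would be multiplied across the whole run.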
Update: after I linked the clang++ version against libc++ with -stdlib=libc++ -lc++abi, its performance appears to be comparable to the original g++ version.
$ time ./random_perf
./random_perf 0.03s user 0.00s system 98% cpu 0.027 total
./random_perf
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
147,608,621 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
106,311,924 /usr/lib/llvm-10/bin/../include/c++/v1/random:double std::__1::normal_distribution<double>::operator()<std::__1::mersenne_twister_engine<unsigned long, 64ul, 312ul, 156ul, 31ul, 13043109905998158313ul, 29ul, 6148914691236517205ul, 17ul, 8202884508482404352ul, 37ul, 18444473444759240704ul, 43ul, 6364136223846793005ul> >(std::__1::mersenne_twister_engine<unsigned long, 64ul, 312ul, 156ul, 31ul, 13043109905998158313ul, 29ul, 6148914691236517205ul, 17ul, 8202884508482404352ul, 37ul, 18444473444759240704ul, 43ul, 6364136223846793005ul>&, std::__1::normal_distribution<double>::param_type const&) [/home/xxx/EffectiveCpp/bin/random_perf]
22,905,902 /build/glibc-eX1tMB/glibc-2.31/math/../sysdeps/ieee754/dbl-64/e_log.c:__ieee754_log_fma [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
6,000,007 /usr/lib/llvm-10/bin/../include/c++/v1/vector:main
3,003,122 /usr/lib/llvm-10/bin/../include/c++/v1/random:main
3,000,016 /home/xxx/EffectiveCpp/src/random_perf.cpp:main [/home/xxx/EffectiveCpp/bin/random_perf]
2,500,000 /build/glibc-eX1tMB/glibc-2.31/math/./w_log_template.c:log@@GLIBC_2.29 [/usr/lib/x86_64-linux-gnu/libm-2.31.so]
1,000,002 /usr/lib/llvm-10/bin/../include/c++/v1/memory:main
1,000,000 ???:0x000000000494b2f0 [???]
507,749 /build/glibc-eX1tMB/glibc-2.31/elf/dl-lookup.c:_dl_lookup_symbol_x [/usr/lib/x86_64-linux-gnu/ld-2.31.so]
Solution