performance - AVX2 instruction latency and throughput
Question
I am interested in the performance characteristics of the following intrinsics/instructions:
- _mm256_andnot_si256 / vpandn
- _mm256_and_si256 / vpand
- _mm256_cmpgt_epi32 / vpcmpgtd
- and a few others.
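Unfortunately, the Intel Intrinsics Guide does not list latency and throughput figures for these intrinsics/instructions. Where can I find this information?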
Answer
Three sources of latency and throughput figures are:
- Agner Fog's instruction tables
- uops.info
- InstLatX64
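InstLatX64 lists many forms of each instruction (memory and/or register operands, different operand widths, and so on), but it gives no information on the number of micro-ops per execution port. For performance optimization, latency and throughput figures are not the whole story: how many micro-ops go to each execution port is also highly relevant. That information is provided by Agner Fog's instruction tables and by uops.info.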
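To illustrate why the per-port micro-op counts matter, here is a minimal sketch in C that estimates a lower bound on cycles per loop iteration from port pressure alone. The names `uop_group` and `port_bound`, and the limit of 8 execution ports, are assumptions made for this illustration. The sketch ignores latency chains, front-end limits, and memory effects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one instruction kind in a loop body. */
struct uop_group {
    unsigned count;      /* micro-ops of this kind per iteration */
    unsigned port_mask;  /* bit i set = the micro-op may issue on port i (ports 0..7) */
};

/* Count the set bits in the low 8 bits of m. */
static unsigned popcount8(unsigned m)
{
    unsigned n = 0;
    for (m &= 0xFFu; m != 0; m >>= 1)
        n += m & 1u;
    return n;
}

/*
 * Lower bound on cycles per iteration from port pressure:
 * for every non-empty subset S of ports, all micro-ops that can
 * only issue on ports inside S need at least uops / |S| cycles.
 * The bound is the maximum of that ratio over all subsets.
 * Groups with an empty mask or with ports outside 0..7 are ignored.
 */
double port_bound(const struct uop_group *g, size_t n)
{
    assert(g != NULL || n == 0);
    double best = 0.0;
    for (unsigned s = 1; s < 256; ++s) {
        unsigned uops = 0;
        for (size_t i = 0; i < n; ++i) {
            if (g[i].port_mask != 0 && (g[i].port_mask & ~s) == 0)
                uops += g[i].count;
        }
        double cycles = (double)uops / popcount8(s);
        if (cycles > best)
            best = cycles;
    }
    return best;
}
```

As a usage example with placeholder masks (not figures quoted from any of the tables): take four micro-ops restricted to ports 1 and 5 (mask `0x22`) plus one micro-op that may use ports 0, 1, or 5 (mask `0x23`). Dividing the five micro-ops evenly over three ports would suggest about 1.67 cycles per iteration, but `port_bound` returns 2.0, because the four restricted micro-ops compete for only two ports. The real port assignments for a given microarchitecture should be taken from Agner Fog's tables or uops.info.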