c++ - What size should be aimed for when breaking up large data structures for processing? Specifically segment size of sieve
问题描述
Some large data structures have low locality of reference. This is bad for caching. I am implementing the Sieve of Eratosthenes. It contains a long list of numbers. It is possible to process the list in segments to improve the cache hit ratio. What size should the segment size be? I've heard L1 instruction set cache should work best. Based on my tests a larger size works better, though I'm basically just trying different sizes. Why would sizes larger than the L1 cache actually result in an overall faster execution time? Should the cache size be for a single core or all the cores? Other than CPU cache are there other possible values, like page size?
I know profiling would determine the best size, but I need to know which values to test in the first place.
In the Windows task manager it says the computer has 256KB L1 cache. But in CPU-Z it says L1 Data is "4x32 Kbytes". In my program the list of numbers is represented by a std::vector<unsigned char>
Since an unsigned-char
is ` byte, and I'm guessing Kbytes is actually kibibytes, so the segment size would be 32x1024. Because there's 2^10 bytes in a kibibyte. Is all this correct?
解决方案
推荐阅读
- javascript - @types/node/index.d.ts(6208,55): 错误 TS2304: 找不到名称 'Map' 'Set' 'Symbol' 'WeakMap' 'WeakSet'
- ios - 使用 UITabBarController 的持久视图
- forms - Symfony4 Forms - 你如何有条件地禁用表单域?
- python - 如何在 PyQGIS 中使用 cpt-city 目录中的颜色渐变
- python - Celery Beat 不适用于 Django
- javascript - Vuejs - 路由器不会在应用程序中呈现第三个组件。前两个正在工作
- angular - Angular 6 找不到 .angular-cli.json 并更改 outDir
- java - 一个实例的多个属性
- javascript - jQuery 编写用于添加和删除 css 类的 if 参数的最短方法
- java - 如何在没有不连贯状态的情况下同步吸气剂?