c - Does the -O0 compiler flag have the same effect as the volatile keyword in C?
Problem description
When you use the -O0 compiler flag in C, you tell the compiler to avoid any kind of optimization. When you define a variable as volatile, you tell the compiler to avoid optimizing that variable. Can the two approaches be used interchangeably? And if so, what are the pros and cons? Below are some that I can think of. Are there any more?
Pros:
- Using the -O0 flag is helpful if we have a big code base in which variables that should have been declared volatile are not. If the code shows buggy behavior, instead of going through the code to find which variables need to be declared volatile, we can just use the -O0 flag to rule out optimization as the cause.
Cons:
- The -O0 flag affects the entire code, while the volatile keyword only affects a specific variable. If we're working on a small microcontroller, for example, this could be a problem, since -O0 may produce a big executable.
Solution
The short answer is: the volatile keyword does not mean "do not optimize". It is something completely different. It informs the compiler that the variable may be changed by something that is not visible to the compiler in the normal program flow. For example:
- It can be changed by hardware, usually through registers mapped into the memory address space
- It can be changed by a function that is never called explicitly, for example an interrupt routine
- It can be changed by another process or by other hardware, for example shared memory in multiprocessor / multicore systems
A volatile variable has to be read from its storage location every time it is used, and written back every time it is changed.
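The "function that is never called explicitly" case can be sketched on a hosted system with a signal handler standing in for an interrupt routine. This is only an illustration under stated assumptions: the names event_pending, on_event and wait_for_event are hypothetical, and SIGINT is used simply because it is one of the signals standard C guarantees.

```c
#include <signal.h>

/* Hypothetical flag, written by the handler and read by normal code.
 * volatile sig_atomic_t is the type standard C allows a signal
 * handler to write to. */
static volatile sig_atomic_t event_pending = 0;

/* Stand-in for an interrupt routine: nothing in the normal program
 * flow calls this function directly. */
static void on_event(int signo)
{
    (void)signo;
    event_pending = 1;
}

/* Installs the handler, triggers it, and polls the flag.
 * Returns 1 once the flag has been seen, -1 on setup failure. */
int wait_for_event(void)
{
    event_pending = 0;
    if (signal(SIGINT, on_event) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;

    /* The flag is volatile, so this condition re-reads it from memory
     * on every iteration. Without volatile the compiler could read it
     * once and never notice the handler's write. In this demo raise()
     * has already run the handler, so the loop exits immediately. */
    while (!event_pending) {
        /* busy-wait */
    }

    signal(SIGINT, SIG_DFL);  /* restore the default disposition */
    return (int)event_pending;
}
```

Calling wait_for_event() from main should return 1. On a microcontroller the handler would be a real interrupt service routine, but the role of volatile on the flag is the same.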
Here is an example of the effect on the generated code:
int foo(volatile int z)
{
    return z + z + z + z;
}

int foo1(int z)
{
    return z + z + z + z;
}
and the resulting code (compiled with the -O0 option):
foo(int):
        push rbp
        mov rbp, rsp
        mov DWORD PTR [rbp-4], edi
        mov edx, DWORD PTR [rbp-4]
        mov eax, DWORD PTR [rbp-4]
        add edx, eax
        mov eax, DWORD PTR [rbp-4]
        add edx, eax
        mov eax, DWORD PTR [rbp-4]
        add eax, edx
        pop rbp
        ret
foo1(int):
        push rbp
        mov rbp, rsp
        mov DWORD PTR [rbp-4], edi
        mov eax, DWORD PTR [rbp-4]
        sal eax, 2
        pop rbp
        ret
The difference is obvious, I think: the volatile variable is read four times, while the non-volatile one is read once and then multiplied by 4 (the left shift by 2).
You can experiment with it yourself here: https://godbolt.org/g/RiTU4g
In most cases, if a program stops working when you turn on compiler optimization, you have some hidden undefined behavior (UB) in your code. You should debug for as long as needed to find all of it. A correctly written program must run at any optimization level.
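One common kind of hidden UB is signed integer overflow. A check written as a + b < a may appear to work at -O0 and then be removed by the optimizer, because the compiler is allowed to assume signed overflow never happens. A minimal sketch of a check that stays defined at every optimization level (checked_add is a hypothetical helper name, not something from the question):

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b without ever evaluating an overflowing expression.
 * Returns false and leaves *out untouched if the sum would not fit
 * in an int. The range test happens BEFORE the addition, so no
 * undefined behavior is involved. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

Neither -O0 nor volatile would have been a real fix for the a + b < a version. The code itself was wrong, and the optimizer only made that visible.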
Bear in mind that volatile does not mean or guarantee coherency or atomicity.
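For a counter shared between threads, something that does provide atomicity is needed, such as the C11 atomics in stdatomic.h. The sketch below assumes a platform with POSIX threads (it may need to be built with -pthread); the names shared_counter, worker and run_two_workers are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define INCREMENTS_PER_THREAD 100000

/* Declaring this as `volatile int` and using shared_counter++ would be
 * a data race: the increment is a separate read, add and write, and
 * updates from the two threads could be lost. */
static atomic_int shared_counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++)
        atomic_fetch_add(&shared_counter, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Runs two workers concurrently and returns the final count,
 * or -1 if a thread could not be created. */
int run_two_workers(void)
{
    pthread_t a, b;

    atomic_store(&shared_counter, 0);
    if (pthread_create(&a, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&shared_counter);
}
```

With the atomic counter the result is always 2 * INCREMENTS_PER_THREAD. With a plain volatile int it could come out lower, and the program would have undefined behavior.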