Why we need "volatile" in C

Intro to Compilers

Compilers are a critical piece of software: they bridge imagination and reality, letting developers run code on all sorts of hardware, from tiny embedded systems to massive data centers.

A compiler not only translates code, but also optimizes it — reordering, simplifying, and even removing instructions to improve performance.

While this is generally beneficial, it can sometimes clash with the programmer’s intent, especially in systems where precise control over hardware and memory is crucial (e.g., embedded systems).

This is where the volatile keyword comes into play: it tells the compiler that certain objects can change outside its view, so accesses to them must be left in place. That keeps low-level and interrupt-driven code behaving correctly.

The Setup

Here we have a simple piece of code where a flag, system_ready, is used to busy wait before starting the actual application code. This flag is accessed by other code files using extern to signal to the main program that it is ready to proceed.

#include "main.h"

uint8_t system_ready = 0;

int main() {
    /* System Init */

    // Busy wait until the flag is set and indicate by LED when ready
    while (system_ready == 0);
    HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

    while (1) {
        /* Looping user code */
    }
}

Compiled without optimization (-O0), the disassembly is what we would expect:

...

  while (system_ready == 0);
 800023a:	4b09      	ldr	r3, [pc, #36]	; (8000260 <main+0x38>)     <--- Load address of 'system_ready' to R3 <-----+
 800023c:	781b      	ldrb	r3, [r3, #0]                                <--- Load a byte from the address to R3         |
 800023e:	2b00      	cmp	r3, #0                                      <--- Compare R3 to 0                            |
 8000240:	d0fb      	beq.n	800023a <main+0x12>                         <--- If equal, loop back to address 800023a ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

...

 8000260:	20000028 	.word	0x20000028                                  <--- RAM address of 'system_ready' variable

...

Every time, we load the variable system_ready from memory, compare it to 0 and then either loop or break out based on the outcome of the comparison (as intended by the programmer).

But once we raise the optimization level to -O1 (or even -Os, a common choice for embedded builds), we see this instead:

...

  while (system_ready == 0);
 8000236:	4b0b      	ldr	r3, [pc, #44]	; (8000264 <main+0x40>)     <--- Load address of 'system_ready' to R3
 8000238:	781b      	ldrb	r3, [r3, #0]                                <--- Load a byte from the address to R3
 800023a:	2b00      	cmp	r3, #0                                      <--- Compare R3 to 0 <--------------------------+
 800023c:	d0fd      	beq.n	800023a <main+0x16>                         <--- If equal, loop back to address 800023a ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

...

 8000264:	20000028 	.word	0x20000028                                  <--- RAM address of 'system_ready' variable

...

Here, the variable is loaded from memory only once, before the loop. The branch at 800023c jumps back to the cmp at 800023a, so the same register value is compared to 0 over and over. Nothing in the loop body writes to system_ready, so the compiler assumes it cannot change and removes the repeated memory reads. As a result, if the variable is still zero at that initial load, the code is stuck in an infinite loop with no way out.
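In source terms, the optimized loop behaves roughly like the second function below. This is a hand-written sketch of the transformation, not compiler output, and the two function names are made up for illustration.

```c
#include <stdint.h>

uint8_t system_ready = 0;   /* same non-volatile flag as in the example */

/* What the programmer wrote: re-check the flag on every pass. */
void wait_as_written(void) {
    while (system_ready == 0) {
        /* spin */
    }
}

/* Roughly what the optimizer turned it into: one read, then a loop
 * that never looks at memory again. */
void wait_as_optimized(void) {
    uint8_t snapshot = system_ready;   /* the only load */
    if (snapshot == 0) {
        for (;;) {
            /* spin forever; 'snapshot' can no longer change */
        }
    }
}
```

Both functions return immediately if the flag is already set when they are called. They differ only when the flag is set later, which is exactly the case the busy wait was written for.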

Compilation results and outputs may vary depending on the compiler version, target architecture, optimization settings, and maybe even black magic.

This behavior would be entirely unexpected to the programmer, potentially resulting in hours — or even days — of unnecessary and most likely unproductive debugging.

“volatile” to the rescue

volatile is the magic word that tells the compiler to be careful with a variable during optimization. It says the value can change at any time, outside the normal program flow, so every read and write in the source must be performed as a real memory access: none may be cached in a register, merged, or removed.

Let’s make a small modification to the example above — we’ll add the volatile type qualifier to the system_ready variable, as shown below:

volatile uint8_t system_ready = 0;

This tells the compiler not to optimize access to the variable and to check the actual memory each time, rather than relying on a cached value. The resulting disassembly (compiled with -O3) is shown below:

...

  while (system_ready == 0);
 800023a:	4a0a      	ldr	r2, [pc, #40]	; (8000264 <main+0x3c>)     <--- Load address of 'system_ready' to R2
 800023c:	7813      	ldrb	r3, [r2, #0]                                <--- Load a byte from the address to R3 <-------+
 800023e:	2b00      	cmp	r3, #0                                      <--- Compare R3 to 0                            |
 8000240:	d0fc      	beq.n	800023c <main+0x14>                         <--- If equal, loop back to address 800023c ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

  ...

 8000264:	20000028 	.word	0x20000028                                  <--- RAM address of 'system_ready' variable

  ...

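Now the branch targets the ldrb at 800023c, so the byte is re-read from RAM on every iteration. Only the address load stays outside the loop, which is fine because the address never changes. And just like that, a single keyword saves us from hours of head-scratching. volatile to the rescue!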
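For completeness, here is a minimal sketch of the other side of the flag: the code that sets it. The original does not show that code, so the handler name Button_IRQHandler and the helper wait_until_ready are hypothetical. The point is only that the extern declaration in the other file must carry the same volatile qualifier as the definition.

```c
#include <stdint.h>

/* --- main.c: the definition --- */
volatile uint8_t system_ready = 0;

/* --- some other file (hypothetical): the declaration must also say volatile --- */
extern volatile uint8_t system_ready;

/* Hypothetical interrupt handler; the name is an assumption, not from the post. */
void Button_IRQHandler(void) {
    system_ready = 1;   /* a real store to RAM, visible to the polling loop */
}

/* The busy wait from main(), pulled into a helper for illustration. */
void wait_until_ready(void) {
    while (system_ready == 0) {
        /* each pass performs a fresh load of system_ready */
    }
}
```

A single byte-sized flag written in one place and polled in another is the simple case. Anything read-modify-write (counters, bit fields) needs more than volatile to be safe against interrupts.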

TL;DR

What does it do?

The volatile keyword tells the compiler not to optimize away reads / writes to a variable because its value might change in ways the compiler cannot see (e.g., due to hardware, an interrupt handler, or another thread).

Why do we need it?

At higher optimization levels, the compiler is allowed to assume that a variable it never sees modified keeps its value, so it may hoist or drop accesses, and the generated binary no longer does what the source appears to say.

When should you use it?

  • Busy wait loops / Polling hardware flags
  • Variable accessed / updated by ISRs or signal handlers
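  • Variable shared between threads / tasks / processes (volatile alone gives no atomicity or ordering, so pair it with atomics or the RTOS's synchronization primitives)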
  • Variable mapped to hardware (e.g., memory-mapped I/O, DMA-controlled buffers)
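The hardware-mapped case usually looks like a pointer to volatile rather than a volatile variable. Below is a minimal sketch of polling a status register. The register address, the bit position, and the function name wait_for_ready are all made up for illustration, and a plain variable stands in for the register so the code runs on any machine.

```c
#include <stdint.h>

/* On real hardware this would be a fixed address from the datasheet, e.g.
 *   #define STATUS_REG (*(volatile uint32_t *)0x40000000u)   // made-up address
 * Here a plain variable stands in for the register so the sketch runs anywhere. */
static uint32_t fake_status_reg = 0;
#define STATUS_REG (*(volatile uint32_t *)&fake_status_reg)

#define STATUS_READY_BIT (1u << 0)   /* hypothetical bit position */

/* Poll the register until the ready bit is set or we give up.
 * Returns 1 on success, 0 on timeout. */
int wait_for_ready(uint32_t max_polls) {
    while (max_polls-- > 0) {
        if (STATUS_REG & STATUS_READY_BIT) {   /* read through a volatile lvalue */
            return 1;
        }
    }
    return 0;
}
```

Bounding the number of polls is a design choice, not something volatile requires: it turns a silent hang into an error the caller can report.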

Until next time, may your pointers never be null.

This post is licensed under CC BY 4.0 by the author.