<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://blog.sagardevachar.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.sagardevachar.com/" rel="alternate" type="text/html" hreflang="en" /><updated>2025-10-08T08:05:13+05:30</updated><id>https://blog.sagardevachar.com/feed.xml</id><title type="html">Seemebadnekai</title><subtitle>A blog by SagarDevAchar</subtitle><entry><title type="html">Why we need “volatile” in C</title><link href="https://blog.sagardevachar.com/posts/need_for_volatile_in_c/" rel="alternate" type="text/html" title="Why we need “volatile” in C" /><published>2025-10-07T22:02:49+05:30</published><updated>2025-10-08T08:04:55+05:30</updated><id>https://blog.sagardevachar.com/posts/need_for_volatile_in_c</id><content type="html" xml:base="https://blog.sagardevachar.com/posts/need_for_volatile_in_c/"><![CDATA[<!-- ![volatile to the rescue](https://i.imgflip.com/a8fcsw.jpg) -->

<h2 id="intro-to-compilers">Intro to Compilers</h2>

<p>The compiler is a critical piece of software that acts as the very bridge between imagination and reality, enabling developers to run code on all sorts of hardware, from tiny embedded systems to massive data centers.</p>

<p>A compiler not only translates code, but also optimizes it — reordering, simplifying, and even removing instructions to improve performance.</p>

<p>While this is generally beneficial, it can sometimes clash with the programmer’s intent, especially in systems where precise control over hardware and memory is crucial (e.g., embedded systems).</p>

<p>This is where the <code class="language-plaintext highlighter-rouge">volatile</code> keyword comes into play: it tells the compiler that a variable can change outside the visible program flow, so that accesses to it survive optimization in interrupt-driven and low-level code.</p>

<h2 id="the-setup">The Setup</h2>

<p>Here we have a simple piece of code where a flag, <code class="language-plaintext highlighter-rouge">system_ready</code>, is used to busy wait before starting the actual application code. This flag is accessed by other code files using <code class="language-plaintext highlighter-rouge">extern</code> to signal to the main program that it is ready to proceed.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">"main.h"</span><span class="cp">
</span>
<span class="kt">uint8_t</span> <span class="n">system_ready</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="cm">/* System Init */</span>

    <span class="c1">// Busy wait until the flag is set and indicate by LED when ready</span>
    <span class="k">while</span> <span class="p">(</span><span class="n">system_ready</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
    <span class="n">HAL_GPIO_WritePin</span><span class="p">(</span><span class="n">LED_GPIO_Port</span><span class="p">,</span> <span class="n">LED_Pin</span><span class="p">,</span> <span class="n">GPIO_PIN_SET</span><span class="p">);</span>

    <span class="k">while</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="cm">/* Looping user code */</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Compiled without optimization, this is what we expect to see in the disassembly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>...

  while (system_ready == 0);
 800023a:	4b09      	ldr	r3, [pc, #36]	; (8000260 &lt;main+0x38&gt;)     &lt;--- Load address of 'system_ready' to R3 &lt;-----+
 800023c:	781b      	ldrb	r3, [r3, #0]                                &lt;--- Load a byte from the address to R3         |
 800023e:	2b00      	cmp	r3, #0                                      &lt;--- Compare R3 to 0                            |
 8000240:	d0fb      	beq.n	800023a &lt;main+0x12&gt;                         &lt;--- If equal, loop back to address 800023a ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

...

 8000260:	20000028 	.word	0x20000028                                  &lt;--- RAM address of 'system_ready' variable

...
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Every time, we load the variable <code class="language-plaintext highlighter-rouge">system_ready</code> from memory, compare it to 0 and then either loop or break out based on the outcome of the comparison (as intended by the programmer).</p>

<p>But once we raise the optimization level to <code class="language-plaintext highlighter-rouge">-O1</code> (or even <code class="language-plaintext highlighter-rouge">-Os</code>, a common default in embedded toolchains), we see this instead:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>...

  while (system_ready == 0);
 8000236:	4b0b      	ldr	r3, [pc, #44]	; (8000264 &lt;main+0x40&gt;)     &lt;--- Load address of 'system_ready' to R3
 8000238:	781b      	ldrb	r3, [r3, #0]                                &lt;--- Load a byte from the address to R3
 800023a:	2b00      	cmp	r3, #0                                      &lt;--- Compare R3 to 0 &lt;--------------------------+
 800023c:	d0fd      	beq.n	800023a &lt;main+0x16&gt;                         &lt;--- If equal, loop back to address 800023a ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

...

 8000264:	20000028 	.word	0x20000028                                  &lt;--- RAM address of 'system_ready' variable

...
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here, the variable is loaded <strong>only once</strong> from memory, before the loop, and the copy held in the register is then repeatedly compared to 0, kind of like caching the variable. Nothing inside the loop modifies the variable, so the compiler assumes it won’t change unexpectedly and optimizes the code by removing the repeated memory reads. As a result, if the variable hasn’t become non-zero <em>before</em> that initial load, the code gets stuck in an infinite loop, with no way to break out.</p>
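<p>At the source level, the difference between the two disassemblies can be sketched roughly as follows. This is only an illustration, not the compiler’s actual output: <code class="language-plaintext highlighter-rouge">wait_rereading</code>, <code class="language-plaintext highlighter-rouge">wait_hoisted</code>, <code class="language-plaintext highlighter-rouge">set_ready</code> and the hook type are hypothetical names, and the loops are bounded (unlike the original busy wait) so the sketch terminates on a host machine.</p>

```c
#include <stdbool.h>
#include <stdint.h>

uint8_t system_ready = 0; /* deliberately NOT volatile, as in the listing above */

/* Hypothetical stand-in for "something else" (an ISR, another file) changing the flag */
typedef void (*flag_hook_t)(void);

void set_ready(void) { system_ready = 1; }

/* Re-reads the flag on every pass, like the unoptimized disassembly */
bool wait_rereading(uint32_t max_spins, flag_hook_t hook) {
    for (uint32_t i = 0; i < max_spins; i++) {
        if (system_ready != 0) {
            return true;
        }
        hook();
    }
    return false;
}

/* Reads the flag once and keeps comparing the copy, like the optimized disassembly */
bool wait_hoisted(uint32_t max_spins, flag_hook_t hook) {
    uint8_t cached = system_ready; /* the single load */
    for (uint32_t i = 0; i < max_spins; i++) {
        if (cached != 0) {
            return true;
        }
        hook(); /* the flag changes in memory, but the copy is never refreshed */
    }
    return false;
}
```

<p>With <code class="language-plaintext highlighter-rouge">set_ready</code> passed as the hook, <code class="language-plaintext highlighter-rouge">wait_rereading</code> notices the change on its second pass, while <code class="language-plaintext highlighter-rouge">wait_hoisted</code> keeps spinning on the stale copy until it runs out of attempts.</p>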

<blockquote class="prompt-info">
  <p>Compilation results and outputs may vary depending on the compiler version, target architecture, optimization settings, and maybe even black magic.</p>
</blockquote>

<p>This behavior would be entirely unexpected to the programmer, potentially resulting in hours — or even days — of unnecessary and most likely unproductive debugging.</p>

<h2 id="volatile-to-the-rescue">“volatile” to the rescue</h2>

<p><code class="language-plaintext highlighter-rouge">volatile</code> is the magic word that instructs the compiler to be mindful of the variable during optimization. It informs the compiler that its value <strong>can change at any time</strong>, outside the normal program flow, and hence that every read and write in the source must actually be performed: accesses to that variable are <strong>not to be cached, merged, or removed</strong>.</p>

<p>Let’s make a small modification to the example above — we’ll add the <code class="language-plaintext highlighter-rouge">volatile</code> type qualifier to the <code class="language-plaintext highlighter-rouge">system_ready</code> variable, as shown below:</p>

<div class="language-c nolineno highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="k">volatile</span> <span class="kt">uint8_t</span> <span class="n">system_ready</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This tells the compiler <strong>not</strong> to optimize access to the variable and to <strong>check the actual memory each time</strong>, rather than relying on a cached value. The resulting disassembly (compiled with <code class="language-plaintext highlighter-rouge">-O3</code>) is shown below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>...

  while (system_ready == 0);
 800023a:	4a0a      	ldr	r2, [pc, #40]	; (8000264 &lt;main+0x3c&gt;)     &lt;--- Load address of 'system_ready' to R2
 800023c:	7813      	ldrb	r3, [r2, #0]                                &lt;--- Load a byte from the address to R3 &lt;-------+
 800023e:	2b00      	cmp	r3, #0                                      &lt;--- Compare R3 to 0                            |
 8000240:	d0fc      	beq.n	800023c &lt;main+0x14&gt;                         &lt;--- If equal, loop back to address 800023c ----+
  HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);

  ...

 8000264:	20000028 	.word	0x20000028                                  &lt;--- RAM address of 'system_ready' variable

  ...
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And just like that, a single keyword saves us from hours of head-scratching. <code class="language-plaintext highlighter-rouge">volatile</code> to the rescue!</p>
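<p>For completeness, here is a minimal sketch of both sides of the flag in one place. It assumes the flag is set from an interrupt handler; the name <code class="language-plaintext highlighter-rouge">init_done_handler</code> and the bounded <code class="language-plaintext highlighter-rouge">wait_for_ready</code> helper are hypothetical and only there so the snippet can also run on a host. Note that the <code class="language-plaintext highlighter-rouge">extern</code> declaration used by other files should carry the same <code class="language-plaintext highlighter-rouge">volatile</code> qualifier as the definition.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* What other files would declare: the qualifier has to match the definition */
extern volatile uint8_t system_ready;

/* The definition, as in the modified example */
volatile uint8_t system_ready = 0;

/* Hypothetical handler name; on a real target this would be wired to an interrupt */
void init_done_handler(void) {
    system_ready = 1;
}

/* Bounded variant of the busy wait: gives up after max_spins reads of the flag */
bool wait_for_ready(uint32_t max_spins) {
    while (max_spins-- > 0) {
        if (system_ready != 0) { /* volatile: re-read from memory on every pass */
            return true;
        }
    }
    return false;
}
```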

<h2 id="tldr">TL;DR</h2>

<h3 id="what-it-does">What does it do?</h3>

<p>The <code class="language-plaintext highlighter-rouge">volatile</code> keyword tells the compiler not to optimize away reads / writes to a variable because its value might change unexpectedly (e.g., due to hardware or an interrupt handler). It does not make those accesses atomic, so it is not a replacement for synchronization between threads.</p>
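<p>As a rough illustration of the hardware case, the sketch below polls a status register through a pointer to <code class="language-plaintext highlighter-rouge">volatile</code>. The register layout, <code class="language-plaintext highlighter-rouge">STATUS_READY_BIT</code> and <code class="language-plaintext highlighter-rouge">wait_until_ready</code> are assumptions made up for this example and do not come from any particular device or HAL.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "ready" bit in a hypothetical status register */
#define STATUS_READY_BIT (1u << 0)

/* Polls the register up to max_polls times; the volatile-qualified pointer
 * makes each pass perform a real read of the register. */
bool wait_until_ready(volatile const uint32_t *status_reg, uint32_t max_polls) {
    while (max_polls-- > 0) {
        if ((*status_reg & STATUS_READY_BIT) != 0) {
            return true;
        }
    }
    return false;
}
```

<p>On a real target the pointer would be the register’s address taken from the device header; on a host, an ordinary <code class="language-plaintext highlighter-rouge">uint32_t</code> variable can stand in for the register.</p>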

<h3 id="why-do-we-need-it">Why do we need it?</h3>

<p>At higher optimization levels, the compiler may assume that a variable the visible code never modifies stays unchanged and keep reusing a stale copy held in a register, resulting in incorrect logic in the generated binary.</p>

<h3 id="when-to-use-it">When to use it?</h3>

<ul>
  <li>Busy wait loops / Polling hardware flags</li>
  <li>Variable accessed / updated by ISRs or signal handlers</li>
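  <li>Variable shared between threads / tasks / processes (here <code class="language-plaintext highlighter-rouge">volatile</code> only keeps the accesses in the generated code; it provides no atomicity or ordering, so pair it with proper synchronization such as C11 atomics or a mutex)</li>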
  <li>Variable mapped to hardware (e.g., memory-mapped I/O, DMA-controlled buffers)</li>
</ul>
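<p>For the thread-shared case in the list above, <code class="language-plaintext highlighter-rouge">volatile</code> on its own gives no atomicity or ordering guarantee. A minimal sketch of the usual alternative, assuming a C11 toolchain that provides <code class="language-plaintext highlighter-rouge">&lt;stdatomic.h&gt;</code>, is shown below; <code class="language-plaintext highlighter-rouge">data_ready</code>, <code class="language-plaintext highlighter-rouge">shared_payload</code>, <code class="language-plaintext highlighter-rouge">producer_publish</code> and <code class="language-plaintext highlighter-rouge">consumer_try_read</code> are hypothetical names.</p>

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag and payload shared between two threads.
 * Static storage duration: the flag starts out false. */
atomic_bool data_ready;
int shared_payload = 0;

/* Producer: write the payload first, then publish it with a release store */
void producer_publish(int value) {
    shared_payload = value;
    atomic_store_explicit(&data_ready, true, memory_order_release);
}

/* Consumer: the acquire load pairs with the release store above, so once the
 * flag reads true the payload written before it is visible as well. */
bool consumer_try_read(int *out) {
    if (!atomic_load_explicit(&data_ready, memory_order_acquire)) {
        return false;
    }
    *out = shared_payload;
    return true;
}
```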

<hr />

<p>Until next time, may your pointers never be null.</p>]]></content><author><name></name></author><category term="technical" /><category term="embedded" /><category term="volatile" /><category term="compiler" /><category term="optimization" /><category term="debugging" /><category term="c" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.sagardevachar.com/assets/img/static/2025-10-07-volatile_meme.png" /><media:content medium="image" url="https://blog.sagardevachar.com/assets/img/static/2025-10-07-volatile_meme.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>