<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Critical Code Gen Bug: Silent Data Corruption in Vectorized Modulo Loops (-O2) in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Critical-Code-Gen-Bug-Silent-Data-Corruption-in-Vectorized/m-p/1736915#M4679</link>
    <description>&lt;P&gt;The issue is known, and it will be fixed in the next release.&lt;/P&gt;
&lt;P&gt;$ icx -V&lt;BR /&gt;Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2025.3.2 Build 20260112&lt;/P&gt;
&lt;P&gt;$ icx -O2 vec-bug.c &amp;amp;&amp;amp; ./a.out&lt;BR /&gt;Index 4: Expected 3, Got 2&lt;BR /&gt;Index 8: Expected 3, Got 2&lt;BR /&gt;Index 12: Expected 3, Got 2&lt;BR /&gt;Index 16: Expected 3, Got 2&lt;BR /&gt;Total Failures: 4&lt;/P&gt;
&lt;P&gt;$ icx vec-bug.c -O2 &amp;amp;&amp;amp; ./a.out&lt;BR /&gt;$ icx -V&lt;BR /&gt;Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Upcoming Release.&lt;/P&gt;</description>
    <pubDate>Fri, 13 Feb 2026 00:43:36 GMT</pubDate>
    <dc:creator>Viet_H_Intel</dc:creator>
    <dc:date>2026-02-13T00:43:36Z</dc:date>
    <item>
      <title>Critical Code Gen Bug: Silent Data Corruption in Vectorized Modulo Loops (-O2)</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Critical-Code-Gen-Bug-Silent-Data-Corruption-in-Vectorized/m-p/1736172#M4672</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Severity:&lt;/STRONG&gt; High (Silent Data Corruption)&lt;BR /&gt;&lt;STRONG&gt;Component:&lt;/STRONG&gt; Loop Vectorizer / Code Generation&lt;BR /&gt;&lt;STRONG&gt;Compiler:&lt;/STRONG&gt; Intel(R) oneAPI DPC++/C++ Compiler (ICX)&lt;BR /&gt;&lt;STRONG&gt;Flags:&lt;/STRONG&gt; -O2 (Reproduces at -O2, -O3, and with -xCORE-AVX512, -xCORE-AVX2, or -xAVX)&lt;/P&gt;&lt;H4&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&lt;/H4&gt;&lt;P&gt;The ICX compiler generates logically incorrect code when vectorizing loops that initialize arrays using modulo operations with power-of-2 divisors (e.g., i % 4). This results in silent data corruption where specific elements in the sequence are written with the wrong values.&lt;/P&gt;&lt;P&gt;The issue persists even when the loop bound is a variable, indicating a fundamental flaw in the vectorizer's pattern generation logic, not just a constant-folding error.&lt;/P&gt;&lt;H4&gt;&lt;STRONG&gt;Reproduction Code (Variable Size)&lt;/STRONG&gt;&lt;/H4&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;C&lt;/SPAN&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;PRE&gt;&lt;SPAN class=""&gt;#&lt;SPAN class=""&gt;include&lt;/SPAN&gt; &lt;SPAN class=""&gt;&amp;lt;immintrin.h&amp;gt;&lt;/SPAN&gt;&lt;/SPAN&gt;
&lt;SPAN class=""&gt;#&lt;SPAN class=""&gt;include&lt;/SPAN&gt; &lt;SPAN class=""&gt;&amp;lt;stdint.h&amp;gt;&lt;/SPAN&gt;&lt;/SPAN&gt;
&lt;SPAN class=""&gt;#&lt;SPAN class=""&gt;include&lt;/SPAN&gt; &lt;SPAN class=""&gt;&amp;lt;stdio.h&amp;gt;&lt;/SPAN&gt;&lt;/SPAN&gt;

&lt;SPAN class=""&gt;// Bug triggers even with variable 'size' parameter&lt;/SPAN&gt;
&lt;SPAN class=""&gt;&lt;SPAN class=""&gt;void&lt;/SPAN&gt; &lt;SPAN class=""&gt;init_arr&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;SPAN class=""&gt;int16_t&lt;/SPAN&gt; * a_buf, &lt;SPAN class=""&gt;int&lt;/SPAN&gt; size)&lt;/SPAN&gt;
&lt;/SPAN&gt;{
    &lt;SPAN class=""&gt;for&lt;/SPAN&gt; (&lt;SPAN class=""&gt;int&lt;/SPAN&gt; i = &lt;SPAN class=""&gt;0&lt;/SPAN&gt;; i &amp;lt; size; i++) {
        &lt;SPAN class=""&gt;// Pattern: 3, 2, 2, 2, 3, 2, 2, 2...&lt;/SPAN&gt;
        &lt;SPAN class=""&gt;if&lt;/SPAN&gt; ((i % &lt;SPAN class=""&gt;4&lt;/SPAN&gt;) == &lt;SPAN class=""&gt;0&lt;/SPAN&gt;) {
            a_buf[i] = &lt;SPAN class=""&gt;3&lt;/SPAN&gt;;
        } &lt;SPAN class=""&gt;else&lt;/SPAN&gt; {
            a_buf[i] = &lt;SPAN class=""&gt;2&lt;/SPAN&gt;;
        }
    }
}

&lt;SPAN class=""&gt;&lt;SPAN class=""&gt;int&lt;/SPAN&gt; &lt;SPAN class=""&gt;main&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;SPAN class=""&gt;void&lt;/SPAN&gt;)&lt;/SPAN&gt;
&lt;/SPAN&gt;{
    &lt;SPAN class=""&gt;// Test with size 17&lt;/SPAN&gt;
    &lt;SPAN class=""&gt;int&lt;/SPAN&gt; size = &lt;SPAN class=""&gt;17&lt;/SPAN&gt;;
    &lt;SPAN class=""&gt;int16_t&lt;/SPAN&gt; * a_buf = (&lt;SPAN class=""&gt;int16_t&lt;/SPAN&gt; *)_mm_malloc (size * &lt;SPAN class=""&gt;sizeof&lt;/SPAN&gt; (&lt;SPAN class=""&gt;int16_t&lt;/SPAN&gt;), &lt;SPAN class=""&gt;64&lt;/SPAN&gt;);

    init_arr(a_buf, size);

    &lt;SPAN class=""&gt;// Verification&lt;/SPAN&gt;
    &lt;SPAN class=""&gt;int&lt;/SPAN&gt; failure_count = &lt;SPAN class=""&gt;0&lt;/SPAN&gt;;
    &lt;SPAN class=""&gt;for&lt;/SPAN&gt; (&lt;SPAN class=""&gt;int&lt;/SPAN&gt; i = &lt;SPAN class=""&gt;0&lt;/SPAN&gt;; i &amp;lt; size; i++)
    {
        &lt;SPAN class=""&gt;int16_t&lt;/SPAN&gt; expected = ((i % &lt;SPAN class=""&gt;4&lt;/SPAN&gt;) == &lt;SPAN class=""&gt;0&lt;/SPAN&gt;) ? &lt;SPAN class=""&gt;3&lt;/SPAN&gt; : &lt;SPAN class=""&gt;2&lt;/SPAN&gt;;
        &lt;SPAN class=""&gt;if&lt;/SPAN&gt; (a_buf[i] != expected) {
            &lt;SPAN class=""&gt;printf&lt;/SPAN&gt;(&lt;SPAN class=""&gt;"Index %d: Expected %d, Got %d\n"&lt;/SPAN&gt;, i, expected, a_buf[i]);
            failure_count++;
        }
    }

    &lt;SPAN class=""&gt;if&lt;/SPAN&gt; (failure_count &amp;gt; &lt;SPAN class=""&gt;0&lt;/SPAN&gt;) &lt;SPAN class=""&gt;printf&lt;/SPAN&gt;(&lt;SPAN class=""&gt;"Total Failures: %d\n"&lt;/SPAN&gt;, failure_count);
    
    _mm_free (a_buf);
    &lt;SPAN class=""&gt;return&lt;/SPAN&gt; (failure_count == &lt;SPAN class=""&gt;0&lt;/SPAN&gt;) ? &lt;SPAN class=""&gt;0&lt;/SPAN&gt; : &lt;SPAN class=""&gt;1&lt;/SPAN&gt;;
}&lt;/PRE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;H4&gt;&lt;STRONG&gt;Observed Behavior&lt;/STRONG&gt;&lt;/H4&gt;&lt;P&gt;When compiled with -O2, the code fails to write the value 3 at indices 4, 8, 12, and 16; it writes 2 instead.&lt;/P&gt;&lt;H4&gt;&lt;STRONG&gt;Disassembly Analysis (Proof of Logic Error)&lt;/STRONG&gt;&lt;/H4&gt;&lt;P&gt;The generated assembly for -O2 (SSE/AVX) shows that the compiler explicitly hardcodes the wrong values. For the tail case at index 16 (where size=17), the compiler emits a scalar store of 2 instead of 3.&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;Code snippet&lt;/SPAN&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;PRE&gt;# Disassembly of init_arr (AT&amp;amp;T Syntax)
...
# Vector stores (filling the array with incorrect patterns)
movups %xmm0, (%rdi)
movups %xmm0, 0x10(%rdi)

# CRITICAL ERROR:
# At byte offset 0x20 (index 16, since each int16_t element is 2 bytes),
# the compiler hardcodes the immediate value 2.
# Since 16 % 4 == 0, this instruction SHOULD be writing 3.
movw   $0x2, 0x20(%rdi)  # &amp;lt;-- Logic Error
...&lt;/PRE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;H4&gt;&lt;STRONG&gt;Workarounds&lt;/STRONG&gt;&lt;/H4&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Disable Vectorization:&lt;/STRONG&gt; Place #pragma novector immediately before the loop.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Volatile Divisor:&lt;/STRONG&gt; Making the divisor (e.g., 4) a volatile variable breaks the pattern recognition optimization.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Non-Power-of-2:&lt;/STRONG&gt; Changing the modulo to % 3 or % 5 produces correct code, though it also changes the pattern the loop writes.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Fri, 06 Feb 2026 15:17:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Critical-Code-Gen-Bug-Silent-Data-Corruption-in-Vectorized/m-p/1736172#M4672</guid>
      <dc:creator>HavardGraff</dc:creator>
      <dc:date>2026-02-06T15:17:04Z</dc:date>
    </item>
    <item>
      <title>Re: Critical Code Gen Bug: Silent Data Corruption in Vectorized Modulo Loops (-O2)</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Critical-Code-Gen-Bug-Silent-Data-Corruption-in-Vectorized/m-p/1736915#M4679</link>
      <description>&lt;P&gt;The issue is known, and it will be fixed in the next release.&lt;/P&gt;
&lt;P&gt;$ icx -V&lt;BR /&gt;Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2025.3.2 Build 20260112&lt;/P&gt;
&lt;P&gt;$ icx -O2 vec-bug.c &amp;amp;&amp;amp; ./a.out&lt;BR /&gt;Index 4: Expected 3, Got 2&lt;BR /&gt;Index 8: Expected 3, Got 2&lt;BR /&gt;Index 12: Expected 3, Got 2&lt;BR /&gt;Index 16: Expected 3, Got 2&lt;BR /&gt;Total Failures: 4&lt;/P&gt;
&lt;P&gt;$ icx vec-bug.c -O2 &amp;amp;&amp;amp; ./a.out&lt;BR /&gt;$ icx -V&lt;BR /&gt;Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Upcoming Release.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Feb 2026 00:43:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Critical-Code-Gen-Bug-Silent-Data-Corruption-in-Vectorized/m-p/1736915#M4679</guid>
      <dc:creator>Viet_H_Intel</dc:creator>
      <dc:date>2026-02-13T00:43:36Z</dc:date>
    </item>
  </channel>
</rss>

