I used ICC (icc version 14.0.3 (gcc version 4.8.1 compatibility)) on OS X 10.9.4 to compile the following code:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LOOPS 100
#define TEST_LEN (2*1024*1024)

static uint32_t g_Crc32Table[256];

/* Build the byte-wise lookup table for the reflected polynomial 0x82F63B78. */
void GetCrc32Table()
{
    for (int i = 0; i < 256; i++) {
        uint32_t Crc = i;
        for (int j = 0; j < 8; j++) {
            if (Crc & 1) {
                Crc = (Crc >> 1) ^ 0x82F63B78;
            } else {
                Crc >>= 1;
            }
        }
        g_Crc32Table[i] = Crc;
    }
}

/* Table-driven CRC over len bytes of InStr. */
uint32_t GetCrc32(const uint8_t* InStr, size_t len)
{
    uint32_t crc = 0xffffffff;
    for (size_t i = 0; i < len; i++) {
        uint32_t tmp = InStr[i];
        crc = (crc >> 8) ^ g_Crc32Table[(crc ^ tmp) & 0xFF];
    }
    return crc ^ 0xBAADBEEF;
}

int main(int argc, char** argv)
{
    GetCrc32Table();
    srand(1234);

    /* Fill the test buffer with pseudo-random bytes. */
    uint8_t *b = (uint8_t*) malloc(TEST_LEN);
    for (int i = 0; i < TEST_LEN; i++)
        b[i] = (uint8_t) rand();

    /* Every iteration computes the same CRC over the same buffer. */
    uint32_t crc32 = 0;
    for (int i = 0; i < LOOPS; i++) {
        crc32 = GetCrc32(b, TEST_LEN);
    }
    printf("%x\n", (uint32_t) crc32);
    free(b);
}
with the following compile command line:
/opt/intel/bin/icc -O3 -std=c99 -o t t.c
The compilation succeeded and it gave the result:
47b54328
As you can see, my code simply computes the CRC32 of a generated buffer 100 times (as defined by LOOPS), so the result should not depend on the loop count. But after I changed LOOPS to 99 and used the same compile command line, it gave a different result:
f8521437
Something must be wrong. So I changed the command line to use -O2 and set LOOPS back to 100:
/opt/intel/bin/icc -O2 -std=c99 -o t t.c
This time it gave the correct result (f8521437). I verified this with clang (version 5.1, which came with Xcode 5):
clang -O3 -std=c99 -o t t.c
I did some investigation with a disassembler.
When LOOPS=100 (the result is incorrect), ICC generates vectorized SSE2 code using the XMM registers, as below:
        mov     edx, 0FFFFFFFFh
        xor     cl, cl
        mov     rax, 0FFFFFFFFh
        movd    xmm0, edx
        pshufd  xmm0, xmm0, 0
loc_100000C45:                          ; CODE XREF: _main+2B6
        mov     r9d, edx
        xor     r8d, r8d
        movdqa  xmm1, xmm0
        mov     rsi, rax
loc_100000C52:                          ; CODE XREF: _main+2AE
        movzx   r10d, byte ptr [r8+r15]
        xor     rsi, r10
        movzx   esi, sil
        inc     r8
        shr     r9d, 8
        psrld   xmm1, 8
        xor     r9d, [r14+rsi*4]
        mov     esi, r9d
        xor     r10, rsi
        movzx   r11d, r10b
        cmp     r8, 200000h
        movd    xmm2, dword ptr [r14+r11*4]
        pshufd  xmm3, xmm2, 0
        pxor    xmm1, xmm3
        jb      short loc_100000C52
        add     cl, 4
        cmp     cl, 64h
        jb      short loc_100000C45
        psrldq  xmm1, 0Ch
        lea     rdi, asc_100004CE0      ; "%x\n"
        movd    esi, xmm1
        xor     eax, eax
        xor     esi, 0BAADBEEFh
        call    _printf
But with LOOPS=99, ICC generates ordinary scalar x86_64 code (even when I add the -xHost option):
        xor     dl, dl
        mov     eax, 0FFFFFFFFh
loc_100000C52:                          ; CODE XREF: _main+295
        mov     esi, eax
        xor     ecx, ecx
loc_100000C56:                          ; CODE XREF: _main+28E
        mov     r10d, esi
        movzx   r8d, byte ptr [r15+rcx*2]
        xor     rsi, r8
        movzx   r9d, sil
        shr     r10d, 8
        movzx   r11d, byte ptr [r15+rcx*2+1]
        inc     rcx
        xor     r10d, [rbx+r9*4]
        mov     esi, r10d
        xor     r10, r11
        movzx   r8d, r10b
        shr     esi, 8
        xor     esi, [rbx+r8*4]
        cmp     rcx, 100000h
        jb      short loc_100000C56
        inc     dl
        cmp     dl, 3
        jb      short loc_100000C52
        xor     esi, 0BAADBEEFh
        lea     rdi, asc_100004CE0      ; "%x\n"
        xor     eax, eax
        call    _printf
Hi,
I tested your code and see the same issue when I use the same command line you used.
However, the error does not show up if I compile with -O2 instead of -O3, or if I use the 15.0 beta compiler. Can you try one of those options?
Thanks,
Alex
I mentioned in the first post that the result is correct if I use the -O2 option.
I am not in the 15.0 beta programme. :(
It would be great if you could let me know whether this is a known bug and, if not, whether the cause has been located.
Thank you.
