Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

icpc dumps core using Intel supplied sample code

dkokron
Beginner

I started playing with the code available from https://software.intel.com/en-us/articles/benefits-of-intel-avx-for-small-matrices. I got Determinant4x4Matrices.cpp to compile by replacing <gmmintrin.h> with <immintrin.h>. The code runs fine when compiled with g++, but dumps core when compiled with icpc. I see the same behaviour under Fedora 21 and SLES11sp3.

uname -a
Linux XXXXXX 4.1.13-100.fc21.x86_64 #1 SMP Tue Nov 10 13:13:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
g++ --version
g++ (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)

g++ -mavx Determinant4x4Matrices.cpp
 ./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant

icpc -V
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.1.150 Build 20151021

icpc -xAVX Determinant4x4Matrices.cpp
./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
Segmentation fault (core dumped)

GDB reports:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000003a40483a18 in _int_free (have_lock=0, p=<optimized out>, av=0x3a407b7cc0 <main_arena>) at malloc.c:3990
3990        unlink(av, nextchunk, bck, fwd);
(gdb) where
#0  0x0000003a40483a18 in _int_free (have_lock=0, p=<optimized out>, av=0x3a407b7cc0 <main_arena>) at malloc.c:3990
#1  __GI___libc_free (mem=<optimized out>) at malloc.c:2951
#2  0x00000000004013ea in DeAllocateBuffers () at Determinant4x4Matrices.cpp:116
#3  0x00000000004048ff in main (argc=1, argp=0x7ffd0b400868) at Determinant4x4Matrices.cpp:627

 

5 Replies
Light_Intel
Moderator

The code sample you are using was posted a few years ago but should be sound. I will try to reproduce the behavior on my side and see if I run into the same issue.

Thank you for flagging it.

Noga

Intel Developer Support

Light_Intel
Moderator

To make sure I am using exactly the same code you are using, can you please attach it to this thread?

Thanks,

Noga

dkokron
Beginner

Attaching my version of the Determinant4x4Matrices.cpp code.

Light_Intel
Moderator

I was able to reproduce your issue on my side using the attached code sample. It seems to be an issue with the Intel Linux* compiler. The same code ran fine with the Intel C++ compiler on Windows*.

I have escalated the issue to our engineers as DPD200379563.

Thank you for reporting it.

Regards,

Noga

Light_Intel
Moderator

This issue has been marked as 'not a defect'. Please see the engineer's comments below.

It appears to be an issue with the test itself.

I just expanded the memory allocated for OutputBuffer and the test started to pass.

bash-4.2$  diff Determinant4x4Matrices.cpp Determinant4x4Matrices_.cpp
4c4,6
< #include <gmmintrin.h>
---
> #include <immintrin.h>
> #include <stdlib.h>
>
106c108
<     OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes, AlignmentBytes); //should be 32bytes alignment for AVX
---
>     OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes+48, AlignmentBytes); //should be 32bytes alignment for AVX
bash-4.2$

Some details.
As I noted earlier, the test case always fails with GCC 5.3.0, so I will show the issue using this version. With ICC the failure is intermittent.

bash-4.2$ g++ --version
g++ (GCC) 5.3.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

bash-4.2$ cat Determinant4x4Matrices_.cpp | grep Out | grep mal
    OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes, AlignmentBytes); //should be 32bytes alignment for AVX

bash-4.2$ g++ -mavx Determinant4x4Matrices_.cpp

bash-4.2$ ./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
*** Error in `./a.out': free(): invalid next size (fast): 0x0000000001d3ee80 ***
...
Aborted (core dumped)

valgrind reports an invalid write in the sse_determinant() function.

bash-4.2$ valgrind --leak-check=yes ./a.out
==4429== Memcheck, a memory error detector
==4429== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4429== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4429== Command: ./a.out
==4429==
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
==4429== Invalid write of size 8
==4429==    at 0x4024BB: sse_determinant() (in /nfs/ins/proj/icl/lrb/users/agrische/tests/379563/a.out)

It is related to the call to the _mm_store intrinsic at the end of the function.

If I expand the memory allocated for OutputBuffer, the error goes away.
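
For readers following along, here is a minimal sketch of how a full-width vector store can run past the end of an output buffer, and how to size the buffer so it cannot. The thread does not show how the sample indexes OutputBuffer, so the names floats_touched, round_up and run_stores, as well as the stride and lane counts, are illustrative assumptions and not taken from Determinant4x4Matrices.cpp. Plain loops stand in for the store intrinsic to keep the sketch portable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Intel sample): number of floats a buffer
// must hold when `iterations` stores of `lanes` floats each are issued and
// consecutive stores start `stride` floats apart.
constexpr std::size_t floats_touched(std::size_t iterations,
                                     std::size_t stride,
                                     std::size_t lanes) {
    return iterations == 0 ? 0 : (iterations - 1) * stride + lanes;
}

// Round a byte count up to the next multiple of `alignment`, which is handy
// when the allocator expects sizes in whole aligned units.
constexpr std::size_t round_up(std::size_t bytes, std::size_t alignment) {
    return (bytes + alignment - 1) / alignment * alignment;
}

// Simulate the stores with bounds-checked writes. Each "store" writes `lanes`
// floats starting at element i * stride, the way a full-width vector store
// would. std::vector::at throws if a write falls outside the buffer.
inline std::vector<float> run_stores(std::size_t iterations,
                                     std::size_t stride,
                                     std::size_t lanes) {
    assert(lanes > 0);
    std::vector<float> out(floats_touched(iterations, stride, lanes), 0.0f);
    for (std::size_t i = 0; i < iterations; ++i) {
        for (std::size_t lane = 0; lane < lanes; ++lane) {
            out.at(i * stride + lane) = static_cast<float>(i);
        }
    }
    return out;
}
```

With 4 iterations, a stride of 1 and 4 lanes, floats_touched returns 7, so a buffer sized for only 4 results would be 3 floats (12 bytes) too small. Overruns of that size land in the allocator's bookkeeping for the next chunk, which is consistent with a crash that only shows up later inside free().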

bash-4.2$ !ca
cat Determinant4x4Matrices_.cpp | grep Out | grep mal
    OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes+48, AlignmentBytes); //should be 32bytes alignment for AVX

bash-4.2$ g++ -mavx Determinant4x4Matrices_.cpp

bash-4.2$ ./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant

bash-4.2$ valgrind --leak-check=yes ./a.out
==4469== Memcheck, a memory error detector
==4469== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4469== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4469== Command: ./a.out
==4469==
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
==4469==
==4469== HEAP SUMMARY:
==4469==     in use at exit: 72,704 bytes in 1 blocks
==4469==   total heap usage: 44 allocs, 43 frees, 74,128 bytes allocated
==4469==
==4469== LEAK SUMMARY:
==4469==    definitely lost: 0 bytes in 0 blocks
==4469==    indirectly lost: 0 bytes in 0 blocks
==4469==      possibly lost: 0 bytes in 0 blocks
==4469==    still reachable: 72,704 bytes in 1 blocks
==4469==         suppressed: 0 bytes in 0 blocks
==4469== Reachable blocks (those to which a pointer was found) are not shown.
==4469== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==4469==
==4469== For counts of detected and suppressed errors, rerun with: -v
==4469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
bash-4.2$

 
