I started playing with the code samples available from https://software.intel.com/en-us/articles/benefits-of-intel-avx-for-small-matrices. I got the Determinant4x4Matrices.cpp code to compile by replacing <gmmintrin.h> with <immintrin.h>. The code runs fine when compiled with g++, but dumps core when compiled with icpc. I see the same behaviour under Fedora 21 and SLES 11 SP3.
uname -a
Linux XXXXXX 4.1.13-100.fc21.x86_64 #1 SMP Tue Nov 10 13:13:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
g++ --version
g++ (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
g++ -mavx Determinant4x4Matrices.cpp
./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
icpc -V
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.1.150 Build 20151021
icpc -xAVX Determinant4x4Matrices.cpp
./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
Segmentation fault (core dumped)
GDB says:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000003a40483a18 in _int_free (have_lock=0, p=<optimized out>, av=0x3a407b7cc0 <main_arena>) at malloc.c:3990
3990 unlink(av, nextchunk, bck, fwd);
(gdb) where
#0 0x0000003a40483a18 in _int_free (have_lock=0, p=<optimized out>, av=0x3a407b7cc0 <main_arena>) at malloc.c:3990
#1 __GI___libc_free (mem=<optimized out>) at malloc.c:2951
#2 0x00000000004013ea in DeAllocateBuffers () at Determinant4x4Matrices.cpp:116
#3 0x00000000004048ff in main (argc=1, argp=0x7ffd0b400868) at Determinant4x4Matrices.cpp:627
The code sample you are using was posted a few years ago but should be sound. I will try to duplicate the behavior on my side and see whether I run into the same issue.
Thank you for flagging it.
Noga
Intel Developer Support
To make sure I am using exactly the same code you are using, can you please attach it to this thread?
Thanks,
Noga
I was able to duplicate your issue on my side using the attached code sample. It seems to be an issue with the Intel C++ compiler for Linux*. The same code ran fine with the Intel C++ compiler on Windows*.
I have escalated the issue to our engineers as DPD200379563.
Thank you for reporting it.
Regards,
Noga
This issue has been marked as 'not a defect'. Please see the comments below.
It appears to be an issue in the test itself.
I just expanded the memory size allocated for OutputBuffer and the test started to pass.
bash-4.2$ diff Determinant4x4Matrices.cpp Determinant4x4Matrices_.cpp
4c4,6
< #include <gmmintrin.h>
---
> #include <immintrin.h>
> #include <stdlib.h>
>
106c108
< OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes, AlignmentBytes); //should be 32bytes alignment for AVX
---
> OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes+48, AlignmentBytes); //should be 32bytes alignment for AVX
bash-4.2$
Some details.
As I noted earlier, the test case always fails with GCC 5.3.0, so I will show the issue using this version. With ICC the failure is intermittent.
bash-4.2$ g++ --version
g++ (GCC) 5.3.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
bash-4.2$ cat Determinant4x4Matrices_.cpp | grep Out | grep mal
OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes, AlignmentBytes); //should be 32bytes alignment for AVX
bash-4.2$ g++ -mavx Determinant4x4Matrices_.cpp
bash-4.2$ ./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
*** Error in `./a.out': free(): invalid next size (fast): 0x0000000001d3ee80 ***
...
Aborted (core dumped)
valgrind reports an invalid write in the sse_determinant() function.
bash-4.2$ valgrind --leak-check=yes ./a.out
==4429== Memcheck, a memory error detector
==4429== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4429== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4429== Command: ./a.out
==4429==
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
==4429== Invalid write of size 8
==4429== at 0x4024BB: sse_determinant() (in /nfs/ins/proj/icl/lrb/users/agrische/tests/379563/a.out)
It is related to the call to the _mm_store intrinsic at the end of the function.
If I expand the memory size of OutputBuffer, the error goes away.
bash-4.2$ !ca
cat Determinant4x4Matrices_.cpp | grep Out | grep mal
OutputBuffer = (float*) _mm_malloc(iBufferSizeBytes+48, AlignmentBytes); //should be 32bytes alignment for AVX
bash-4.2$ g++ -mavx Determinant4x4Matrices_.cpp
bash-4.2$ ./a.out
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
bash-4.2$ valgrind --leak-check=yes ./a.out
==4469== Memcheck, a memory error detector
==4469== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4469== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4469== Command: ./a.out
==4469==
Welcome to Determinat4x4Matrices
256-bit results matched for evaluation of a determinant
128-bit results matched for evaluation of a determinant
==4469==
==4469== HEAP SUMMARY:
==4469== in use at exit: 72,704 bytes in 1 blocks
==4469== total heap usage: 44 allocs, 43 frees, 74,128 bytes allocated
==4469==
==4469== LEAK SUMMARY:
==4469== definitely lost: 0 bytes in 0 blocks
==4469== indirectly lost: 0 bytes in 0 blocks
==4469== possibly lost: 0 bytes in 0 blocks
==4469== still reachable: 72,704 bytes in 1 blocks
==4469== suppressed: 0 bytes in 0 blocks
==4469== Reachable blocks (those to which a pointer was found) are not shown.
==4469== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==4469==
==4469== For counts of detected and suppressed errors, rerun with: -v
==4469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
bash-4.2$
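
To illustrate the kind of overrun described above, here is a minimal sketch. It is not taken from Determinant4x4Matrices.cpp; the helper names (PaddedBytes, OverrunBytes, StoreLane0) are hypothetical, and it assumes the results are single floats written with 128-bit stores. A full-width _mm_store_ps always writes 16 bytes, so a store that starts near the end of a buffer sized for exactly N floats can run past the allocation. Two possible ways to avoid that are padding the allocation to a whole number of vectors, or writing only the lane that holds the result.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed for `count` floats, rounded up to a
// multiple of the vector width (16 for SSE, 32 for AVX).
static std::size_t PaddedBytes(std::size_t count, std::size_t vecBytes) {
    assert(vecBytes != 0);
    const std::size_t exact = count * sizeof(float);
    return (exact + vecBytes - 1) / vecBytes * vecBytes;
}

// Hypothetical helper: how many bytes a 16-byte store starting at float
// index `index` would write past a buffer holding exactly `count` floats.
static std::size_t OverrunBytes(std::size_t count, std::size_t index) {
    const std::size_t storeEnd  = (index + 4) * sizeof(float);
    const std::size_t bufferEnd = count * sizeof(float);
    return storeEnd > bufferEnd ? storeEnd - bufferEnd : 0;
}

// Hypothetical helper: write only lane 0 of `v` to out[i]. This is a
// 4-byte store, so it never touches out[i + 1] and beyond.
static void StoreLane0(float* out, std::size_t i, __m128 v) {
    _mm_store_ss(out + i, v);
}
```

For example, with 5 results, a full-width store at index 4 would overrun an exactly sized buffer by OverrunBytes(5, 4) bytes, whereas a buffer allocated with _mm_malloc(PaddedBytes(5, 16), 32) has room for it. Whether padding or a narrower store is the better fix for the sample depends on how sse_determinant() lays out its results, which is not shown in this thread.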
