Hi,
I am developing a dynamic library, compiled with icc V17, that will eventually need to run on either KNL or other Intel architectures.
I am building the dynamic library with the -axMIC-AVX512 flag. The library I am building (see source code) contains a single function - "test_method". After it is built the symbol table shows me that internally there are a number of variants, which I assume to be used depending on whether or not it is running on AVX512 compatible hardware:
16: 0000000000000810 64 FUNC GLOBAL DEFAULT 12 test_method
47: 0000000000000850 304 FUNC LOCAL DEFAULT 12 test_method.Z
48: 0000000000000980 192 FUNC LOCAL DEFAULT 12 test_method.A
62: 0000000000000810 64 FUNC GLOBAL DEFAULT 12 test_method
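To illustrate what the split in the symbol table suggests, here is a minimal sketch of the general dispatch pattern: one exported entry point that picks an implementation based on a CPU feature check. This is not the code icc generates, which is not shown in this thread and likely differs. The names test_method_generic, test_method_avx512 and cpu_has_avx512f are hypothetical, and the feature check assumes the GCC/Clang builtin __builtin_cpu_supports on x86.

```c
#include <stddef.h>

/* Hypothetical stand-in for the baseline variant of the function. */
static int test_method_generic(int c)
{
    int rc = 0;
    for (int i = 0; i < c; i++)
        rc += c + (int)sizeof(int);
    return rc;
}

/* Hypothetical stand-in for the AVX-512 variant. A real build would
   contain vector instructions here; this body is plain C (a closed form
   of the same sum) so that the sketch runs on any machine. */
static int test_method_avx512(int c)
{
    return c > 0 ? c * (c + (int)sizeof(int)) : 0;
}

/* Assumption: GCC/Clang builtins on x86. On other compilers or
   architectures this reports "not supported". */
static int cpu_has_avx512f(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f");
#else
    return 0;
#endif
}

/* Single exported entry point: selects a variant on the first call and
   forwards the argument unchanged on every call. */
int test_method(int c)
{
    static int (*impl)(int) = NULL;
    if (impl == NULL)
        impl = cpu_has_avx512f() ? test_method_avx512 : test_method_generic;
    return impl(c);
}
```

Whichever variant is chosen, test_method(9) should return the same value. The point of the sketch is that the entry point depends on a CPU check that happens at run time, before the real function body executes.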
The code that calls test_method runs fine on KNL hardware, but on some non-KNL machines it segfaults. A debugger shows that a garbage function parameter is in use as soon as the function is entered, which is worrying. As the driver code shows, the function is called with c=9, yet the debugger reports c=404743 right before the failure:
#0 0x00007f1e37d90792 in test_method (c=404743) at avxdynamiclib.c:4
If I modify the function so that the compiler no longer optimises it into separate variants (as confirmed by the symbol table), everything works fine.
Any wisdom would be much appreciated.
The library code, along with the compilation command is:
// File: avxdynamiclib.c
// icc -axMIC-AVX512 -g -fPIC -O2 -o avxdynamiclib.o -c avxdynamiclib.c
// readelf -a avxdynamiclib.o | grep 'test_method'
// icc -m64 -shared -fPIC -static-libgcc avxdynamiclib.o -o libavxtest.so
int test_method(int c)
{
    int i = 0, rc = 0;
    for (i = 0; i < c; i++) {
        rc += c + sizeof(int);
    }
    return rc;
}
The driver code, along with the compilation and execution command is:
// File: libtest.c
// > icc -axMIC-AVX512 -g -fPIC -O2 -L./ -lavxtest libtest.c -o libtest
// > LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH ./libtest
int test_method(int c); /* defined in libavxtest.so */

int main(int argc, char *argv[])
{
    test_method(9);
    return 0;
}
Sergey,
b is referenced in the fprintf statement (pointer value is printed).
Corey,
In optimized code, the first few arguments (~4) are generally passed in registers. The caller reserves stack space for them; that space is filled in for a Debug build but not in Release mode. Symbolic debugging of register-held arguments can be erroneous when you do not step into the function via the debugger.
I seem to recall the IPO had some issues with arguments such as char, short, etc... Maybe this is one of those cases. The temporary solution was to exclude those files from IPO.
Jim Dempsey
> It is Not used in processing and I suspect it should be.
Sergey, the function shown is not used in practice; it is just a very cut-down version of an actual function, meant to illustrate the error.
> I seem to recall the IPO had some issues with arguments such as char, short, etc... Maybe this is one of those cases. The temporary solution was to exclude those files from IPO.
Hi Jim. Thanks for the response. I did try switching all function arguments to ints, and the segfault still occurred. To remove all ambiguity, I modified the source in the original post to contain an even more cut-down version.
The version demonstrated the issue perfectly well; it leads to a reproducible segfault with minimal code.
I finally tracked down the issue. It turns out there was a version issue with the runtime libraries that my library was linking against at run time. With that fixed, the code runs without a segfault.
Thanks for your help folks!
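For anyone chasing a similar runtime library version issue, one way to see which shared objects a process actually loaded is to read /proc/self/maps from inside the program. The thread does not say which library was at fault or how it was found, so this is only a sketch under assumptions: it is Linux-specific, and print_loaded_libraries is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the path of each shared object mapped into
   the current process, skipping consecutive repeats of the same path.
   Returns the number of paths printed, or -1 if /proc/self/maps cannot
   be opened (for example on a non-Linux system). */
int print_loaded_libraries(FILE *out)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[4096];
    char last[4096] = "";
    int count = 0;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* The pathname, when present, is the last field and starts with '/'. */
        char *path = strchr(line, '/');
        if (path == NULL)
            continue;
        path[strcspn(path, "\n")] = '\0';

        /* Keep only entries that look like shared objects. */
        if (strstr(path, ".so") == NULL)
            continue;

        /* A library usually appears once per mapped segment; print it once. */
        if (strcmp(path, last) == 0)
            continue;

        fprintf(out, "%s\n", path);
        strcpy(last, path);
        count++;
    }

    fclose(maps);
    return count;
}
```

Calling print_loaded_libraries(stderr) at the start of the driver's main, on both a working and a failing machine, would show whether the two resolve the compiler runtime libraries from different locations.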