Hello,
I use icc (ICC) version 16.0.2 (20160204). I found a bug in the way its MPX transformation pass creates bounds for SSE-heavy (and heavily optimized) code. My computer has an Intel Skylake CPU.
Here is a minimal test case that reproduces the problem (adapted from the Vips program, where the bug was triggered):
#define SCALE (1<<6)

float ar[SCALE + 1][SCALE + 1][4];

void __attribute__ ((noinline)) foo()
{
    int x, y;

    for( x = 0; x < SCALE + 1; x++ )
        for( y = 0; y < SCALE + 1; y++ ) {
            double X, Y, Xd, Yd;
            double c1, c2, c3, c4;

            X = (double) x / SCALE;
            Y = (double) y / SCALE;
            Xd = 1.0 - X;
            Yd = 1.0 - Y;

            c1 = Xd * Yd;
            c2 = X * Yd;
            c3 = Xd * Y;
            c4 = X * Y;

            ar[x][y][0] = c1;
            ar[x][y][1] = c2;
            ar[x][y][2] = c3;
            ar[x][y][3] = c4;
        }
}

int main()
{
    foo();
    return ar[0][0][0];
}
The code raises a #BR (bound range exceeded) exception when built with -O2 and -no-check-pointers-narrowing (only with exactly this combination on my computer):
>>> icc -O2 -ggdb -check-pointers-mpx=rw -no-check-pointers-narrowing -lmpx -lmpxwrappers vipstest.c
>>> ./a.out
Saw a #BR! status 1 at 0x400c26
Saw a #BR! status 1 at 0x400c2e
...

# now with O1: works correctly
>>> icc -O1 -ggdb -check-pointers-mpx=rw -no-check-pointers-narrowing -lmpx -lmpxwrappers vipstest.c
>>> ./a.out
[ no output ]

# now without no-check-pointers-narrowing
>>> icc -O2 -ggdb -check-pointers-mpx=rw -lmpx -lmpxwrappers vipstest.c
>>> ./a.out
[ no output ]
The offending asm snippet looks like this:
bndmk 0x13(%rdx),%bnd1        # INCORRECT BOUND: TRIGGERS BR
bndmk 0x1080f(%rdx),%bnd0     # CORRECT BUT UNUSED BOUND
...
bndcl 0x603904(%rdi),%bnd1
bndcl 0x603908(%rdi),%bnd1
bndcl 0x60390c(%rdi),%bnd1
bndcu 0x603917(%rdi),%bnd1    # TRIGGERS BR
bndcu 0x60391b(%rdi),%bnd1
bndcu 0x60391f(%rdi),%bnd1
...
Note that when compiled with -O1 (or without -no-check-pointers-narrowing), the asm uses the correct BND0 register for these checks. Clearly, some autovectorization (SSE) optimization pass clashes with the MPX instrumentation.
Hi, Dmitri
Have you tried your test with Intel Compiler 17.0? I cannot reproduce the error with 17.0 on Windows.
I need to find a Linux Skylake system to verify your issue. I will let you know when I have an update on this.
Thanks.
I did not try ICC 17.0. From my side, I will try to update to ICC 17.0 and report the results on my Linux Skylake machine.
UPDATE: I installed ICC 17.0, and the bug disappeared. Great! (I guess this was expected since ICC 17.0 has better autovectorization support.) So I guess this bug report can be closed.
Hi, Dmitrii
I'm glad to hear this. That's great. Thank you for letting me know this.
I'm closing this as the issue is fixed in 17.0.
Thanks.
I believe I see the manifestations of this bug in my other programs now (even after updating to ICC 17).
I see these bugs in SPEC 2006: vips, h264ref, and milc. I will try to come up with another test case that can be reproduced in ICC 17. In the meantime, have you tried running MPX instrumentation on SPEC 2006 under Ubuntu 16.04 + Intel Skylake?
Hi, Dmitri
Thank you for the update.
Have you created a new test case that reproduces the problem with 17.0? We are interested in reproducing the issue and submitting it for resolution.
Thanks.