Hi,
we use icc 12.1 on x86-32 Linux. When compiling our code with -O2, many instructions like these are generated:
[bash]
movdqa XMMWORD PTR [esp+0xe0],xmm0
movaps XMMWORD PTR [esp+48],xmm7
[/bash]
Since the memory operands of movdqa and movaps must be aligned to 16 bytes, this obviously requires that the stack is aligned to 16 bytes.
When compiling our code into a binary everything works fine. It seems like the compiler takes care of proper alignment of the stack?
However, we also compile our code into a shared object which is loaded into a Java virtual machine. Our code is then called through JNI and frequently crashes because the stack is not aligned to 16 bytes when the instruction is executed. The misaligned access results in a SIGSEGV.
The problem seems to go away when using -O1 instead of -O2, and in fact, the crashing function no longer contains movdqa/movaps in that case.
We also link object files generated with 'icc -O2' into code that is compiled with g++ and called from g++-compiled code (an old version of g++ that does not have -mstackrealign). There the same problem could potentially arise.
Is there any way to compile with -O2 but force icc not to assume that the stack is aligned, so that we don't get the instructions that require stack alignment but can still benefit from -O2?
If not, is there a way to force the compiler to generate a sort of prologue for functions that would ensure proper stack alignment?
Or is there another way to make sure the stack is aligned properly when the offending instructions are executed?
Thanks a lot
Daniel
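To make the alignment requirement in the question concrete, here is a minimal C sketch of the arithmetic involved. The helper names `is_aligned` and `align_down` are illustrative only and do not come from icc or any library. The comparison with an `and esp, -16` style prologue is general x86 background and an assumption here, not something confirmed for icc 12.1 in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of alignment, 0 otherwise.
 * alignment must be a power of two (16 for movdqa/movaps operands). */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Rounds addr down to the previous multiple of alignment.
 * With alignment == 16 this is the same computation as masking a
 * stack pointer with -16, which only ever moves it to a lower address. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

For example, a slot at `esp+0xe0` is only 16-byte aligned if `esp` itself is, because 0xe0 is a multiple of 16. If the caller leaves `esp` at an address ending in 0x4, `is_aligned` reports 0 for that slot and an aligned store to it faults.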
It seems that adding '-falign-stack=assume-4-byte' to the compiler options cures the problem.
Is this expected? Is this the right thing to do?
Thanks,
Daniel
Hello,
yes, I think that's the way to go here. I'd propose testing "-falign-stack=maintain-16-byte", though. The reason is that if the stack is, for whatever reason, already 16-byte aligned, the compiler can take advantage of that without falling back to enforced unaligned access in that case.
I don't have a JNI example at hand right now but I'd like to hear whether "-falign-stack=maintain-16-byte" works for you as well.
Best regards,
Georg Zitzlsberger
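As a rough illustration of how the suggested option might be applied, the sketch below shows a small C function of the kind that could end up in such a shared object, with a possible build line in the header comment. Only `-O2` and `-falign-stack=maintain-16-byte` come from this thread. The file name `native.c`, the library name `libnative.so`, the function `sum_scaled`, and the `-fPIC -shared` flags are assumptions for the example. Whether the compiler actually emits movaps/movdqa for the local buffer depends on the compiler and its options.

```c
/*
 * Hypothetical build line (names are placeholders):
 *
 *   icc -O2 -falign-stack=maintain-16-byte -fPIC -shared \
 *       -o libnative.so native.c
 */
#include <assert.h>
#include <stddef.h>

/* Stand-in for a native entry point called from foreign code (for
 * example through JNI). It accumulates into a small local buffer,
 * the sort of stack data an optimizer may keep in XMM registers
 * and spill with aligned stores. */
float sum_scaled(const float *in, size_t n, float scale)
{
    float scratch[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i;

    assert(in != NULL || n == 0);

    /* Four interleaved partial sums. */
    for (i = 0; i < n; ++i)
        scratch[i % 4] += in[i] * scale;

    return scratch[0] + scratch[1] + scratch[2] + scratch[3];
}
```

Disassembling the resulting object (for instance with `objdump -d`) and looking at the function prologue is one way to see whether the stack pointer is being adjusted before any aligned stores.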
Thanks a lot for the tip!
-falign-stack=maintain-16-byte seems to work as well. All the test cases that crashed before now pass (just as they did with assume-4-byte). We will go with maintain-16-byte then.
Thank you again,
Daniel