Hi,
I'm trying to port our large application to run natively on the Xeon Phi card.
I've compiled it, but there are some problems, and I suspect they are caused by the compiler generating bad code.
The compiler I'm using is:
$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.1.1.163 Build 20130313
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
And the build options are something like:
Compiler: icpc
Optimizations: -gxx-name=g++412 -gcc-name=gcc412 -O0 -g -fPIC -mmic -D_REENTRANT -static-intel -fvisibility=hidden
Link flags: -static-intel -lpthread -lrt -lc -Wl,--gc-sections -fPIC -mmic
The compiler has produced this code, and the crash happens on the last instruction (rax=0x7ff15418330e and k1=1):
0x00007ff176d056f8 <init>: pushq %rbp
0x00007ff176d056f9 <init+1>: mov %rsp, %rbp
0x00007ff176d056fc <init+4>: sub $0x20, %rsp
0x00007ff176d05700 <init+8>: movq %rdi, -0x20(%rbp)
0x00007ff176d05704 <init+12>: movq -0x20(%rbp), %rax
0x00007ff176d05708 <init+16>: movq %rax, -0x18(%rbp)
0x00007ff176d0570c <init+20>: vbroadcastssl 0xb3651e(%rip), %k0, %zmm0
0x00007ff176d05716 <init+30>: movq -0x18(%rbp), %rax
0x00007ff176d0571a <init+34>: mov $0x1, %edx
0x00007ff176d0571f <init+39>: kmov %rdx, %k1
0x00007ff176d05723 <init+43>: vpackstorelps %zmm0, %k1, (%rax)
As far as I can tell from the instruction set reference, vpackstorelps requires the address to be 64-byte aligned, which is not the case here. Is my understanding correct? Is this a bug in the compiler, or is something wrong in my code?
The function looks like this:
void init() {
    pmin.set(some_const, some_const, some_const);
    pmax.set(some_const, some_const, some_const);
}
pmin and pmax are Vectors and the set function looks like this:
__forceinline void set(float ix, float iy, float iz) {
    x = ix;
    y = iy;
    z = iz;
}
Could this __forceinline call confuse the compiler?
Also, the memory for the object holding pmin and pmax is freshly allocated by new.
Any help with this issue will be highly appreciated.
Best regards,
Teodor Petrov
Any ideas about this issue?
Teodor,
vpackstorelps requires that the memory address be aligned to at least 4 bytes. Using __forceinline is OK in this case.
The assembly that you posted implies that the root cause is a pointer aligned to only 2 bytes being passed to this function.
Thanks,
Evgueni.
Evgueni,
Is this a compiler bug or a bug in my program?
/Teodor
Teodor P. wrote:
Is this a compiler bug or a bug in my program?
Without seeing your source it's impossible to tell. The completely abstract version you posted doesn't have enough detail even to guess at what is going on.