Intel® ISA Extensions

Something is wrong when calling SVML from inline ASM

zhang_y_1
Beginner
449 Views

     I am trying to call __svml_sin4 from inline ASM the same way the compiler does. A code snippet follows:

     "vmovupd (%1), %%ymm0\n\t"
     "call __svml_sin4\n\t"
     "vmovupd %%ymm0, (%0)\n\t"
     "sub $1, %%rax\n\t"
     "jnz 3b\n\t"

    The program builds, but the output values are wrong.

    I then used GDB to locate the problem. It seems that the SVML function __svml_sin4 uses general-purpose registers such as rax, rbx, rcx and rdx without saving and restoring them. So I want to save the registers modified by SVML myself. The problem is that I do not know exactly which registers are modified; different SVML functions may use different registers.

    So, does anybody know how to call SVML from inline assembly correctly?

    Thanks in advance for any answer.

0 Kudos
4 Replies
andysem
New Contributor III
449 Views

According to the x86-64 ABI (http://www.x86-64.org/documentation/abi.pdf, section 3.2.1), only the rbp, rbx and r12-r15 general-purpose registers need to be preserved by the called function. All other general-purpose registers can be clobbered. I believe this applies to all UNIX-like systems.

The convention on Windows is summarized here: http://msdn.microsoft.com/en-us/library/ms235286.aspx

If you need to preserve the values of clobbered registers, you should save and restore them around the function call.

 

0 Kudos
zhang_y_1
Beginner
449 Views

Hi, andysem!   

Thank you for your answer. It is very helpful.

Now the problem is that I do not know which registers are clobbered. So, if I need to preserve the state, I must save all the registers except rbp, rbx and r12-r15. That seems too expensive! Do you have any ideas about that?

Thanks again!

0 Kudos
andysem
New Contributor III
449 Views

You have to assume that any register that is not required to be preserved can be clobbered. You don't have to save all registers, only those holding data that your program (i.e. the calling function) still needs after the call. Compilers usually keep a shadow copy of variables on the stack so that the values can be saved and restored when needed. Minimizing and scheduling these moves is one of the optimizations compilers perform, and you will have to do it manually in assembler code.
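One way to avoid saving registers by hand is to list every caller-saved register in the clobber list of the extended asm statement, so that the compiler saves only what is actually live. The following is a minimal sketch, not a definitive implementation. It assumes GCC or Clang targeting the System V x86-64 ABI (Linux and similar) without AVX-512, and it falls back to an ordinary call elsewhere. `demo_callee` and `call_with_clobbers` are hypothetical names, and `demo_callee` is only a stand-in for an external routine such as __svml_sin4.

```cpp
// Hypothetical stand-in for an external routine such as __svml_sin4.
extern "C" long demo_callee(long x) { return 2 * x + 1; }

// Calls demo_callee from inline asm and tells the compiler which registers
// the call may destroy, so it can spill only the values that are live.
long call_with_clobbers(long x) {
#if defined(__GNUC__) && defined(__x86_64__) && !defined(_WIN32) && !defined(__CYGWIN__)
    long result;
    long arg = x;  // rdi carries the argument and may be overwritten by the callee
    __asm__ volatile(
        "mov  %%rsp, %%rbx\n\t"  // keep the original stack pointer in callee-saved rbx
        "sub  $128, %%rsp\n\t"   // step over the System V red zone
        "and  $-16, %%rsp\n\t"   // the ABI requires a 16-byte aligned stack at a call
        "call *%[fn]\n\t"        // first argument in rdi, result returned in rax
        "mov  %%rbx, %%rsp\n\t"  // restore the stack pointer
        : "=&a"(result), "+D"(arg)
        : [fn] "r"(&demo_callee)
        // Assumption: the callee follows the standard ABI, so every caller-saved
        // register is listed (rax and rdi are already covered as operands).
        : "rbx", "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory", "cc");
    return result;
#else
    return demo_callee(x);  // portable fallback: let the compiler emit the call
#endif
}
```

Calling `call_with_clobbers(20)` returns the same value as `demo_callee(20)`. The clobber list costs nothing for registers that hold no live values, which is the same trade-off the compiler makes for an ordinary call.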

0 Kudos
Vladimir_Sedach
New Contributor I
449 Views

zhang y.,

Try this. It works for me with MinGW64 on Windows.

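#include <immintrin.h>    // __m256d

// The call itself is issued from the inline asm below.
extern "C" __m256d __svml_sin4(const __m256d &a);

__inline __m256d sin(const __m256d &a)
{
    __m256d    ret;

    __asm volatile
    (
        "vmovaps    %1, %%ymm0\n"       // load the argument into ymm0
//        "push        %%rax\n"
//        "push        %%rax\n"
        "call        __svml_sin4\n"     // result comes back in ymm0
//        "pop        %%rax\n"
//        "pop        %%rax\n"
        "vmovaps    %%ymm0, %0\n"       // store the result
        // Only xmm0 is declared as clobbered here; the callee may also
        // modify other registers that the ABI does not require it to preserve.
        : "=m"(ret) : "m"(a) : "%xmm0"
    );
    return ret;
}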

    __m256d    src, ret;
    ret = sin(src);

If something goes wrong, try uncommenting the push/pop pairs.

 

 

0 Kudos