Intel® ISA Extensions

PUSH and POP of XMM/YMM registers

srinivasu
Beginner

Hi,

I have written a function that uses XMM/YMM registers with AVX2 instructions. Because the function modifies some of these registers, another part of the application crashes. Strangely, if I save and restore these registers, the way non-volatile general-purpose registers are pushed and popped, the application no longer crashes and works fine.

Do the SIMD registers also need to be saved and restored? If so, do all XMM/YMM registers need to be saved, and how?

Until now I had not read anything about saving the XMM/YMM registers, but my application works only after these changes.

 

 

Bernard
Valued Contributor I

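Do you mean push xmm0? There is no such instruction, but you can emulate it with these two:

    sub rsp, 16                        // reserve 16 bytes on the stack (esp in 32-bit code)
    movdqu xmmword ptr [rsp], xmm0     // store xmm0 in the reserved slot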
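
For completeness, the matching "pop" can be emulated the same way in reverse. This is only a sketch and assumes the value was stored at the top of the stack in 64-bit code (rsp); in 32-bit code the stack pointer is esp.

    movdqu xmm0, xmmword ptr [rsp]     // reload the saved 16 bytes into xmm0
    add rsp, 16                        // release the 16-byte stack slot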

srinivasu
Beginner

My main concern is whether it is really required to save the XMM registers to the stack (or to another register) at the start of a function and restore them at the exit point. I am not asking whether this can be done with PUSH/POP or with equivalent emulated instructions.

Is it mandatory to save and restore the SIMD registers whenever we use them? If so, performance degrades considerably.

My feeling is that, unlike for general-purpose registers, this is not required for SIMD registers. Please confirm. I am testing this with the SDE emulator; could this be an emulator issue?

Part of the code that does not work:

    push rbp
    mov  rbp, rsp
    push rsi

    // ymm8 holds a live value on entry

    mov esi, 2
    movd xmm8, esi // overwrite xmm8

    vpbroadcastb ymm8, xmm8 // broadcast the low byte to all 32 bytes of ymm8

    pop rsi
    mov  rsp, rbp
    pop rbp
    
    ret

Part of the code that works:

    push rbp
    mov  rbp, rsp
    push rsi

    // ymm8 holds a live value on entry

    vmovdqu ymm9, ymm8 // save the initial value of ymm8 in ymm9

    mov esi, 2
    movd xmm8, esi // overwrite xmm8

    vpbroadcastb ymm8, xmm8 // broadcast the low byte to all 32 bytes of ymm8

    vmovdqu ymm8, ymm9 // restore the initial value of ymm8 from ymm9

    pop rsi
    mov  rsp, rbp
    pop rbp
    
    ret

MarkC_Intel
Moderator

I like Agner Fog's document describing the calling/linkage conventions. See section 6 on register usage.

Some registers are "scratch" (volatile) and are not expected to survive across calls. The others must be saved and restored by the callee (not the caller) if the callee modifies them.
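
As a rough illustration of the callee-save rule, here is a minimal sketch of how the sample function from this thread could preserve xmm8 itself. It assumes the Windows x64 convention (the thread mentions the VS debugger), where XMM6-XMM15 are callee-saved in their low 128 bits only and the upper halves of the YMM registers are scratch. The 24-byte stack slot and the use of vmovd in place of movd are choices made for this sketch, not something taken from the original code.

    push rbp
    mov  rbp, rsp
    push rsi
    sub  rsp, 24                        // 16 bytes for xmm8, plus 8 so rsp is 16-byte aligned

    vmovdqu xmmword ptr [rsp], xmm8     // save the callee-saved low 128 bits of ymm8

    mov  esi, 2
    vmovd xmm8, esi                     // VEX form of movd, avoids mixing legacy SSE with AVX
    vpbroadcastb ymm8, xmm8             // broadcast the low byte to all 32 bytes of ymm8

    // ... work with ymm8 here ...

    vmovdqu xmm8, xmmword ptr [rsp]     // restore xmm8 (this VEX load zeroes the upper half of ymm8)
    add  rsp, 24                        // release the save slot
    pop  rsi
    pop  rbp
    ret

Saving to the stack rather than to ymm9 matters because xmm9 is itself callee-saved under this convention, so copying into it only moves the problem. The upper 128 bits of ymm8 are not preserved by this sketch; a caller that needs the full 256-bit value across the call has to save it itself. Under the System V convention (Linux, macOS) all XMM/YMM registers are scratch, so there the save would belong in the caller instead. A production routine on Windows would also describe the save in its unwind data, which is omitted here.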

Bernard
Valued Contributor I

Now I understand your question.

Btw, can you post the output from VS debugger where the application crashes?

 

 

Vladimir_Sedach
New Contributor I

In other words, you're trying to compete with a modern C/C++ compiler?
A very bad idea. You'll lose in most cases.
The GNU C compiler is much smarter than most of us!
Besides, C code is clearer and more portable.

srinivasu
Beginner

Hi Mark,

Thank you very much for providing this very useful document. Now I understand what the problem was.

>>Btw, can you post the output from VS debugger where the application crashes?

iliayapolak, the actual function is part of a multimedia application. I gave the example code just to illustrate the issue.

Bernard
Valued Contributor I

>>>iliayapolak, the actual function is part of a multimedia application. I gave the example code just to illustrate the issue.>>>

OK, I thought you might be interested in further troubleshooting.
