Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

instruction order of EMMS

Susumu
Consider the simple code below:

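DWORD test_convW2B( LONGLONG v )
{
    __m64 mm0 = *(__m64 *)&v;         // reinterpret the 64-bit value as four 16-bit words
    mm0 = _m_packuswb( mm0, mm0 );    // pack words to bytes with unsigned saturation
    DWORD ret = _m_to_int( mm0 );     // take the low 32 bits (the four packed bytes)
    _m_empty();                       // EMMS: must come after the last MMX instruction
    return( ret );
}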
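
For reference, the value the function is expected to return can be sketched in scalar C, which may help when checking the output of a suspect build. This is only an illustration: the names saturate_u8 and convW2B_reference are hypothetical, and the stdint types stand in for DWORD and LONGLONG on the assumption that these are 32-bit unsigned and 64-bit signed.

```c
#include <stdint.h>

/* Saturate one signed 16-bit word to the unsigned 8-bit range [0, 255],
   which is what PACKUSWB does to each source word. */
static uint8_t saturate_u8(int16_t w)
{
    if (w < 0)   return 0;
    if (w > 255) return 255;
    return (uint8_t)w;
}

/* Hypothetical scalar reference for test_convW2B (not from the original post).
   Word i of v (bits 16*i .. 16*i+15) becomes byte i of the result. */
static uint32_t convW2B_reference(int64_t v)
{
    uint32_t ret = 0;
    for (int i = 0; i < 4; ++i) {
        /* Extract word i and reinterpret it as signed. */
        int16_t w = (int16_t)(uint16_t)((uint64_t)v >> (16 * i));
        ret |= (uint32_t)saturate_u8(w) << (8 * i);
    }
    return ret;
}
```

For example, the input 0x7FFF00FF8000007F holds the words 0x007F, 0x8000, 0x00FF and 0x7FFF (lowest first), which saturate to 0x7F, 0x00, 0xFF and 0xFF, so the reference returns 0xFFFF007F.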

Compiler: running on IA-32, Version 12.0.1.096 Build 20101116,
with options -O3 -Ob2 -Oi -Ot -Oy -Qip -FAs

It generates the asm code below:

movq mm0, QWORD PTR [4+esp]
emms
packuswb mm0, mm0
movd eax, mm0
ret

The position of the EMMS instruction is wrong: it is issued before the MMX instructions instead of after them.
The code should be:

movq mm0, QWORD PTR [4+esp]
packuswb mm0, mm0
movd eax, mm0
emms
ret

I could not find a way to avoid this problem.
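
One possible way to sidestep the issue, assuming the target CPU supports SSE2, is to do the same pack in an XMM register, so that no MMX state is touched and no EMMS is needed at all. The sketch below is untested against the compiler version above; convW2B_sse2 is a hypothetical name, and the stdint types again stand in for DWORD and LONGLONG.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical SSE2 variant of test_convW2B (an assumption, not from the
   original post). It uses XMM registers only, so _m_empty() is not required. */
static uint32_t convW2B_sse2(int64_t v)
{
    /* Load the 64-bit value into the low half of an XMM register;
       the upper half is zeroed. */
    __m128i x = _mm_loadl_epi64((const __m128i *)&v);

    /* Pack signed 16-bit words to bytes with unsigned saturation (PACKUSWB). */
    x = _mm_packus_epi16(x, x);

    /* The low 32 bits hold the four packed bytes. */
    return (uint32_t)_mm_cvtsi128_si32(x);
}
```

This changes the instruction set the function depends on, so it is a workaround rather than a fix for the EMMS placement itself.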
Thomas_W_Intel
This looks like a bug to me. Could you please report it in the Intel C++ Compiler forum?

Kind regards
Thomas