Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

inline asm call

vpanait
Beginner
389 Views
Hi,

I am forced to use VS2005/64b by the SOW, and I need to use some SSSE3 and SSE4 intrinsics.
After some searching, I found that these intrinsics are not supported in VS2005, so I tried to mimic them with a MASM procedure, for example:
_ct_shuffle_epi8 PROC
movdqa xmm0, [RCX] ; load param 1
pshufb xmm0, XMMWORD PTR [RDX] ; shuf with param 2
ret
_ct_shuffle_epi8 ENDP
for the pshufb instruction from SSSE3.
In the C code I refer to it with:
extern "C"
{
inline __m128i _ct_shuffle_epi8(__m128i a, __m128i b);
}
This works, but it is far from optimal: the following asm code is generated when calling _ct_shuffle_epi8():
000c6 66 0f 7f 44 24 20 movdqa XMMWORD PTR $T6999[rsp], xmm0
000cc e8 00 00 00 00 call _ct_shuffle_epi8
000d1 66 0f 73 df 08 psrldq xmm7, 8
000d6 48 8d 54 24 30 lea rdx, QWORD PTR $T7002[rsp]
000db 48 8d 4c 24 20 lea rcx, QWORD PTR $T7001[rsp]
As one can see, it generates a call to the function instead of inlining it.
Given that VS2005/64b does not support inline _asm, is there a way to inline the whole asm procedure (as in the example above)?
Or any other decent way to include some of the SSSE3 intrinsics in the VS2005 code?
Thank you.
2 Replies
vpanait
Beginner
Found the answer; maybe it will help somebody.
The Microsoft compiler in VS2005 does not support SSSE3+ intrinsics.
The Intel compiler does support them, so just install the Intel compiler and you will have full support for them.
Good luck.
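For readers who want to see what the intrinsic route looks like, here is a minimal sketch, not the poster's actual code. It calls _mm_shuffle_epi8 (the SSSE3 intrinsic behind pshufb, declared in tmmintrin.h) when the compiler reports SSSE3 support, and otherwise falls back to a scalar loop with the same per-byte semantics. The Bytes16 type and the shuffle_bytes / shuffle_bytes_scalar names are hypothetical, and the __SSSE3__ check assumes a compiler that defines that macro when SSSE3 code generation is enabled (GCC and Clang do; check the documentation for other compilers).

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSSE3__)
#include <tmmintrin.h>  // SSSE3 intrinsics, including _mm_shuffle_epi8
#endif

// Hypothetical 16-byte container, used only for this sketch.
struct Bytes16 { std::uint8_t b[16]; };

// Scalar reference for pshufb: each output byte is zero if the high bit of
// the corresponding mask byte is set, otherwise it is the source byte
// selected by the low four bits of the mask byte.
inline Bytes16 shuffle_bytes_scalar(const Bytes16& src, const Bytes16& mask) {
    Bytes16 out;
    for (int i = 0; i < 16; ++i) {
        const std::uint8_t m = mask.b[i];
        out.b[i] = (m & 0x80) ? 0 : src.b[m & 0x0F];
    }
    return out;
}

// Uses the intrinsic when available, so the compiler can emit pshufb inline
// instead of calling an external procedure.
inline Bytes16 shuffle_bytes(const Bytes16& src, const Bytes16& mask) {
#if defined(__SSSE3__)
    const __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src.b));
    const __m128i m = _mm_loadu_si128(reinterpret_cast<const __m128i*>(mask.b));
    Bytes16 out;
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.b), _mm_shuffle_epi8(a, m));
    return out;
#else
    return shuffle_bytes_scalar(src, mask);  // portable fallback
#endif
}
```

The unaligned loads and stores are chosen here only so the sketch does not depend on the alignment of Bytes16; code that already holds __m128i values can pass them to _mm_shuffle_epi8 directly.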
jimdempseyatthecove
Honored Contributor III
Plan a) You could compile your C/C++ code to produce a .ASM source file, then insert (inline) the body of _ct_shuffle_epi8 into that code in place of each call. (This can be automated.)

Plan b) Write _ct_shuffle_epi8 such that it back-patches the call site with the appropriate instruction sequence. Note, you may need to have your C/C++ code use a macro and then make two calls to _ct_shuffle_epi8 in order to provide sufficient bytes in the code stream to insert your patch (plus a few NOPs).

(Won't work on systems that protect the code segment from patching.)
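The automation mentioned in Plan a) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name inline_shuffle_calls is hypothetical, the listing is assumed to spell the instruction as a lowercase "call" followed by the symbol name, and the replacement is simply the body of the procedure from the question without its ret, so it relies on the same register contract (pointers in rcx and rdx, result in xmm0).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: rewrites an assembly listing so that every
// "call _ct_shuffle_epi8" line is replaced by the procedure body
// (load from [rcx], pshufb with [rdx]); all other lines pass through.
inline std::string inline_shuffle_calls(const std::string& asm_text) {
    std::istringstream in(asm_text);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        // Locate the mnemonic: first non-blank token on the line.
        const std::size_t p = line.find_first_not_of(" \t");
        const bool is_target_call =
            p != std::string::npos &&
            line.compare(p, 4, "call") == 0 &&
            line.find("_ct_shuffle_epi8", p + 4) != std::string::npos;
        if (is_target_call) {
            out << "\tmovdqa\txmm0, XMMWORD PTR [rcx]\n"
                << "\tpshufb\txmm0, XMMWORD PTR [rdx]\n";
        } else {
            out << line << '\n';
        }
    }
    return out.str();
}
```

A build step would run something like this over the generated .ASM file before assembling it; the lea instructions that set up rcx and rdx, and the spill of the arguments to the stack, remain in place, so this removes only the call and return overhead.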

Jim Dempsey