Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

SSE

send4chandu
Beginner
Hi ,
I am trying to build C++ code in the VC++ 6.0 environment (with the Processor Pack installed). My code contains a few assembly-language instructions that use SSE. The code is as follows.
#include
#define ARRSIZE 400
void main()
{
__declspec(align(16)) float a[ARRSIZE], b[ARRSIZE], c[ARRSIZE];

_asm
{
push esi;
push edi;
mov edi, a;
mov esi, b;
mov edx, c;
mov ecx, 100;
loop:
movaps xmm0, [edi];
movups xmm1, [esi];
mulps xmm0, xmm1;
movups [edx], xmm0;
add edi, 16;
add esi, 16;
add edx, 16;
dec ecx;
jnz loop;
pop edi;
pop esi;
}
}
When I build the above code, I get the following three errors:
--------------------Configuration: mul - Win32 Debug--------------------
Compiling...
mul.cpp
c:\program files\microsoft visual studio\vc98\include\mmintrin.h(29) : error C2146: syntax error : missing ';' before identifier '_m_from_int'
c:\program files\microsoft visual studio\vc98\include\mmintrin.h(29) : error C2501: '__m64' : missing storage-class or type specifiers
c:\program files\microsoft visual studio\vc98\include\mmintrin.h(29) : fatal error C1004: unexpected end of file found
Error executing cl.exe.
mul.obj - 3 error(s), 0 warning(s)
-------------------------------------------------------------------------------------------
Can anyone help me solve this problem?
-Thanks in advance.
-Ravi Akkenapally
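For comparison, here is a minimal sketch of the same element-wise multiply written with SSE intrinsics instead of inline assembly. It assumes a compiler whose <xmmintrin.h> compiles cleanly, so it may hit the same header errors in the setup described above. The function name mul_ps_arrays and its parameters are hypothetical and not taken from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps

// Hypothetical helper (not from the post): c[i] = a[i] * b[i] for n floats.
// n must be a multiple of 4 because each step handles one 4-float XMM register.
void mul_ps_arrays(const float* a, const float* b, float* c, std::size_t n)
{
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);           // unaligned load, like movups
        __m128 vb = _mm_loadu_ps(b + i);           // unaligned load, like movups
        _mm_storeu_ps(c + i, _mm_mul_ps(va, vb));  // mulps, then an unaligned store
    }
}
```

With ARRSIZE 400 the loop body runs 100 times, which matches the count loaded into ecx in the assembly version.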
2 Replies
Vladimir_Dudnik
Employee
Ravi,
Could you try this with the Intel compiler?
Regards,
Vladimir
ids-removed222
Beginner
I changed movaps to movups and it passes,
but a new problem emerges at the end:
#define ARRSIZE 4
void main1()
{
//__declspec(align(16)) float a[ARRSIZE], b[ARRSIZE], c[ARRSIZE];
float * a = new float[ARRSIZE];
float * b = new float[ARRSIZE];
float * c = new float[ARRSIZE];

_asm
{
push esi;
push edi;
push edx
mov edi, a;
mov esi, b;
mov edx, c;
mov ecx, ARRSIZE;
L1:
movups xmm0, [edi];
movups xmm1, [esi];
mulps xmm0, xmm1;
movups [edx], xmm0;
add edi, 16;
add esi, 16;
add edx, 16;
;dec ecx;
;jnz loop1;
loop L1
pop edx;
pop edi;
pop esi;

}
delete [] a;
delete [] b;
delete [] c; // the error occurs only here
}
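A likely cause of the failure at delete [] c, offered as a hedged reading rather than a confirmed diagnosis: ecx is loaded with ARRSIZE (4), so the loop runs four times, and each pass moves 16 bytes, that is, 4 floats. The loop therefore touches 16 floats per array, while each buffer holds only 4. The reads past a and b go unnoticed, but the writes past c would overwrite heap memory, which the runtime may only detect when c is freed. The arithmetic can be sketched as follows; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t FLOATS_PER_XMM = 4;  // one 128-bit XMM register holds 4 floats

// Hypothetical helper: number of 16-byte loop iterations that cover n floats exactly.
std::size_t xmm_iterations(std::size_t n)
{
    assert(n % FLOATS_PER_XMM == 0);
    return n / FLOATS_PER_XMM;
}

// Floats touched by a loop that runs `iterations` times with a 16-byte stride.
std::size_t floats_touched(std::size_t iterations)
{
    return iterations * FLOATS_PER_XMM;
}
```

Under this reading, loading ecx with ARRSIZE / 4 instead of ARRSIZE would keep the loop inside the three buffers. The first post does the equivalent by using a count of 100 for ARRSIZE 400.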