To compare the speed of fsincos against call __libm_sse2_sincos, here is my test code.
The code generated for the cos and sin block contains 3 * call __libm_sse2_sincos, which is why I used only 3 * fsincos.
I tried calling __libm_sse2_sincos through inline asm, but the compiler could not find the function and the build failed with an error :/
float rotation_object[4] = { 0 }; // 0: x , 1: y , 2: z

void put_pixel(float *coord, int color, float* a)
{
    int offset_pixel;
    int a_time_sincosps, b_time_sincosps;
    int a_time_fsincos, b_time_fsincos;
    unsigned int loop_test;
    int temp; // scratch memory operand for the loads and stores below

    // Test 1: compiler-generated path (3 * call __libm_sse2_sincos)
    __asm rdtsc
    __asm mov [a_time_sincosps], eax // keeps only the low 32 bits of the counter
    trigo.cos[_x] = cos(DEG2RAD(rotation_object[_x]));
    trigo.cos[_y] = cos(DEG2RAD(rotation_object[_y]));
    trigo.cos[_z] = cos(DEG2RAD(rotation_object[_z]));
    trigo.sin[_x] = sin(DEG2RAD(rotation_object[_x]));
    trigo.sin[_y] = sin(DEG2RAD(rotation_object[_y]));
    trigo.sin[_z] = sin(DEG2RAD(rotation_object[_z]));
    __asm rdtsc
    __asm mov [b_time_sincosps], eax
    printf("time_sincosps = %i\n", b_time_sincosps - a_time_sincosps);

    // Test 2: same converts and stores, with each call replaced by fsincos
    __asm rdtsc
    __asm mov [a_time_fsincos], eax
    __asm vxorpd xmm1, xmm1, xmm1
    __asm vcvtss2sd xmm1, xmm1, [temp]
    __asm vmovsd xmm15, [temp]
    __asm vmulsd xmm0, xmm15, xmm1
    __asm vzeroupper
    __asm fsincos // x87 instruction: operates on st(0), not on xmm0
    __asm vxorpd xmm2, xmm2, xmm2
    __asm vmovapd xmm14, xmm0
    __asm vcvtss2sd xmm2, xmm2, [temp]
    __asm vcvtsd2ss xmm1, xmm1, xmm1
    __asm vmulsd xmm0, xmm15, xmm2
    __asm vmovss [temp], xmm1
    __asm fsincos
    __asm vxorpd xmm2, xmm2, xmm2
    __asm vmovapd xmm13, xmm0
    __asm vcvtss2sd xmm2, xmm2, [temp]
    __asm vcvtsd2ss xmm1, xmm1, xmm1
    __asm vmulsd xmm0, xmm15, xmm2
    __asm vmovss [temp], xmm1
    __asm fsincos
    __asm vcvtsd2ss xmm1, xmm1, xmm1
    __asm vcvtsd2ss xmm14, xmm14, xmm14
    __asm vcvtsd2ss xmm13, xmm13, xmm13
    __asm vcvtsd2ss xmm0, xmm0, xmm0
    __asm vmovss [temp], xmm1
    __asm vmovss [temp], xmm14
    __asm vmovss [temp], xmm13
    __asm vmovss [temp], xmm0
    __asm rdtsc
    __asm mov [b_time_fsincos], eax
    printf("time_fsincos = %i", b_time_fsincos - a_time_fsincos);

    while (1); // keep the console open
}
I added the store and convert instructions because they are what the compiler generates for trigo.cos[_x] = ... and the other stores.
I can't write only cos(DEG2RAD(rotation_object[_x])); without storing the result, because the compiler then removes the call to __libm_sse2_sincos.
And here are the results:
time_sincosps = 19839
time_fsincos = 1981
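For anyone who wants to repeat the measurement: the numbers above come from a single pass and only the low 32 bits of the counter. A minimal sketch of a more forgiving harness is below, assuming the __rdtsc() intrinsic is available (x86intrin.h on GCC/Clang, intrin.h on MSVC-compatible compilers), with clock() as a fallback elsewhere. It keeps the full 64-bit counter, repeats the call and reports the smallest delta, and writes each result to a volatile variable so the call is not removed. The names read_ticks, min_ticks, sink and half are made up for this illustration.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>          /* GCC / Clang: __rdtsc() */
#define HAVE_TSC 1
#elif defined(_M_X64) || defined(_M_IX86)
#include <intrin.h>             /* MSVC-compatible compilers: __rdtsc() */
#define HAVE_TSC 1
#else
#include <time.h>               /* fallback: coarse CPU time */
#define HAVE_TSC 0
#endif

typedef float (*unary_fn)(float);

/* Results go through a volatile so the compiler must keep each call. */
static volatile float sink;

/* Full 64-bit tick count (not just the low half in eax). */
static uint64_t read_ticks(void)
{
#if HAVE_TSC
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Call fn(arg) `runs` times and return the smallest tick delta seen.
   The minimum filters out first-call costs and interrupts. */
static uint64_t min_ticks(unary_fn fn, float arg, int runs)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; ++i) {
        uint64_t start = read_ticks();
        sink = fn(arg);
        uint64_t stop = read_ticks();
        if (stop - start < best)
            best = stop - start;
    }
    return best;
}

/* Trivial stand-in workload so the sketch has no libm dependency. */
static float half(float x)
{
    return x * 0.5f;
}
```

Calling min_ticks(half, 3.0f, 1000) gives the cheapest of 1000 runs; passing sinf or cosf from math.h instead of half would time the library calls the same way. The delta still includes the cost of reading the counter itself.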
Also, is it possible to hook into the compilation process by rewriting the generated asm file? It would be wonderful if so: I could then correct the code or rewrite it the way I want.
Finally, is it possible to customize the asm output produced by icl? For example:
Keep numbers in the base they were written in, such as hexadecimal, and print them with a 0x prefix instead of an H suffix:
return 0xdeadbeef --> (actual) mov eax, -559038737 --> (wanted) mov eax, 0xdeadbeef ^^
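To make the example concrete: both spellings are the same 32-bit pattern, the listing just prints it as a signed decimal. Here is a small sketch of the conversion, which could be applied to the listing as a post-processing step. The helper names bit_pattern and format_imm32 are hypothetical and not part of any compiler tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit immediate as its unsigned bit pattern.
   Signed-to-unsigned conversion is well defined in C (modulo 2^32). */
static uint32_t bit_pattern(int32_t value)
{
    return (uint32_t)value;
}

/* Write the immediate as 0x-prefixed, 8-digit lowercase hex into out. */
static void format_imm32(int32_t value, char *out, size_t out_size)
{
    snprintf(out, out_size, "0x%08x", (unsigned int)bit_pattern(value));
}
```

With a buffer of at least 11 bytes, format_imm32(-559038737, buf, sizeof buf) produces the text 0xdeadbeef, since 4294967296 - 559038737 = 3735928559 = 0xdeadbeef.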