Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.
1135 Discussions

Difference between _mm512_scalef_ps and _mm512_scalef_round_ps

simba611
Beginner
2,223 views

While trying to emulate _mm512_scalef_ps and _mm512_scalef_round_ps, I noticed that the pseudocode given for the two intrinsics in the Intel Intrinsics Guide is identical.
Could someone clarify what the difference between the two intrinsics is?

3 Replies
AdyT_Intel
Moderator
2,183 views

_mm512_scalef_ps rounds according to the rounding mode set in MXCSR.

_mm512_scalef_round_ps takes an additional rounding-control parameter and uses that rounding mode instead of the default one in MXCSR.
This is stated in its description below (taken from the Intel Intrinsics Guide).

Scale the packed single-precision (32-bit) floating-point elements in a using values from b, and store the results in dst.
Rounding is done according to the rounding[3:0] parameter, which can be one of:

(_MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC) // round to nearest, and suppress exceptions
(_MM_FROUND_TO_NEG_INF |_MM_FROUND_NO_EXC) // round down, and suppress exceptions
(_MM_FROUND_TO_POS_INF |_MM_FROUND_NO_EXC) // round up, and suppress exceptions
(_MM_FROUND_TO_ZERO |_MM_FROUND_NO_EXC) // truncate, and suppress exceptions
_MM_FROUND_CUR_DIRECTION // use MXCSR.RC; see _MM_SET_ROUNDING_MODE
nemequ
New Contributor I
2,171 views

Is the rounding you're talking about the FLOOR() function in the pseudo-code, or is it something in addition to the FLOOR()?

Assuming the former, instead of FLOOR(tmp_src2[31:0]) shouldn't the pseudo-code be ROUND(tmp_src2[i+31:i], rounding[3:0]) for scalef_round and something like ROUND(tmp_src2[i+31:i], MXCSR[2:0] | _MM_FROUND_NO_EXC) for scalef?

simba611
Beginner
2,074 views

I have since tried to emulate scalef_round by using _mm_setcsr to set the rounding mode before calling scalef.
This did not reproduce the scalef_round behaviour.
