Intel® oneAPI Math Kernel Library

Underflow exception when using vmsExp

Johannes_B_1
Beginner

The following code raises an exception even though I set the error mode to VML_ERRMODE_IGNORE. Is there a way to use high-accuracy (VML_HA) computations without raising underflow exceptions when computing exp(-88.)?

float testData[1] = {-88.77909088f};
vmsExp(1, testData, testData, VML_HA | VML_FTZDAZ_ON | VML_ERRMODE_IGNORE);


I am using Intel MKL 11.3.2 with Visual Studio 2013, statically linking mkl_intel_c.lib, mkl_core.lib, and mkl_sequential.lib into my project.

1 Solution
Jingwei_Z_Intel
Employee

I think the issue is related to the control mode set in the MXCSR register at runtime. By default it is set to 0x9fc0 on Windows (i.e. FTZ and DAZ on), which may have caused the hardware instructions used by vmsExp to raise the underflow exception unexpectedly. If you change the MXCSR value to 0x1f80, that should solve your problem. Here is my test program and its results:

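#include <mkl.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <immintrin.h>

/* Return the raw bit pattern of a float without breaking aliasing rules. */
static unsigned int float_bits(float f) {
    unsigned int u;
    memcpy(&u, &f, sizeof u);
    return u;
}

int main() {
    float testData[] = {-88.77909088f};
    printf("input  = %f = 0x%08x\n", testData[0], float_bits(testData[0]));
    errno = 0;
    printf("errno  = %d\n", errno);
    /* Uncomment to set MXCSR to 0x1f80: all exceptions masked, FTZ and DAZ off. */
    //_mm_setcsr(0x1f80);
    printf("mxcsr  = %x\n", _mm_getcsr());
    /* In-place call: the input array is also the output array. */
    vmsExp(1, testData, testData, VML_HA | VML_FTZDAZ_ON | VML_ERRMODE_IGNORE);
    printf("result = %f = 0x%08x\n", testData[0], float_bits(testData[0]));
    printf("errno  = %d\n", errno);
    printf("mxcsr  = %x\n", _mm_getcsr());
    return 0;
}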

Default result:

input  = -88.779091 = 0xc2b18ee5
errno  = 0
mxcsr  = 9fc0
result = 0.000000 = 0x00000000
errno  = 0
mxcsr  = 9ff1

Result with MXCSR set to 0x1f80:

input  = -88.779091 = 0xc2b18ee5
errno  = 0
mxcsr  = 1f80
result = 0.000000 = 0x00000000
errno  = 0
mxcsr  = 1f80

Please let me know if this helps.
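The MXCSR values printed above are easier to compare once they are split into their fields. The following is a minimal sketch, not part of MKL, that decodes a value using the documented MXCSR layout (bits 0-5 sticky exception flags, bit 6 DAZ, bits 7-12 exception masks, bit 15 FTZ). The names `describe_mxcsr`, `append_names`, and `MXCSR_EXC` are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Names of the six SSE exceptions, in MXCSR bit order (bit 0 first). */
static const char *const MXCSR_EXC[6] = {
    "invalid", "denormal", "divzero", "overflow", "underflow", "precision"
};

/* Hypothetical helper: append the names selected by a 6-bit field to buf. */
static void append_names(char *buf, size_t len, unsigned int bits) {
    for (int i = 0; i < 6; ++i) {
        if (bits & (1u << i)) {
            strncat(buf, " ", len - strlen(buf) - 1);
            strncat(buf, MXCSR_EXC[i], len - strlen(buf) - 1);
        }
    }
}

/* Hypothetical helper: describe an MXCSR value as text.
   Layout: bits 0-5 sticky flags, bit 6 DAZ, bits 7-12 masks, bit 15 FTZ. */
static void describe_mxcsr(unsigned int csr, char *buf, size_t len) {
    snprintf(buf, len, "FTZ=%u DAZ=%u raised:",
             (csr >> 15) & 1u, (csr >> 6) & 1u);
    append_names(buf, len, csr & 0x3fu);          /* flags that are set */
    strncat(buf, " unmasked:", len - strlen(buf) - 1);
    append_names(buf, len, ~(csr >> 7) & 0x3fu);  /* masks that are clear */
}
```

Calling `describe_mxcsr(0x9ff1, buf, sizeof buf)` on the value printed after the call yields the flags invalid, underflow, and precision with no exception unmasked, whereas 0x9fc0 and 0x1f80 have no flags set at all. So, going by the bit layout alone, the call set the invalid-operation flag in addition to the underflow flag.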

Johannes_B_1
Beginner

Thank you very much!

The solution works, although after a while I figured out that the FTZ and DAZ flags do not seem to have anything to do with it. Instead, I have to set the "invalid operation" mask bit to get the code working, using the following instruction:

_mm_setcsr (_mm_getcsr() | _MM_MASK_INVALID);

Also, in my case the default value of the MXCSR register seems to be 0x1920 rather than 0x9FC0, which may have something to do with the compiler optimization options.
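For anyone who prefers not to change MXCSR for the whole program: in 0x1920, bit 7 (the invalid-operation mask) is clear, so an invalid-operation condition traps instead of only setting a flag. Below is a minimal sketch of a save/mask/restore pattern around a single call. The helper names `mxcsr_mask_exceptions` and `mxcsr_restore` are hypothetical; only `_mm_getcsr`, `_mm_setcsr`, and the `_MM_MASK_*` constants come from `<xmmintrin.h>`. Whether masking the invalid-operation exception alone is sufficient in other setups is an assumption based on the observation in this post.

```c
#include <xmmintrin.h>

/* Hypothetical helper: mask the SSE exceptions selected by mask_bits
   (any combination of the _MM_MASK_* constants) and return the previous
   MXCSR value so the caller can restore it later. */
static unsigned int mxcsr_mask_exceptions(unsigned int mask_bits) {
    unsigned int saved = _mm_getcsr();
    /* Only touch the six mask bits (7-12); leave FTZ, DAZ and rounding alone. */
    _mm_setcsr(saved | (mask_bits & _MM_MASK_MASK));
    return saved;
}

/* Hypothetical helper: put back the MXCSR value returned above. */
static void mxcsr_restore(unsigned int saved) {
    _mm_setcsr(saved);
}

/* Usage around the call from this thread (requires MKL, not compiled here):
 *
 *   unsigned int saved = mxcsr_mask_exceptions(_MM_MASK_INVALID);
 *   vmsExp(1, testData, testData, VML_HA | VML_FTZDAZ_ON | VML_ERRMODE_IGNORE);
 *   mxcsr_restore(saved);
 */
```

Restoring the saved value keeps the rest of the program running with whatever exception settings the compiler or runtime chose, so code that relies on trapping invalid operations elsewhere is unaffected.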
