Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

DNN convolution does not preserve XMM registers as it should.

gustafsson__bengt

When running an MKL DNN operation previously created with dnnConvolutionCreateForwardBias_F32(), the XMM7 and XMM8 registers (and possibly more) are not preserved.

I have code similar to this (but of course much more elaborate) which does not work in release mode:

void RunConvolution() {
    dnnPrimitive_t handle;

    dnnConvolutionCreateForwardBias_F32(&handle, --- other pars ---);

    void* resources[dnnResourceNumber];
    --- fill in pointers to buffers, kernel and bias ---

    dnnExecute_F32(handle, resources);
}


int main()
{
    Timer timer;
    for (int o = 0; o < 10; o++) {
        for (int i = 0; i < 10; i++)
            RunConvolution();

        cout << timer.Elapsed() / 10.0 << endl;
    }
}
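
For anyone who wants to try the loop above without the rest of the application, here is a minimal sketch of the timing harness. The Timer class is not shown in the post, so the version below is a hypothetical stand-in built on std::chrono, and TimeRuns is an illustrative name. The work callback is where a function such as RunConvolution() would go; it is left as a parameter so the sketch compiles without MKL.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical stand-in for the Timer class used in the post (assumption:
// Elapsed() returns seconds since construction and never resets).
class Timer {
public:
    Timer() : start_(std::chrono::steady_clock::now()) {}

    double Elapsed() const {
        return std::chrono::duration<double>(
                   std::chrono::steady_clock::now() - start_).count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};

// Same loop shape as the post: `inner` calls per sample, `outer` samples.
// Each sample is the cumulative elapsed time divided by the constant 10.0,
// which is the value the compiler kept in XMM7 in the original report.
std::vector<double> TimeRuns(const std::function<void()>& work,
                             int outer = 10, int inner = 10) {
    assert(outer > 0 && inner > 0);
    std::vector<double> samples;
    Timer timer;
    for (int o = 0; o < outer; ++o) {
        for (int i = 0; i < inner; ++i)
            work();
        samples.push_back(timer.Elapsed() / 10.0);
    }
    return samples;
}
```

Because the timer is never reset, the samples should never decrease. Checking that property is one cheap way to notice when the printed numbers have become bogus.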

I am using the Visual Studio 2017 compiler, and I noticed that it keeps the constant 10.0 in XMM7 across both loops that call RunConvolution(). I tracked the contents of the register and found that it is clobbered by dnnExecute_F32(). That function is just a thin wrapper, so the actual problem is probably in some assembly code in the function it dispatches to when the handle refers to a structure created by dnnConvolutionCreateForwardBias_F32().

According to Microsoft documentation XMM6 and up are to be preserved over function calls, so the numbers printed are bogus. I can't see any other explanation for this than that the calling convention has been breached by the MKL DNN code.
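
To make the rule concrete, here is a small sketch of the register split in the Microsoft x64 calling convention: XMM0 through XMM5 are volatile (a callee may overwrite them), while XMM6 through XMM15 are nonvolatile (a callee must restore them before returning). The function name CalleeMustPreserveXmm is made up for illustration and is not part of any library.

```cpp
#include <cassert>

// Illustrative helper (hypothetical name): returns true if the given XMM
// register index is nonvolatile under the Microsoft x64 calling convention,
// i.e. a callee has to save and restore it if it uses it.
bool CalleeMustPreserveXmm(int index) {
    assert(index >= 0 && index <= 15);  // XMM0..XMM15 only
    return index >= 6;                  // XMM6..XMM15 are nonvolatile
}
```

By this rule, both XMM7 and XMM8 fall in the range a callee is required to preserve, which is why the compiler is entitled to keep 10.0 in XMM7 across the calls.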

I am using MKL 2018.1, 64-bit, statically linked with Intel threading (not TBB).

I have not tested any operations other than dnnConvolutionCreateForwardBias_F32.

gustafsson__bengt

dnnConversionExecute_F32() also does not preserve registers properly.

Ying_H_Intel
Moderator

Hi Gustafsson,

Thank you for raising the question. As I understand it, the original problem is that the code does not work in release mode, and the likely reason is that the XMM7-8 registers (and possibly more) are not preserved. For background, see https://stackoverflow.com/questions/262162/why-did-windows-64-choose-to-require-xmm6-and-xmm7-to-be-saved-restored

Could you please tell us what kind of CPU you are using? Does the code work on other machines?

Best Regards,

Ying

gustafsson__bengt

The issue is, as you presumed, that the library violates the register-saving convention that XMM6 and up have to be preserved, and that Microsoft compilers (both 2013 and 2017) rely on these registers actually being preserved.

I'm running an Intel i7-8700.

Ying_H_Intel
Moderator

Hi Gustafsson,

Yes, there is such a problem. We fixed it in Intel MKL 2018 Update 2 and above, which should be available soon. You may check the forum announcement.

Thank you,
Ying

gustafsson__bengt

I tested this with MKL 2018.2 and it fixed the problem.

Thanks for the good work!
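
For anyone who wants a guard in their own build, here is a minimal sketch of a version comparison against 2018 Update 2, the release named above as containing the fix. The helper name IsAtLeast2018Update2 is made up for illustration. It only compares numbers and says nothing about which older releases were affected, since this thread only covers 2018.1. The commented lines show how the inputs could be obtained from mkl_get_version, assuming mkl.h is available and that the MajorVersion field reports the release year.

```cpp
#include <cassert>

// Illustrative helper (hypothetical name): true if the (major, update) pair
// is MKL 2018 Update 2 or any later release.
bool IsAtLeast2018Update2(int major, int update) {
    assert(major >= 0 && update >= 0);
    if (major != 2018)
        return major > 2018;  // later major releases pass, earlier ones do not
    return update >= 2;       // within 2018, Update 2 or newer
}

// With MKL linked in, the inputs could come from the version query
// (assumption: MajorVersion holds the year for these releases):
//
//   #include <mkl.h>
//   MKLVersion v;
//   mkl_get_version(&v);
//   bool ok = IsAtLeast2018Update2(v.MajorVersion, v.UpdateVersion);
```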

Ying_H_Intel
Moderator

Hi Gustafsson,

Thank you a lot for the confirmation!

Thanks,
Ying