unstable dtrnlspbc_solve with 64bit (bug?)

mmillet · 08-01-2011

Hello,

I am using code very similar to the example at this link:

http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/mkl/refman/appendices/mkl_appC_osr_dtrnlspbc.html

The optimisation works fine in 32-bit, but in 64-bit the results become unstable. Using a debugger, I found that the results are identical up to the second call of this line:

if (dtrnlspbc_solve (&handle, fvec, fjac, &RCI_Request) != TR_SUCCESS)

At that point fvec and fjac are the same, but the output is unstable (the results are in the vector initialized in dtrnlspbc_init).

I have not been able to determine whether this is a bug in MKL or a problem with my configuration. Any ideas about the problem are welcome.

mmillet · 08-02-2011

My mistake. My optimisation function uses some static data, and it seems that the Jacobian matrix computation is multithreaded, which leads to instability. I do not understand why it works with the 32-bit DLL.
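For readers who hit the same problem, here is a minimal sketch of the pattern described above. It does not use MKL: `objective_static`, `objective_local`, and `jacobian_threaded` are hypothetical names, the objective is made up for illustration, and the one-thread-per-column Jacobian is only an assumption about how a threaded finite-difference Jacobian might call the user function. The point is that an objective with static scratch data is not safe to call concurrently, while the same arithmetic with local storage is.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical objective (not the poster's): f_i(x) = x_i * sum(x) - 1.
// UNSAFE under threads: 'scratch' is shared by every caller, so two
// concurrent calls can overwrite each other's intermediate value.
void objective_static(const Vec& x, Vec& f) {
    static double scratch;  // shared mutable state: data race if threaded
    scratch = 0.0;
    for (double v : x) scratch += v;
    for (std::size_t i = 0; i < x.size(); ++i) f[i] = x[i] * scratch - 1.0;
}

// SAFE: same arithmetic, but the intermediate lives on the caller's stack.
void objective_local(const Vec& x, Vec& f) {
    assert(x.size() == f.size());
    double sum = 0.0;
    for (double v : x) sum += v;
    for (std::size_t i = 0; i < x.size(); ++i) f[i] = x[i] * sum - 1.0;
}

// Forward-difference Jacobian, one thread per column, stored column-major:
// jac[j * n + i] approximates df_i/dx_j. Each thread writes only its own
// column and calls only the reentrant objective.
Vec jacobian_threaded(const Vec& x, double h) {
    const std::size_t n = x.size();
    Vec f0(n);
    objective_local(x, f0);
    Vec jac(n * n);
    std::vector<std::thread> workers;
    for (std::size_t j = 0; j < n; ++j) {
        workers.emplace_back([&, j] {
            Vec xp = x, fp(n);  // per-thread copies, nothing shared is written
            xp[j] += h;
            objective_local(xp, fp);
            for (std::size_t i = 0; i < n; ++i)
                jac[j * n + i] = (fp[i] - f0[i]) / h;
        });
    }
    for (auto& t : workers) t.join();
    return jac;
}
```

Swapping `objective_local` for `objective_static` inside the worker threads would reintroduce the race. Whether it shows up in a given build depends on timing, which may be why one build appears to work and another does not.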

Jon_H_ · 12-04-2013

I have an issue that is very similar to this one. We build on one CentOS (5.5) machine and deploy on another CentOS (6.4) machine. We are using the dtrnlspbc_solve routine to find a fit to image data. Our real program calls the "solve" routine thousands of times, and from run to run we get slightly different results.

I finally narrowed the issue down to a case where I call the "solve" routine twice, one call right after the other. The first time through the loop the answer converges to one point; on subsequent passes it converges to another. A few things of interest:

The build machine always converges to the same point (as do other CentOS machines in the 5.5 - 5.9 range).
On our deployment machine, the first time through the loop it converges to the build machine's answer; the second and subsequent times it converges to a different answer.
Unlike the previous poster, we do not have any static data in our optimization function.
I have tried linking statically just in case, but this has no effect (I was worried about the "erfc" function that we call).
We use g++ as the compiler.
I specifically disable OpenMP threading because my ultimate goal is to thread at a higher layer.
The machines that run fine seem to have Xeon processors; the machines that fail have i7-2600 processors.
I have tried calling MKL_FreeBuffers between calls, but this does not change the results (in our threaded application I cannot make this call until after processing).
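The back-to-back comparison described above can be sketched as a small harness. This is only an illustration under stated assumptions: it does not call MKL, and `same_bits_on_rerun`, `newton_sqrt2`, and `leaky_solver` are hypothetical names. The two stand-in solvers exist only to show what a reproducible routine and a routine with hidden state look like to the harness.

```cpp
#include <cassert>
#include <cstring>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Solver = std::function<void(Vec&)>;  // refines x in place

// Runs the solver twice from identical copies of x0 and reports whether
// the two answers match bit for bit.
bool same_bits_on_rerun(const Solver& solve, const Vec& x0) {
    Vec first = x0, second = x0;
    solve(first);
    solve(second);
    assert(first.size() == second.size());
    if (first.empty()) return true;
    return std::memcmp(first.data(), second.data(),
                       first.size() * sizeof(double)) == 0;
}

// Stand-in solver with no hidden state: Newton iterations for x^2 = 2.
void newton_sqrt2(Vec& x) {
    for (double& v : x)
        for (int k = 0; k < 20; ++k) v = 0.5 * (v + 2.0 / v);
}

// Stand-in solver with hidden state: the static counter survives between
// calls, so the second run takes a different step than the first.
void leaky_solver(Vec& x) {
    static int calls = 0;
    ++calls;
    for (double& v : x) v += 1.0 / calls;
}
```

A bitwise comparison is deliberately strict: it flags any difference at all, which suits the question of whether something carries over between calls. To judge whether a difference matters numerically, a tolerance-based comparison would be the better tool.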

What I would like to know is:

Has anyone seen something like this (converging to different results on different machines)?
Is there more information that I could provide to help narrow down the issue?
Is there any "state" associated with the "solve" routine that I need to clear out between calls?

Our code is proprietary, so I cannot post it to a public forum, but I would be willing to work with Intel directly.

Jon_H_ · 12-06-2013

Here is a bit more information about my problem.

I took one of our machines with a Xeon processor, which was running CentOS 5.5, and upgraded it to CentOS 6.5. The code ran fine on the Xeon before the upgrade and still runs fine after it. It appears to be an issue only when we compile on a Xeon processor and run that code on an i7-2600 processor.

We just upgraded to the latest release, composer_xe_2013_sp1, and the issues I was having went away. The Xeon and the i7-2600 processors are now each internally consistent. The results vary from processor to processor, but the difference is well within what we assume is round-off error.
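One plausible source of small processor-to-processor differences, not confirmed anywhere in this thread, is that floating-point addition is not associative, so a library that picks a different vectorized code path on a different CPU can add the same numbers in a different order. The sketch below illustrates only that effect. `sum_sequential` and `sum_two_lanes` are hypothetical names, and the two-accumulator loop is a stand-in for a SIMD reduction, not MKL's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the values strictly left to right, as a plain scalar loop would.
double sum_sequential(const std::vector<double>& v) {
    double total = 0.0;
    for (double x : v) total += x;
    return total;
}

// Adds even-indexed and odd-indexed values in separate accumulators and
// combines them at the end, mimicking a 2-wide vectorized reduction.
double sum_two_lanes(const std::vector<double>& v) {
    double lane0 = 0.0, lane1 = 0.0;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        lane0 += v[i];
        lane1 += v[i + 1];
    }
    if (i < v.size()) lane0 += v[i];  // leftover element when size is odd
    return lane0 + lane1;
}
```

With the input `{1e16, 1.0, -1e16, 1.0}` and standard IEEE double arithmetic (no fast-math flags), the sequential sum loses the first `1.0` when it is added to `1e16` and returns 1, while the two-lane sum cancels the large values first and returns 2. Real data rarely differs this dramatically, but the same mechanism produces last-digit differences that an iterative solver can amplify.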