We upgraded our MKL a couple of months ago from 11.3.3 and have been plagued by inconsistent results in some runs of our software. For most of the problems we analyze, results are good, but for a small subset the results change from run to run and are sometimes quite incorrect. We've localized the problem to the double-precision version of GESVD. It seems to occur in both the real and complex versions, and in both the threaded and sequential libraries. Single-precision SVDs seem fine. The inconsistencies seem related to the work vector: changing its size can cure the problem in some cases (or make it worse).
Attached is a test case with a test matrix and sample output from MKL 11.3.3 and 2018.1.163 on Linux. We've also confirmed the bug in MKL 2017.0.4 on Windows.
Any help is appreciated. We've devoted a lot of man hours in the past month or so tracking this problem down.
Regarding the incorrect double-complex result: the workspace size that MKL computes for ZGESVD in 2018 Update 1 is too small. This issue is fixed in MKL 2018 Update 2, which was released on March 14th. Can you try this update with your software?
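As a rough sanity check on workspace sizes, the sketch below compares a queried size against the minimum LWORK documented for reference LAPACK's DGESVD and ZGESVD. The function names `min_gesvd_lwork` and `checked_lwork` are hypothetical, and `queried_lwork` stands for the value returned in `work[0]` by a call with `lwork = -1`. Nothing in this thread says such a check would have caught the 2018 Update 1 issue, so treat it as a defensive measure rather than a fix.

```python
def min_gesvd_lwork(m, n, complex_valued):
    """Documented minimum LWORK for reference-LAPACK xGESVD, valid for all job options."""
    small, large = min(m, n), max(m, n)
    if complex_valued:
        # ZGESVD: LWORK >= max(1, 2*min(M,N) + max(M,N))
        return max(1, 2 * small + large)
    # DGESVD: LWORK >= max(1, 3*min(M,N) + max(M,N), 5*min(M,N))
    return max(1, 3 * small + large, 5 * small)


def checked_lwork(queried_lwork, m, n, complex_valued):
    """Return a workspace size that is never below the documented minimum.

    queried_lwork is assumed to come from a workspace query (lwork = -1).
    """
    return max(int(queried_lwork), min_gesvd_lwork(m, n, complex_valued))
```

A caller would allocate `checked_lwork(...)` elements for the work array instead of trusting the queried value alone.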
Regarding the inconsistent results: as you have noticed, a larger workspace gives different results because a different algorithm is selected. The reconstructed matrix is the same, but the factors U and V differ depending on the workspace size because the SVD is not unique (some signs are switched). This is to be expected.
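To illustrate the sign ambiguity described above, here is a minimal pure-Python sketch (no MKL involved). The helper names `matmul`, `reconstruct`, and `flip_pair` are made up for this example. It negates one column of U together with the matching row of V^T and shows that the product U * diag(s) * V^T is unchanged. For complex matrices the same holds for any unit-modulus phase factor, not only for -1.

```python
def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def reconstruct(u, s, vt):
    """Compute U * diag(s) * V^T."""
    us = [[u[i][j] * s[j] for j in range(len(s))] for i in range(len(u))]
    return matmul(us, vt)


def flip_pair(u, vt, k):
    """Negate column k of U and row k of V^T; the result is still a valid SVD."""
    u2 = [[-x if j == k else x for j, x in enumerate(row)] for row in u]
    vt2 = [[-x for x in row] if i == k else list(row)
           for i, row in enumerate(vt)]
    return u2, vt2


# One valid SVD of A = [[0, 2], [3, 0]], with singular values 3 and 2.
u = [[0.0, 1.0], [1.0, 0.0]]
s = [3.0, 2.0]
vt = [[1.0, 0.0], [0.0, 1.0]]

u_flipped, vt_flipped = flip_pair(u, vt, 0)
same_matrix = reconstruct(u, s, vt) == reconstruct(u_flipped, s, vt_flipped)
same_factors = (u == u_flipped)
```

Here `same_matrix` is true while `same_factors` is false, which is why comparing U or V element by element across runs can report differences even when both decompositions are correct. Comparing the reconstructed matrix, or normalizing signs first, avoids that.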
If you need the same results from run to run, you can set the environment variable MKL_CBWR, which guarantees reproducible results (provided that all parameters are exactly the same).
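One way to set the variable for a single run, sketched in Python: launch the program as a child process with MKL_CBWR added to its environment. The wrapper name `run_with_cbwr` is hypothetical, and the value `COMPATIBLE` is an assumption about which reproducibility mode you want; check the MKL documentation for the values supported by your version.

```python
import os
import subprocess
import sys


def run_with_cbwr(cmd, mode="COMPATIBLE"):
    """Run cmd with MKL_CBWR set in the child's environment only."""
    env = dict(os.environ)   # copy, so the parent environment is untouched
    env["MKL_CBWR"] = mode   # assumed mode; other values exist
    return subprocess.run(cmd, env=env, capture_output=True,
                          text=True, check=True)


# Stand-in child process that just echoes the variable it sees.
result = run_with_cbwr(
    [sys.executable, "-c", "import os; print(os.environ['MKL_CBWR'])"]
)
seen_mode = result.stdout.strip()
```

Replacing the stand-in command with the path to your own executable applies the setting from process start, before the program makes any MKL call.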
Thank you. Updating to 2018 Update 2 solved the problem above, and it also solved a pzgemm MPI wait error that had been plaguing us.
We can see no mention of the SVD bug fix in the 2018 Update 2 release notes, or we might have updated sooner. Does Intel provide a bug tracker or something similar that would let us see known bugs, or possible bugs that are being worked on? We spent a lot of time tracking down this issue and then trying to determine whether it was our code or an MKL issue. If we had known that the issue was known and that a fix was coming, we could have saved a lot of effort.
Good to hear that the latest version has resolved the issues.
We do not have a bug tracker for bugs that we are working on. The release notes and known limitations sections for each release are the correct places to look for bug fixes and issues. I will get this issue added to the correct section for Update 1/Update 2, and I will pass your feedback about communicating known bugs to customers on to the team.