Hello,
I think I have found a bug in one of the MKL (10.2.5.035) subroutines, ZGESVD. I am linking statically to the multithreaded version of MKL. ZGESVD gives wrong results when I use 32 threads but correct results when I use one thread. I have a simple non-OpenMP program that loads a matrix and carries out the SVD, and it reproduces the error consistently. The test program, makefile, and data file are all in the attachment.
The test program shows that the first SVD call produces the correct result. The second SVD call queries the optimal work size. The third call produces a wrong result. This may suggest that the work size is causing the problem. However, for other matrices, wrong results are produced even by the first SVD call.
When I use fewer threads, for example 16, 8, or 1, the result is correct for the attached matrix.
This test was run on a Linux node with 32 cores.
Any suggestions?
Thanks & Regards,
Xin
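A comparison like the one described in this post (one thread versus 8, 16, or 32) can be scripted. The following is a minimal sketch, assuming the test program prints its results to standard output and that the thread count is controlled through the MKL_NUM_THREADS environment variable; the function name compare_thread_counts and the binary name ./svd_test are hypothetical. It compares output text exactly, so harmless last-digit floating-point differences would also be reported as a mismatch.

```shell
# compare_thread_counts CMD N [N...]
# Runs CMD with MKL_NUM_THREADS=1 as a baseline, then once per listed
# thread count, and reports whether the text output is identical.
compare_thread_counts() {
    cmd=$1
    shift
    baseline=$(MKL_NUM_THREADS=1 sh -c "$cmd")
    for n in "$@"; do
        result=$(MKL_NUM_THREADS="$n" sh -c "$cmd")
        if [ "$result" = "$baseline" ]; then
            echo "$n threads: same output as 1 thread"
        else
            echo "$n threads: output differs from 1 thread"
        fi
    done
}

# Example (hypothetical binary name):
#   compare_thread_counts ./svd_test 8 16 32
```

If exact text comparison proves too strict, the test program could instead print a residual such as the norm of A - U*S*V^H and the script could compare that against a tolerance.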
Thanks for your reply!
I need our system administrator's assistance to try a different version of MKL; 10.2.5.035 is the latest one installed for now. They are planning to install 10.3 some time this week.
By the way, I downloaded an evaluation version of MKL 10.3 for Linux and tried to install it as a regular user (not root). But the installation seems to stall after the EULA appears and I type 'accept' and press Enter. Any ideas on this problem? I can try 10.3 right away if I can install it successfully.
Regards,
Xin
Hello Xin,
The installation issue you have described looks like an activation problem. The installer reads all Intel license keys registered on your system to determine your current activation level. If your system (or a shared location) holds a large number of licenses, this process can take some time.
Could you please start the installation once more and let it scan your system for a longer period, say 30-40 minutes? If it still freezes, please interrupt it and, if possible, send us the log files /tmp/*.issa*.log and /tmp/*.pset*.log (please pick the correct ones by sorting by modification time).
We will investigate the issue and get back to you with instructions.
As a temporary workaround, you could back up and then clean out the folders /opt/intel/licenses and $HOME/intel/licenses (please ask for root assistance if you do not have sufficient permissions) and restart the installation.
Waiting for your reply.
Thank you,
- Nikolay
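For the log files requested in this post, the most recently modified match for each pattern can be picked out with ls -t. This is a minimal sketch; the helper name newest_log is hypothetical, and it assumes the log file names contain no newlines.

```shell
# newest_log DIR PATTERN
# Prints the most recently modified file in DIR matching PATTERN,
# or nothing if there is no match.
newest_log() {
    dir=$1
    pattern=$2
    # ls -t sorts by modification time, newest first
    ls -t "$dir"/$pattern 2>/dev/null | head -n 1
}

# Example, using the patterns from the post:
#   newest_log /tmp '*.issa*.log'
#   newest_log /tmp '*.pset*.log'
```

The pattern is passed in single quotes so that it is expanded inside the function rather than by the calling shell.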
Thanks for your reply! I started the installation yesterday and it is still frozen at the same point. Since Gennady has tested with all the new versions of MKL, I am not going to test it again myself for now. Also, our system administrator is going to install the latest MKL, so I will just wait for that.
However, I would like to know what the problem is in case I need to install again. The log files are attached. There are a few similar log files, possibly from my other attempts.
Gennady, thanks for your info! I hope the fix is a simple one and comes soon.
Thanks all!
Xin
Hello Xin,
Thank you for the information.
The log file shows that the freeze occurs during the activation check, so the initial assumption was correct.
Could you please run these three commands on your system and send the output?
1) ldd "/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e/chklic"
2) ls -ls /share/apps/intel/ict/Compiler/11.1/072/licenses
3) export LD_LIBRARY_PATH=/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e; "/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e/chklic" -f"MKernL" -f"MKern" -p"i86_r" -p"i86_re" -p"it64_lr" -p"it64_re" -p"amd64_re" -c"/share/apps/intel/ict/Compiler/11.1/072/licenses"
Thank you very much for your time,
- Nikolay
Here is the output:
[xxu2@dlxlogin2 ~]$ ldd "/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e/chklic"
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003734a00000)
libm.so.6 => /lib64/libm.so.6 (0x0000003734200000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003740c00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003733e00000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003734600000)
/lib64/ld-linux-x86-64.so.2 (0x0000003733a00000)
[xxu2@dlxlogin2 ~]$ ls -ls /share/apps/intel/ict/Compiler/11.1/072/licenses
16 -rw-r--r-- 1 root root 551 Jul 21 2010 /share/apps/intel/ict/Compiler/11.1/072/licenses
[xxu2@dlxlogin2 ~]$ export LD_LIBRARY_PATH=/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e; "/home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e/chklic/32e/chklic" -f"MKernL" -f"MKern" -p"i86_r" -p"i86_re" -p"it64_lr" -p"it64_re" -p"amd64_re" -c"/share/apps/intel/ict/Compiler/11.1/072/licenses"
-bash: /home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/./pset/32e/../chklic/32e/chklic/32e/chklic: Not a directory
[xxu2@dlxlogin2 ~]$
Regards,
Xin
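In the transcript above, the third command was run with "chklic/32e" repeated in the executable path and with LD_LIBRARY_PATH pointing at the pset/32e directory rather than chklic/32e, which explains the "Not a directory" error. One way to avoid retyping the long path is to pass the directory once, as in the sketch below. The function name run_chklic is hypothetical; the chklic flags are copied unchanged from the requested command.

```shell
# run_chklic CHKLIC_DIR LICENSE_PATH
# Runs the chklic tool found in CHKLIC_DIR against LICENSE_PATH,
# with LD_LIBRARY_PATH set to CHKLIC_DIR for that one call only.
run_chklic() {
    chklic_dir=$1
    license_path=$2
    if [ ! -f "$chklic_dir/chklic" ] || [ ! -x "$chklic_dir/chklic" ]; then
        echo "chklic not found or not executable in: $chklic_dir" >&2
        return 1
    fi
    LD_LIBRARY_PATH=$chklic_dir "$chklic_dir/chklic" \
        -f"MKernL" -f"MKern" \
        -p"i86_r" -p"i86_re" -p"it64_lr" -p"it64_re" -p"amd64_re" \
        -c"$license_path"
}

# Example, with the paths from this thread:
#   run_chklic /home/xxu2/src/mfd/MFD3/MFD/Make/test/l_mkl_10.3.2.137_intel64/chklic/32e \
#       /share/apps/intel/ict/Compiler/11.1/072/licenses
```

The existence check fails early with a clear message if the directory is mistyped, instead of leaving the shell to report the error.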
I am wondering whether there is more information about this bug, which is related to the internal threading. My main concern is whether the same threading bug is present in other subroutines such as matrix multiply, QR, LU, and matrix inverse. My work relies on these libraries, and I do observe strange results when I use SVD with one thread and more threads for the others.
I would like to know whether I should avoid the threaded library for now, or what the known safe number of threads is for the MKL library.
Any information would be appreciated!
Regards,
Xin
Thanks for the info. I will try it as soon as I get one installed on our cluster. It would have been much easier if I could install an evaluation version in my own directory. Unfortunately, the installation issue I brought up in the last few messages has not been solved.
Regards
Xin
Unfortunately, we are still trying to figure out the root cause of the activation problem.
Did you try this workaround?
As a temporary workaround, you could back up and then clean out the folders /opt/intel/licenses and $HOME/intel/licenses (please ask for root assistance if you do not have sufficient permissions) and restart the installation.
If that is also unsuccessful, please try the following steps:
1) Go to
2) Invoke command: #> rpm -ivh --nodeps --ignorearch --prefix "location for installation" *.rpm
I'm monitoring this topic, so please contact me if you have any questions.
Thank you,
- Nikolay
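The backup-and-cleanup workaround quoted in this post could be scripted roughly as follows. This is only a sketch: the function name backup_and_clear and the timestamped ".backup." suffix are assumptions, not part of the installer. It moves the folder aside and recreates it empty, so restoring means moving the backup back.

```shell
# backup_and_clear DIR
# Moves DIR to a timestamped backup next to it, recreates DIR empty,
# and prints the backup location. Missing directories are skipped.
backup_and_clear() {
    dir=${1%/}    # drop a trailing slash so the backup lands beside DIR
    if [ ! -d "$dir" ]; then
        echo "skipping $dir: no such directory" >&2
        return 0
    fi
    backup="$dir.backup.$(date +%Y%m%d%H%M%S)"
    # Move the whole folder aside, then recreate it empty
    mv "$dir" "$backup" && mkdir "$dir" && echo "$backup"
}

# Example, using the folders named in the post:
#   backup_and_clear /opt/intel/licenses      # may require root
#   backup_and_clear "$HOME/intel/licenses"
```

Note that the recreated folder gets default ownership and permissions, which may differ from the original, so on a shared system it is worth checking them afterwards.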
Thanks for your reply!
I checked for the folder /opt/intel/..., but there is no intel folder under /opt. The folder $HOME/intel/licenses is empty.
I tried the second approach by invoking the command 'rpm ...'. I got the following message:
error: can't create transaction lock on /var/lib/rpm/__db.000
I guess it is a permission issue. I am sending your suggestions to our system administrator.
Thanks!
Xin
MKL 10.3 Update 3 produced correct results for the small test case that I posted here. I will run some more cases with it. I hope it works well!
Thanks very much!
Xin