I have a problem with VERY bad performance of dgemm. My actual computation involves matrices of size 65*65, but the problem seems to be present for any matrix dimension above roughly 15. With Intel Fortran Compiler 9.0, dgemm is roughly 5 times slower than it should be. E.g., it takes 10 times longer to multiply two symmetric 65*65 matrices using dgemm than using dsyr2k (which admittedly exploits the symmetry of the matrices). These results were obtained using the release version of the compiler output. dgemm is also slower than
I link to the relevant "imsl" routines (because I use other routines than dgemm from the IMSL libraries) using the following statement in the code:
"INCLUDE 'link_f90_static.h'"
This in turn means that I use the following libs: imsl.lib, imslscalar.lib, imslblas.lib, imsls_err.lib.
This performance problem occurs on all computers I have tried, single or multi-core. All of them are IA-32 machines running Windows XP.
Any suggestions on how to get dgemm to perform matrix multiplications faster than this would be highly appreciated. Clearly I must be doing something very wrong (in terms of linking, optimizer setting or such), because I have a very hard time believing that the Intel Fortran compiler implementation of dgemm can be this bad.
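For anyone who wants to reproduce the measurement, here is a minimal timing sketch. It assumes only the standard BLAS dgemm interface; the program name, the repetition count and the output format are made up for illustration, and the result depends entirely on which BLAS library ends up being linked.

```fortran
program time_dgemm
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 65          ! matrix order mentioned in the post
  integer, parameter :: nrep = 10000    ! hypothetical repetition count
  real(dp) :: a(n,n), b(n,n), c(n,n)
  real(dp) :: secs
  integer :: i, t0, t1, rate
  external dgemm

  call random_number(a)
  call random_number(b)
  c = 0.0_dp

  call system_clock(t0, rate)
  do i = 1, nrep
     ! C := 1.0*A*B + 0.0*C, no transposes, leading dimensions all n
     call dgemm('N', 'N', n, n, n, 1.0_dp, a, n, b, n, 0.0_dp, c, n)
  end do
  call system_clock(t1)

  secs = real(t1 - t0, dp) / real(rate, dp)
  print *, 'dgemm, seconds for all repetitions: ', secs
  ! Print a value derived from C so the loop cannot be optimized away
  print *, 'checksum: ', sum(c)
end program time_dgemm
```

Building the same source once against the IMSL libraries and once against MKL should show whether the difference comes from the library rather than from the compiler settings.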
The best implementation of DGEMM would be in Intel MKL, included with Intel Visual Fortran Compiler Professional Edition 10.1. With 9.0 you'd have to buy MKL separately.
The IMSL implementation of BLAS in IMSL 5 (which you have) is not well optimized - in IMSL 6 (IVF 10.0/10.1) it uses MKL and is much better.
Steve,
Sorry, I was a bit unclear. I mentioned that I use the IMSL files: imsl.lib, imslscalar.lib, imslblas.lib, imsls_err.lib. These were put in the directory "ProgramVNICTT6.0libIA32", most probably by the installation program.
If I instead ONLY link to the MKL files mkl_ia32.lib, mkl_c.lib and libguide.lib (by writing "!dec$objcomment lib:..." in the source file, slightly inspired by one of your earlier posts in this forum), I do get a high-performance dgemm routine. I just noticed this. It makes me happy. But my remaining problem is then how to:
i) use functions available in IMSL but not in MKL (dlftds, dlinrt, dlfdds, dtrmm, dtrmv)
AND, in the same program,
ii) use the high performance MKL routines for DGEMM, dsyr2k and dsymm
I tried to do this by putting both an "INCLUDE link_f90_static.h" statement and the above-mentioned "!dec$objcomment..." statement in my source file. This builds and runs, but it is no faster, so I assume that the MKL libs are simply disregarded.
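To make the attempt concrete, the combination described above would look roughly like the sketch below. The library names are the ones listed in this post and the directive is Intel Fortran's OBJCOMMENT; the program name and the placement of the lines are assumptions, and nothing here is claimed to make the linker prefer MKL's dgemm over the IMSL one.

```fortran
program mixed_link
  implicit none
  ! Ask the linker to search the MKL libraries named in the post.
  ! Assumption: the directives sit next to the IMSL include, as described above.
  !DEC$ OBJCOMMENT LIB:'mkl_c.lib'
  !DEC$ OBJCOMMENT LIB:'mkl_ia32.lib'
  !DEC$ OBJCOMMENT LIB:'libguide.lib'
  ! IMSL header that pulls in imsl.lib, imslscalar.lib, imslblas.lib, imsls_err.lib
  include 'link_f90_static.h'

  ! ... calls to dgemm, dsyr2k, dsymm and the IMSL-only routines go here ...
end program mixed_link
```

Since both imslblas.lib and MKL provide dgemm, which copy is used presumably depends on the order in which the linker searches the libraries.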
Regards,
Karl Walentin
Yes, you get IMSL as part of the compiler product, but it is not the compiler that is generating the bad code.
