Since I don't have IMSL 5 for Intel Fortran, I compiled LAPACK as my math library. I have Intel Fortran and CVF on the same computer, and when I checked the speed of BLAS1, Intel Fortran 8 turned out slower than CVF 6.6, by as much as 50% in MIPS.
In CVF I use optimize:3 (optimize:4 or beyond gives errors in verification; too much optimization?).
In IF8 I can't use /Qipo, because linking the console exe against the LAPACK library built with it fails. Why?
We can't read your mind, as to which CPU you are using, which options, which compiler versions, which tests you are using to compare.
Generally, the emphasis in ifort 8.0 is on optimization for P4 and later (/QxW, /QxN, /QxP). The loops of interest to you should vectorize. With recent versions of 8.0, which allocate arrays on 16-byte boundaries when you permit it, this should give good performance for loop lengths beyond 50 or so.
ipo isn't very relevant for BLAS, unless, possibly, you are testing with short arrays and want to in-line the BLAS functions into your test driver. In that case, there would have been no reason to use CVF /optimize:5, as that would optimize for longer loops.
I don't think anyone would buy IMSL for BLAS, with so many public BLAS versions available, and MKL optimized for IA processors.
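The point above about 16-byte alignment and loop length can be illustrated with a simplified model of how a vectorized loop over doubles is split up. This is only a sketch: the function name `split_loop` and the three-part split (scalar peel, vector body, scalar remainder) are assumptions made for illustration, not a description of what ifort 8.0 actually generates.

```python
def split_loop(start_address, n, elem_size=8, vec_bytes=16):
    """Model how n loop iterations divide into scalar and vector work.

    Assumes start_address is a multiple of elem_size and that the
    vector unit handles vec_bytes at a time (16 bytes = 2 doubles).
    Returns (peel, vector_iterations, remainder).
    """
    lanes = vec_bytes // elem_size
    misalignment = start_address % vec_bytes
    # Scalar iterations needed before the data reaches a 16-byte boundary.
    peel = 0 if misalignment == 0 else (vec_bytes - misalignment) // elem_size
    peel = min(peel, n)
    vector_iterations = (n - peel) // lanes
    remainder = (n - peel) % lanes
    return peel, vector_iterations, remainder


# An aligned array of 50 doubles runs entirely in the vector body;
# a misaligned one pays for scalar iterations at both ends.
print(split_loop(0, 50))
print(split_loop(8, 50))
```

In this model the scalar iterations are a fixed cost, so they matter less as the loop gets longer, which is consistent with the remark that the benefit shows up for longer loops.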
What a mean-mouthed comment, so typical of tcprince! Nobody is impressed, you don't help anyone, least of all yourself, but who's counting?
It looks like IMSL 5 gives you all of MKL, so don't waste anything, especially money, on the latter.
HTH,
Gerry T.
That's why I tried to compile the LAPACK library. I compiled the library with both CVF and IFORT to test the speed too.
My processor was an AMD, but I will run a benchmark on my office computer, which has a P4.
For CVF I used the same "make" file that I downloaded from netlib, but I edited the build option from df -optimize:2 to df -optimize:5 and set the processor option to Pentium3 to build a static library.
Since LAPACK provides test and benchmark programs that are built as exe files, I built the exes and ran the tests. The tests reported verification errors.
I rebuilt the static library and the test exes with optimize:4 and with optimize:3. The result: optimize:4 still gave errors, and optimize:3 gave none.
For IFORT I changed the "make" file to use ifort /QaxW /Qx /Qipo,
which resulted in an error when linking the library to create the exe file,
so I changed to ifort /QaxW /Qx.
That was successful, and when I ran the benchmark exe, the CVF-compiled one was almost 50% faster.
Questions:
For CVF: can too much optimization break the results?
For IFORT 8.0: why is the speed difference so big?
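On the CVF question, one possible explanation (not confirmed anywhere in this thread) is that a higher optimization level reorders floating-point operations, and a reordered sum can round differently. The sketch below only demonstrates that effect in double precision; the `scaled_error` helper is a hypothetical stand-in for the kind of ratio a verification test might compare against a threshold, not the actual LAPACK test code.

```python
import math

EPS = 2.0 ** -52  # double-precision machine epsilon


def scaled_error(computed, reference, n):
    """Error in units of n * EPS * |reference| (hypothetical test ratio)."""
    return abs(computed - reference) / (n * EPS * abs(reference))


values = [1.0e16, 1.0, -1.0e16, 1.0]

# Strict left-to-right summation, as written in the source code.
in_order = 0.0
for v in values:
    in_order += v

# The same terms regrouped, as an optimizer that reassociates might do.
regrouped = (values[0] + values[2]) + (values[1] + values[3])

# Correctly rounded reference sum.
exact = math.fsum(values)

print(in_order, regrouped, exact)
print(scaled_error(in_order, exact, len(values)))
```

The two orderings give different answers from identical inputs, so a test that expects one of them to a tight tolerance can fail after the compiler changes the evaluation order, even though neither build is "wrong" in the usual sense.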
Correction: /Qx should be /Ox.
CVF versions are 6.5 and 6.6B.
IFORT version is 8 Standard.
The additional options /architecture:pn3 /tune:pn3 were used for both the CVF and IFORT builds.
"We can't read your mind, as to which CPU you are using, which options, which compiler versions, which tests you are using to compare."
gft to Steve Lionel re tcprince:
Is he a representative of Intel? It's not clear, and in any event, Intel ought not to condone his overt contempt for forum users, a behavior you tacitly endorse. His contribution to the forum is of questionable value and his nonparticipation wouldn't be missed.
--
Gerry T.
I have tried ifort /G6 /Qprefetch, which gave DGEMM 1.4 Gflop and DGEV 0.5 Gflop. So no improvement.
Then I used GOTO BLAS for Pentium3, which gave DGEMM 1.6 Gflop and DGEV 1.4 Gflop,
compared to CVF, which gave DGEMM 1.37 Gflop and DGEV 1.2 Gflop.
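For readers who want to reproduce figures like these, the sketch below shows how a Gflop rate is usually derived from a timing, and how the rates quoted above compare to each other. The function names are made up for this example, and the 2*m*n*k operation count is the conventional one for a matrix-matrix multiply; the thread does not say how the benchmark program itself counts operations.

```python
def gemm_flops(m, n, k):
    """Conventional flop count for C = A*B with A (m x k) and B (k x n)."""
    return 2.0 * m * n * k


def gflops(flop_count, seconds):
    """Convert an operation count and elapsed time into Gflop/s."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return flop_count / seconds / 1e9


def speedup(rate_a, rate_b):
    """How many times faster rate_a is than rate_b."""
    return rate_a / rate_b


# Rates quoted in the post above (Gflop).
dgemm = {"ifort": 1.4, "goto": 1.6, "cvf": 1.37}
dgev = {"ifort": 0.5, "goto": 1.4, "cvf": 1.2}

print(speedup(dgemm["goto"], dgemm["cvf"]))
print(speedup(dgev["cvf"], dgev["ifort"]))
```

Read this way, the DGEMM rates are close across all three builds, while the second routine is where the ifort-compiled library falls well behind.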
