Hello all!
I have been using Intel Fortran for Linux version 7 until now. I have now compiled the very same source code with compiler version 8 on both Linux and Windows, and the program takes about 2.5 times longer to complete a calculation.
This seems a bit odd to me. I wonder if anyone else has observed similar behaviour and/or has an idea why this occurs?
Message Edited by baumeier@nwz.uni-muenster.de on 06-19-2004 01:11 AM
9 Replies
2.5 times longer? That seems rather unusual. Our tests show version 8 to be, on average, about 10% better than version 7. Of course, individual programs will vary. What switches are you using?
Hello Steve!
I simply use -O3 in all cases.
We have tested the various compiler versions with other programs and doing that we also saw that the programs run about 10-20% faster with the compiler version 8 - just as one would expect.
I wonder if there are specific procedures that are known to take longer than in the previous version. My program does not contain anything "spectacular" - mainly just some allocatable complex matrices that are being diagonalized.
Try /assume:buffered_io and see if it makes a difference. That's the one thing I can think of which could hit you. If your program doesn't do a lot of unformatted I/O, you can skip that.
You might also try /QxN if you're on a P4.
Oh, thought of one more thing. If you're running close to the edge on physical memory, you may be finding that 8.0 uses somewhat more virtual memory than 7.x did and you could be swapping. That can lead to significant increases in run times.
Message Edited by sblionel on 06-19-2004 11:23 AM
Hi!
I have tried that but there was no significant improvement. As the runtime seems to increase in a loop that uses some LAPACK routines I tried to recompile those but also with no effect.
Anyway, thanks for your effort.
Me again...
I have run a profile of the program and the result is interesting.
The program spends about 55% of its runtime in a routine called "cexp.J".
I'm not too certain where this comes from because it is not there in the profile for the Intel7 version.
It looks to me like cexp.J is the scalar double-precision complex exponential for SSE2. I would guess that it may be doing a more careful job with extreme arguments (modulus of imaginary part > PI?).
Well, I have profiled another program that uses complex exponential functions massively and the result is really interesting.
Here "exp.J" is also at the top of the list with almost 80% of the runtime, whereas in version 7 "z_exp" is listed with about 12%. However, the runtime itself is about 15% shorter with the compiler version 8.
So it almost looks like the increase of time spent in cexp.J is normal and is normally more than compensated by improvements in other areas. Something which doesn't seem to be the case with my program.
OK, I think I found the problem.
Comparing my program with the other program I talked about above, I realized that the argument of the exponential function I use is purely imaginary (exp(i*arg)) and not a general complex number.
I removed the call to exp and replaced it with DCMPLX(COS(arg),SIN(arg)), and the program now runs faster than the Intel 7 version. The same change does not seem to have any effect when compiling with Intel 7.
So it seems cexp.J has problems when the argument of the exponential function is purely imaginary.
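For anyone who wants to try this on their own setup, here is a minimal sketch of the comparison, not the original program. It relies on Euler's formula, exp(i*x) = cos(x) + i*sin(x), which holds for real x. The program name, the array size n, and the range of arg are made up for illustration, and the standard CMPLX(..., KIND=dp) stands in for the DCMPLX extension. Timings will depend on compiler version and switches, so no particular result is implied.

```fortran
program euler_vs_exp
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000000            ! illustrative problem size (assumption)
  complex(dp), parameter :: iu = (0.0_dp, 1.0_dp)
  real(dp), allocatable :: arg(:)
  complex(dp), allocatable :: z_exp(:), z_euler(:)
  real(dp) :: t0, t1, t2
  integer :: k

  allocate(arg(n), z_exp(n), z_euler(n))

  ! Purely real phases, so iu*arg(k) is purely imaginary (illustrative values).
  do k = 1, n
     arg(k) = 1.0e-3_dp * real(k, dp)
  end do

  ! Variant 1: generic complex exponential with a purely imaginary argument.
  call cpu_time(t0)
  do k = 1, n
     z_exp(k) = exp(iu * arg(k))
  end do
  call cpu_time(t1)

  ! Variant 2: Euler's formula, built from real COS and SIN.
  do k = 1, n
     z_euler(k) = cmplx(cos(arg(k)), sin(arg(k)), kind=dp)
  end do
  call cpu_time(t2)

  ! The two variants should agree to rounding error.
  print *, 'max abs difference:', maxval(abs(z_exp - z_euler))
  print *, 'exp(i*arg)   [s]:', t1 - t0
  print *, 'cos/sin form [s]:', t2 - t1

  deallocate(arg, z_exp, z_euler)
end program euler_vs_exp
```

Building it with the same switches as the real program (for example -O3) under both compiler versions should show whether the complex exp call is the slow part on a given machine.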
Please write this up and send to Intel Premier Support, with a sample program. I'm sure this will be of interest to our math library developers.