Hi,
I have realized that my program (which spends about 80% of its time in MKL/Pardiso) takes 1 min 50 s to run with the latest version of the compiler/MKL. After reinstalling the compiler/MKL version that was released in August and compiling with the same options, it takes 1 min 6 s!
Do you have a hint on what to check to understand what is going wrong?
François
5 Replies
"which spends about 80% of its time in MKL/Pardiso"
Did you measure the percentage with both versions? Can you state definitively whether it is the time spent inside MKL, the time in your own code, or both, that has increased?
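One way to answer that question is to bracket the solver call with wall-clock timers and compare the result with the total run time. The following is only a minimal sketch: `solve_placeholder` is a hypothetical stand-in for the actual Pardiso call, and the surrounding program structure is assumed, not taken from the original poster's code.

```fortran
program time_split
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8) :: t_start, t0, t1, t_end, rate
  real(dp) :: solver_time, total_time

  call system_clock(count_rate=rate)
  call system_clock(t_start)
  solver_time = 0.0_dp

  ! ... the application's own code (assembly, setup) would run here ...

  call system_clock(t0)
  call solve_placeholder()   ! hypothetical stand-in for the Pardiso call
  call system_clock(t1)
  solver_time = solver_time + real(t1 - t0, dp) / real(rate, dp)

  ! ... more of the application's own code ...

  call system_clock(t_end)
  total_time = real(t_end - t_start, dp) / real(rate, dp)

  print '(a,f10.3,a)', 'Total wall time:  ', total_time, ' s'
  print '(a,f10.3,a)', 'Time in solver:   ', solver_time, ' s'
  print '(a,f10.3,a)', 'Time in own code: ', total_time - solver_time, ' s'

contains

  subroutine solve_placeholder()
    ! Dummy workload; replace with the real solver call.
    integer :: i
    real(dp) :: s
    s = 0.0_dp
    do i = 1, 1000000
      s = s + sqrt(real(i, dp))
    end do
    if (s < 0.0_dp) print *, s   ! use the result so the loop is not removed
  end subroutine solve_placeholder

end program time_split
```

Running the same instrumented build with both compiler versions shows directly whether the increase is inside the solver or in the rest of the program.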
No, I did not check it on both versions. I will check and get back to you.
Hello,
I think I am getting closer to the problem. It seems to be an optimization problem. I have moved to a different computer, and here are the numbers with the new computer:
With composerxe-2011.4.19:
Total wall time: 82s
Wall time spent in Pardiso: 65.61s
With composer_xe_2011_sp1.9.293:
Total wall time: 400s
Wall time spent in Pardiso: 62.32s
We have Intel VTune Amplifier, which I am just beginning to use. With composer_xe_2011_sp1.9.293, I can clearly see that there is a hotspot at the call
pw = theCellState%pw(vk)
where the program seems to spend nearly all of its time. In a nutshell, theCellState is an object that represents an array of properties for each cell of our grid. Internally, this object contains a real array for every property of the cell, and %pw(vk) is just a getter.
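For readers trying to picture the pattern, here is a minimal sketch of such an object. Everything except the names `theCellState`, `pw` and `vk` is an assumption made for illustration: the type name `cell_state_t`, the component `pw_values`, the use of a type-bound function, and double precision reals are all hypothetical and may differ from the real code.

```fortran
module cell_state_mod
  implicit none
  private
  public :: cell_state_t, dp

  integer, parameter :: dp = kind(1.0d0)

  ! Hypothetical layout: one real array per cell property.
  type :: cell_state_t
    real(dp), allocatable :: pw_values(:)
  contains
    procedure :: pw => get_pw   ! getter bound to the type
  end type cell_state_t

contains

  ! Getter: returns property pw for the cell with index vk.
  pure function get_pw(self, vk) result(val)
    class(cell_state_t), intent(in) :: self
    integer, intent(in) :: vk
    real(dp) :: val
    val = self%pw_values(vk)
  end function get_pw

end module cell_state_mod

program demo
  use cell_state_mod
  implicit none
  type(cell_state_t) :: theCellState
  real(dp) :: pw
  integer :: vk

  allocate(theCellState%pw_values(4))
  theCellState%pw_values = [1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp]

  vk = 3
  pw = theCellState%pw(vk)   ! same call shape as the reported hotspot
  print *, pw
end program demo
```

A getter this small is a natural inlining candidate, which is why a change in inlining behaviour between compiler versions is a plausible first suspect.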
I suspect an inlining problem, but I can't find a way to be sure of that. Is there an "inline report" option in the compiler?
François
-opt-report includes an inlining report. If you need to see in more detail where a performance problem occurs, it may be useful to run a VTune comparison of builds with and without inlining (e.g. -fno-inline-functions).
Hi,
It is definitely not an inlining problem. When I ask the compiler not to inline with version 2011.4.191, I get a total wall time of 92s, still far better than the 400s of the sp1.9.293 version.
I don't know what to do. The hotspot in the code built with the sp1.9.293 version is inside an OpenMP do loop. And if you look at the assembly code, it is full of "movq" instructions (sometimes more than 30 in a row) which do not appear in the code compiled with 2011.4.191.
Any suggestion on what to try?
François