Hi everyone,
We have some legacy F77 code that contains many math routines, such as matrix and/or vector multiplication, vector copies, and vector and matrix initialization. All of this F77 code is hand-optimized (for example, with loop unrolling).
From the optimization report, I can see that these functions are inlined and that their operations are all vectorized (estimated potential speedup: about 1.6).
However, if I replace these F77 function calls with F90 code,
for example (a matrix-vector multiply here)
c(:) = matmul(a(:,:),b(:))
I save about 50% of the time spent in these matrix and vector operations.
Does this mean there is still overhead even though these functions are inlined?
Could anyone give me an explanation and suggestions on how to optimize this code? Thank you in advance!
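For readers who want to reproduce the comparison described above, here is a minimal sketch, not the original poster's code. The program name, array size `n`, repeat count `nrep`, and the explicit loop nest standing in for the legacy F77 routine are all assumptions made for illustration. It times a hand-written matrix-vector product against `matmul` and checks that both give the same result within rounding.

```fortran
program matvec_compare
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  ! Hypothetical problem size and repeat count, chosen only for illustration.
  integer, parameter :: n = 500, nrep = 200
  real(real64), allocatable :: a(:,:), b(:), c_loop(:), c_mm(:)
  integer :: i, j, rep
  integer(int64) :: t0, t1, rate

  allocate(a(n,n), b(n), c_loop(n), c_mm(n))
  call random_number(a)
  call random_number(b)

  ! F77-style explicit loops (stand-in for the legacy routine).
  call system_clock(t0, rate)
  do rep = 1, nrep
     b(1) = real(rep, real64)   ! change the input so the work is not hoisted
     c_loop = 0.0_real64
     do j = 1, n                ! column-major order: inner loop runs over rows
        do i = 1, n
           c_loop(i) = c_loop(i) + a(i,j) * b(j)
        end do
     end do
  end do
  call system_clock(t1)
  print '(a,f10.6,a)', 'explicit loops: ', &
       real(t1 - t0, real64) / real(rate, real64), ' s'

  ! F90 intrinsic version of the same operation.
  call system_clock(t0)
  do rep = 1, nrep
     b(1) = real(rep, real64)
     c_mm = matmul(a, b)
  end do
  call system_clock(t1)
  print '(a,f10.6,a)', 'matmul:         ', &
       real(t1 - t0, real64) / real(rate, real64), ' s'

  ! Both versions should agree up to floating-point rounding.
  if (maxval(abs(c_loop - c_mm)) > 1.0e-10_real64 * maxval(abs(c_loop))) then
     error stop 'results differ'
  end if
  print *, 'results agree'
end program matvec_compare
```

The timings depend heavily on compiler, optimization flags, and matrix size, so the ratio printed here may not match the roughly 50% saving reported above.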
Inlining saves the function call overhead, including passing arguments on the stack and/or in registers and any saving and restoring of registers on the stack. For a function such as matrix multiply, you will be comparing the implementation of your F77/F90 code against the code called by the newer compiler (principally Intel's MKL). For anything other than small matrices, MKL will likely be much faster than anything you can write.
By the way, the MKL call is not inlined.
Jim Dempsey
Thanks for your reply, Jim.
So matmul uses MKL.
Is the following code (a vector multiplication) also computed using MKL?
c(i) = sum((a(:,i) * b(:)))
Thanks,
GZ
Don't know. You can find out by running VTune on the Release version (with debug symbols).
You should be able to see MKL references (assuming the compute load in MKL is large enough to get sampled by VTune).
Bottom-Up should be able to show the call stack.
Jim Dempsey
Besides what Jim said, you could use nm to see whether you have linked MKL. For most purposes, sum(a*b) should be equivalent to dot_product(a,b), but it's not obvious what the requirements for automatic MKL substitution might be. Since you are able to change the source, I think writing MATMUL explicitly and using the opt_matmul option of ifort (included in -O3) (gfortran has an equivalent) would be best.
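As a small illustration of the equivalence mentioned above, the following sketch computes the same column-by-vector products with `sum(a(:,i) * b(:))` and with the standard `dot_product` intrinsic, then checks that they agree within rounding. The program name and the sizes `n` and `m` are hypothetical. It says nothing about whether either form is dispatched to MKL; that still has to be confirmed with a profiler or the disassembly.

```fortran
program dot_compare
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  ! Hypothetical sizes, chosen only for illustration.
  integer, parameter :: n = 1000, m = 8
  real(real64) :: a(n,m), b(n), c_sum(m), c_dot(m)
  integer :: i

  call random_number(a)
  call random_number(b)

  do i = 1, m
     c_sum(i) = sum(a(:,i) * b(:))        ! form used in the question
     c_dot(i) = dot_product(a(:,i), b)    ! standard intrinsic
  end do

  ! The two forms may differ in the last bits because of summation order.
  if (maxval(abs(c_sum - c_dot)) > 1.0e-10_real64 * maxval(abs(c_sum))) then
     error stop 'sum(a*b) and dot_product disagree'
  end if
  print *, 'max abs difference: ', maxval(abs(c_sum - c_dot))
end program dot_compare
```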
TimP,
The link dependency on MKL only indicates that MKL is linked into the application. It does not indicate whether
c(i) = sum((a(:,i) * b(:)))
calls MKL.
VTune is one way to get this information (as indicated in #4). Another way is to set a debug breakpoint at the statement (which may be difficult with full optimizations) and then use the Disassembly window.
Jim Dempsey
As far as I know, MATMUL is the only thing for which the compiler calls into MKL on its own (when certain optimizations are enabled). But I'll admit that my knowledge here is a bit stale.
Thank you all for the detailed information!
GZ