I have written code that multiplies 4 vectors of length 2000 by 4 square matrices of the same size and then adds the results together. The whole procedure is repeated 31 times (see the code below). The problem is that these simple matrix calculations take more than 20 seconds, while I expected them to be much faster. Does anyone know why it is so slow, or how I can make it faster?
REAL*8, ALLOCATABLE, DIMENSION(:,:) :: dummy, dummy2
REAL*8, ALLOCATABLE, DIMENSION(:,:,:) :: TPLTZ
integer Nt, NN, ic
Nt = 2000
NN = 31
ALLOCATE( dummy(4,Nt) )
dummy = 0d0
ALLOCATE( dummy2(NN,Nt) )
dummy2 = 0d0
ALLOCATE( TPLTZ(Nt,Nt,4) )
TPLTZ = 0d0
do ic=1,NN
dummy2(ic,:) = matmul(dummy(1,:),TPLTZ(:,:,1)) + matmul(dummy(2,:),TPLTZ(:,:,2)) &
             + matmul(dummy(3,:),TPLTZ(:,:,3)) + matmul(dummy(4,:),TPLTZ(:,:,4))
end do
Are you running this code with the default project settings?
If so, you are using debugging options that slow down execution.
On my computer your code runs about 4 times faster when I switch to the Release configuration:
Build > Configuration Manager > Active solution configuration > Release.
You can also add a configuration drop-down list to the toolbar.
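To compare the Debug and Release builds on your own machine, a small timing harness around the loop may help. This is only a sketch: the program name, the random fill (the original post uses zeros) and the checksum print are my additions, not part of the original code.

```fortran
program time_matmul
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: Nt = 2000, NN = 31
  real(dp), allocatable :: dummy(:,:), dummy2(:,:), TPLTZ(:,:,:)
  integer :: ic
  integer(int64) :: t0, t1, rate

  allocate(dummy(4,Nt), dummy2(NN,Nt), TPLTZ(Nt,Nt,4))
  ! Assumption: random data instead of zeros, so the work is not trivial.
  call random_number(dummy)
  call random_number(TPLTZ)
  dummy2 = 0d0

  call system_clock(t0, rate)
  do ic = 1, NN
     dummy2(ic,:) = matmul(dummy(1,:),TPLTZ(:,:,1)) + matmul(dummy(2,:),TPLTZ(:,:,2)) &
                  + matmul(dummy(3,:),TPLTZ(:,:,3)) + matmul(dummy(4,:),TPLTZ(:,:,4))
  end do
  call system_clock(t1)

  print '(a,f8.3,a)', 'elapsed: ', real(t1 - t0, dp) / real(rate, dp), ' s'
  ! Printing a value derived from the result keeps the optimizer from
  ! discarding the loop as dead code.
  print *, 'checksum: ', sum(dummy2)
end program time_matmul
```

Build it once in each configuration and compare the elapsed times.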
You could do two things:
Reverse the index order of dummy and dummy2 so that the multiplication uses vectors of stride one.
Use the MKL library routines to do the multiplication (DGEMM, I think).
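The index reversal could look roughly like the sketch below. Fortran stores arrays column by column, so dummy(:,k) and dummy2(:,ic) are contiguous, whereas dummy(k,:) and dummy2(ic,:) in the original code are strided slices. The program name, the random fill and the inner loop over k are my own choices for illustration.

```fortran
program stride_one
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: Nt = 2000, NN = 31
  real(dp), allocatable :: dummy(:,:), dummy2(:,:), TPLTZ(:,:,:)
  integer :: ic, k

  ! Index order reversed relative to the original post:
  ! dummy(4,Nt) -> dummy(Nt,4), dummy2(NN,Nt) -> dummy2(Nt,NN).
  allocate(dummy(Nt,4), dummy2(Nt,NN), TPLTZ(Nt,Nt,4))
  call random_number(dummy)   ! assumption: random test data
  call random_number(TPLTZ)

  do ic = 1, NN
     dummy2(:,ic) = 0d0
     do k = 1, 4
        ! dummy(:,k) and dummy2(:,ic) are stride-one column slices.
        dummy2(:,ic) = dummy2(:,ic) + matmul(dummy(:,k), TPLTZ(:,:,k))
     end do
  end do

  print *, 'checksum: ', sum(dummy2)
end program stride_one
```

The rest of the program would need the same index swap wherever dummy and dummy2 are used.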
DGEMV should improve speed by avoiding some of the temporary vector allocations for each intermediate result.
Are you asking for the compiler to shortcut the meaningless operations? Perhaps it might do some of that at -O3.
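A DGEMV version might look like the sketch below. It assumes the reversed index order (dummy(Nt,4), dummy2(Nt,NN)) and that a BLAS library such as MKL is linked. Since matmul(v, A) for a vector v equals transpose(A) times v, DGEMV is called with 'T'; passing beta = 1 makes each call add into the same output column, so no temporary vectors are needed for the four partial results. The program name, the random fill and the final comparison against matmul are my additions.

```fortran
program dgemv_accumulate
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: Nt = 2000, NN = 31
  real(dp), allocatable :: dummy(:,:), dummy2(:,:), TPLTZ(:,:,:), ref(:)
  integer :: ic, k
  external :: dgemv   ! BLAS level-2 routine, provided by MKL or another BLAS

  allocate(dummy(Nt,4), dummy2(Nt,NN), TPLTZ(Nt,Nt,4), ref(Nt))
  call random_number(dummy)   ! assumption: random test data
  call random_number(TPLTZ)

  do ic = 1, NN
     dummy2(:,ic) = 0d0
     do k = 1, 4
        ! y := 1*transpose(A)*x + 1*y, accumulating into column ic
        call dgemv('T', Nt, Nt, 1d0, TPLTZ(:,:,k), Nt, dummy(:,k), 1, &
                   1d0, dummy2(:,ic), 1)
     end do
  end do

  ! Cross-check one column against the matmul formulation.
  ref = 0d0
  do k = 1, 4
     ref = ref + matmul(dummy(:,k), TPLTZ(:,:,k))
  end do
  print *, 'max abs difference vs matmul: ', maxval(abs(dummy2(:,NN) - ref))
end program dgemv_accumulate
```

The difference printed at the end should be at rounding-error level; the program will not link unless a BLAS implementation is available.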
You are right. I switched to the Release configuration as you suggested, and it is now faster by the same factor. Thanks, that was amazing. By the way, I get the following warning message right before the program runs:
________________________________________________________________________________________
No debugging information
Debugging information for 'code_name.exe' cannot be found or does not match. Binary was not built with debug information.
Do you want to continue debugging?
________________________________________________________________________________________
I just click Yes to get rid of it, although I do not really know the consequences. Could you please tell me whether it is OK to do so, or whether I should do something to make sure I do not get wrong results because of debugging problems?
When you hit F5, it means "start debugging the exe". A Release exe cannot be debugged since it is built for speed and has no debug info.
Start your exe with Ctrl+F5 and it won't complain!
Or you can set full optimizations in the Debug build. Debugging is more difficult, but you can still do some debugging.
Or you can leave full debugging on and, in the project browse tree, select just the file with your matrix multiplication, right-click, choose Properties, set full optimizations, and build. Now only this module runs at full speed.
Jim Dempsey