I've just converted from Absoft to Intel as my FORTRAN compiler. The only thing I don't like is that it takes several minutes to re-compile. Specifically, the compiling stage goes very quickly, but after the Output says "Linking...", it takes several minutes. I'm using the MKL library for FFTs and two small libraries that I wrote, one in FORTRAN, the other in C. Does anyone know why the linking takes so long and if there's a way to speed up the link process? The Absoft compiler took just a few seconds for the same task. Thanks.
36 Replies
Do I know what the problem is? No. Do I know where it is? I have a better idea.
Some explanation - xilink is the Intel "prelinker" that is primarily used when you have asked for "whole program interprocedural optimization" (/Qipo). It is looking for Intel intermediate language in objects, combining them and then running the optimizer. After doing that, it calls the linker.
Even though you are not using IPO, xilink still needs to read all of the objects and libraries to see if there is IP code in there, and I suspect it is this that is causing the slowness. Was there some particular object or library which, when you added it to the list, caused the slowdown? I don't remember offhand.
Quoting - Steve Lionel (Intel)
Do I know what the problem is? No. Do I know where it is? I have a better idea.
Some explanation - xilink is the Intel "prelinker" that is primarily used when you have asked for "whole program interprocedural optimization" (/Qipo). It is looking for Intel intermediate language in objects, combining them and then running the optimizer. After doing that, it calls the linker.
Even though you are not using IPO, xilink still needs to read all of the objects and libraries to see if there is IP code in there, and I suspect it is this that is causing the slowness. Was there some particular object or library which, when you added it to the list, caused the slowdown? I don't remember offhand.
BTW, I'll be out of town for a week, so I won't be able to do any more detective work after today.
Ok. When you do find the culprit, if you can attach a ZIP of it here that would be helpful.
Quoting - Steve Lionel (Intel)
Ok. When you do find the culprit, if you can attach a ZIP of it here that would be helpful.
The culprit seems to be in mkl_dfti, specifically DftiCreateDescriptor. Here is my code, with some comment lines saying what happens to me when I do certain things.
Evan
program Console1
   implicit none
   integer :: nfft=16
   real :: pser(16)
   complex :: spec1(9)
   call r2c_fwd_fft(pser,4,spec1)
   ! If above call is commented out, the program links quickly. If present, linking takes several minutes
   print *, 'Hello World'
end program Console1

subroutine r2c_fwd_fft(rts,nord,cspec)
   use MKL_DFTI
   ! If above line is commented out, I get:
   ! Error 1 error #6457: This derived type name has not been declared. &
   ! [DFTI_DESCRIPTOR] E:EvanifortConsole1Console1.f90 48
   ! I copied mkl_dfti.f90 from "D:Program FilesIntelCompiler11.148mklinclude" to the source directory for this
   ! project
   ! If I don't have the Fortran => Libraries => Use Intel Math Kernel Library set to "Sequential...", then I get:
   ! Error 1 error LNK2019: unresolved external symbol _dfti_create_descriptor_1d &
   ! referenced in function _R2C_FWD_FFT. Console1.obj
   implicit none
   real :: rts(:)
   integer :: nord,nfft
   complex :: cspec(:)
   type(DFTI_DESCRIPTOR), POINTER :: My_Desc1_Handle
   Integer :: Status(6)=0
   nfft=2**nord
   Status(1) = DftiCreateDescriptor(My_Desc1_Handle, DFTI_SINGLE, DFTI_REAL, 1, nfft)
   ! This is the line that causes the linker to be very slow. Without the above line, but with
   ! the following 5 lines, the linker is fast.
   !Status(2) = DftiSetValue( My_Desc1_Handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE)
   !Status(3) = DftiSetValue(My_Desc1_Handle,DFTI_CONJUGATE_EVEN_STORAGE,DFTI_COMPLEX_COMPLEX) ! Default, optional
   !Status(4) = DftiCommitDescriptor(My_Desc1_Handle)
   !Status(5) = DftiComputeForward(My_Desc1_Handle, rts(1:nfft), cspec(1:nfft/2+1))
   !Status(6) = DftiFreeDescriptor(My_Desc1_Handle)
   if (sum(Status) /= 0) then
      print*,'r2c_fwd_fft Warning: sum(Status(1:6)) /= 0 from MKL'
      print*,' Status(1:6) = ',Status(1:6)
   endif
   return
end subroutine
Thanks - I can see the behavior you describe and will try to figure out what is going wrong.
A comment - your test program is incorrect because an explicit interface to r2c_fwd_fft needs to be visible to the caller. However, it does allow the problem to be seen - I imagine you cut this down from a larger program.
Quoting - Steve Lionel (Intel)
Thanks - I can see the behavior you describe and will try to figure out what is going wrong.
A comment - your test program is incorrect because an explicit interface to r2c_fwd_fft needs to be visible to the caller. However, it does allow the problem to be seen - I imagine you cut this down from a larger program.
Hope you find a solution. It seems like this would have come up for others.
About your second comment, I read up on explicit interface. Would making the r2c_fwd_fft subroutine part of a module make the code correct? When you said "incorrect", did you mean it wouldn't work or that it was not good coding practice? I'm not as familiar with creating interfaces as with using modules, so I could use a reference to a tutorial on that. Thanks.
Also, I read about the flag /check:arg_temp_created that you referenced in one of your articles on argument passing. Is this available as a Fortran option? I couldn't find it under Properties for the project. If it's not there, how do you put it into the compile statement? When I started converting to the use of modules, I started to get compiler complaints about passing an array starting location, like x(1,1,jsub,jfreq) into a subroutine that was expecting an array. I've "fixed" it by changing the actual argument to x(:,:,jsub,jfreq), which compiles and works. But I'm hoping it is not causing a copy of the data to be made.
Evan
By "incorrect" I mean that the program won't execute properly, because the compiler does not know that it needs to pass a descriptor for the assumed-shape array. Yes, putting the routine in a module would be a fine solution.
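For anyone following along, here is a minimal sketch of the module approach mentioned above. It is not the original r2c_fwd_fft routine and does not use MKL; the names fft_helpers and scale_series are hypothetical, chosen only to show how a module gives the caller an explicit interface for an assumed-shape dummy argument.

```fortran
! Sketch only: fft_helpers and scale_series are hypothetical names.
module fft_helpers
   implicit none
contains
   subroutine scale_series(rts, factor)
      ! rts is assumed-shape, so callers need an explicit interface.
      ! Being a module procedure provides that interface automatically.
      real, intent(inout) :: rts(:)
      real, intent(in)    :: factor
      rts = rts * factor
   end subroutine scale_series
end module fft_helpers

program demo_explicit_interface
   use fft_helpers            ! makes the interface visible to the caller
   implicit none
   real :: pser(16)
   pser = 1.0
   call scale_series(pser, 2.0)
   print *, 'size =', size(pser), ' first element =', pser(1)
end program demo_explicit_interface
```

With the USE statement in place, the compiler knows to pass an array descriptor and can also check the argument types and ranks at compile time.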
/check:arg_temp_created is under the "Runtime" property page - warn when array argument uses temporary storage, or something like that.
It would be ok to pass an array element to an array unless the array you are passing is a POINTER array. If it is ALLOCATABLE, that's ok.
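The argument-passing cases discussed above might be sketched as follows. This is an illustration under assumptions, not code from the original project: slab_ops, sum_assumed, and sum_explicit are hypothetical names, and whether a temporary is actually created depends on the compiler. The run-time check mentioned above is one way to find out for a given build.

```fortran
! Sketch only: module and procedure names are hypothetical.
module slab_ops
   implicit none
contains
   subroutine sum_assumed(a, total)
      real, intent(in)  :: a(:,:)     ! assumed-shape: receives a descriptor
      real, intent(out) :: total
      total = sum(a)
   end subroutine sum_assumed

   subroutine sum_explicit(a, n, total)
      integer, intent(in) :: n
      real, intent(in)    :: a(n)     ! explicit-shape: expects contiguous storage
      real, intent(out)   :: total
      total = sum(a)
   end subroutine sum_explicit
end module slab_ops

program demo_array_args
   use slab_ops
   implicit none
   real, allocatable :: x(:,:,:,:)
   real :: t1, t2, t3
   allocate(x(4,3,2,2))
   x = 1.0
   ! Section over the two leading dimensions: contiguous in memory, and the
   ! assumed-shape dummy takes a descriptor, so no copy should be needed.
   call sum_assumed(x(:,:,2,1), t1)
   ! Array element passed to an explicit-shape dummy (sequence association).
   ! Permitted here because x is ALLOCATABLE rather than POINTER.
   call sum_explicit(x(1,1,2,1), 12, t2)
   ! Strided section passed to an explicit-shape dummy: a compiler will
   ! typically build a contiguous temporary copy for this call.
   call sum_explicit(x(1,:,2,1), 3, t3)
   print *, t1, t2, t3
end program demo_array_args
```

The first call mirrors the x(:,:,jsub,jfreq) form from the question; the third call is the kind of case the temporary-storage warning is meant to catch.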
Quoting - Steve Lionel (Intel)
By "incorrect" I mean that the program won't execute properly, because the compiler does not know that it needs to pass a descriptor for the assumed-shape array. Yes, putting the routine in a module would be a fine solution.
/check:arg_temp_created is under the "Runtime" property page - warn when array argument uses temporary storage, or something like that.
It would be ok to pass an array element to an array unless the array you are passing is a POINTER array. If it is ALLOCATABLE, that's ok.
I hope you haven't given up on this. I'm still suffering through waiting two minutes every time I recompile. Thanks.
Evan
I haven't given up but I did need to put it aside. What I did find is that the "slow" version pulls in a lot more code than the "fast" version of your code. (I didn't see any difference as to whether your contained function was called or not.) What's odd is that trying to reproduce this in a smaller test case doesn't show a difference, so I'm not yet sure what specifically is the trigger.
I'll eventually need to involve the MKL folks, so you may want to skip the middleman and report this in the MKL section - they may be able to help you sooner.
Quoting - Steve Lionel (Intel)
I haven't given up but I did need to put it aside. What I did find is that the "slow" version pulls in a lot more code than the "fast" version of your code. (I didn't see any difference as to whether your contained function was called or not.) What's odd is that trying to reproduce this in a smaller test case doesn't show a difference, so I'm not yet sure what specifically is the trigger.
I'll eventually need to involve the MKL folks, so you may want to skip the middleman and report this in the MKL section - they may be able to help you sooner.
I owe you an apology and a beer.
Back when it was suggested I turn off IPO, I said that I checked it and that it was already off. Well, apparently, I found the IPO setting under the Fortran tab and verified that it was off. But after starting the topic in the MKL Forum and getting the same advice, I checked again and found that the Linker IPO setting had indeed been set to Yes. I guess I checked the IPO flag back when I wasn't as familiar with all the tabs and sub-tabs for the Project Properties.
Sorry to have wasted your time on that. I'm curious, though, were you able to recreate my 2-minute linking problem? If so, what was causing yours? I assume you had your IPO set to No.
Evan
In my project, IPO is not set. I can reproduce the long link time (on my system it's more like 30 seconds), but with the "fast" code the linking takes just a second or two.
Quoting - Steve Lionel (Intel)
In my project, IPO is not set. I can reproduce the long link time (on my system it's more like 30 seconds), but with the "fast" code the linking takes just a second or two.
I too would find this useful, particularly when performing a binary search for a bug in a single .o file among thousands. Don't try it on a laptop disk.
Quoting - Steve Lionel (Intel)
In my project, IPO is not set. I can reproduce the long link time (on my system it's more like 30 seconds), but with the "fast" code the linking takes just a second or two.
By "fast code" do you mean the code with the call to the DFTI routine commented out?
My original "slow" code that had the problem now takes just a second or two.
Evan
Slow with the call to DftiCreateDescriptor, Fast with the calls to the other five Dfti routines in its place. For reasons not yet clear to me, calling DftiCreateDescriptor pulls in a lot more stuff.
Tim is correct that xilink is always invoked, whether or not IPO is enabled. It is xilink that is taking so long; the MS linker is not as sluggish. Unfortunately, there's no option you can set (that I know of) to skip the use of xilink.
Hi Steve,
I'm having the same problem. I use Intel Visual Composer XE 2013. I tried the same trick that you mentioned, but it didn't work: I have to keep the original xilink.exe so that the linking works. Are there any solutions for this long link time (up to 4 minutes)?
If you can provide us with a test case we'll be glad to investigate.
