I have a DLL which appears to work well when compiled as a debug version.
When I switch to a release version, the DLL fails immediately with an access violation. Where, I don't know, as the debugger breaks out to assembler. If I tick the option to include full debug information, the DLL runs (with line numbers only, I still get the access violation). If I then switch from 'Maximize speed' to 'Maximize speed plus higher optimisations', I get the access violation again. If I select 'Use Intel Processor extensions', the compiler aborts. I don't know if this is triggered by the development machine using an AMD Sempron.
When I timed a debug DLL against a release DLL (with full debug information, and set to 'Maximize speed'), the debug DLL varied between running as fast as the release version and running faster. Timing might have been affected by other Windows processes, but the debug version was never slower than the release.
Any suggestions as to what might be possible causes? I can continue to provide the DLL compiled using the debug mode, when all seems to be fine. The DLL integrates functions in user-supplied DLLs, and the results have been checked against Matlab equivalents for three user DLLs so far. The user interface is written in Visual Basic 6, and out-of-bounds array accesses within the DLL cause the VB code to report either an overflow or a divide by zero upon returning from the DLL. That was a problem earlier, but it is not currently happening. I do use allocatable arrays, which I deallocate on first use, but with the STAT keyword to provide a 'safe' error return, and with the SAVE attribute to ensure persistence between DLL calls.
The only nonstandard thing of which I am aware is the use of Cray pointers to access subroutines written in user-provided DLLs. I'm using IVF 9.1, if that matters.
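For readers following along, the allocatable-array pattern mentioned in this post (SAVE for persistence between calls, STAT for a 'safe' error return) might look roughly like the sketch below. It is not the actual DLL code: the routine name `EnsureWorkspace`, the array `work`, and the error-code convention are all hypothetical.

```fortran
! Hypothetical sketch: a saved allocatable work array that is resized on demand.
subroutine EnsureWorkspace(n, ierr)
  implicit none
  integer, intent(in)  :: n      ! required size of the work array
  integer, intent(out) :: ierr   ! 0 on success, compiler-defined code otherwise
  ! SAVE keeps the array (and its allocation status) alive between calls.
  double precision, allocatable, save :: work(:)

  ierr = 0
  if (allocated(work)) then
    if (size(work) == n) return      ! already the right size, nothing to do
    deallocate(work, stat=ierr)      ! STAT= returns an error code instead of aborting
    if (ierr /= 0) return
  end if
  allocate(work(n), stat=ierr)
  if (ierr == 0) work = 0d0
end subroutine EnsureWorkspace

program DemoWorkspace
  implicit none
  integer :: ierr
  call EnsureWorkspace(10, ierr)
  print *, 'first call, stat =', ierr
  call EnsureWorkspace(20, ierr)
  print *, 'resize call, stat =', ierr
end program DemoWorkspace
```

Checking `allocated` before every `allocate` or `deallocate` matters here because the array's state carries over from one DLL call to the next.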
6 Replies
We've had to drop the /Qsafe-cray-ptr option, in case you used that. It makes unnecessarily aggressive assumptions. It doesn't mean "accept Cray pointers." In fact, if your debug build gives good performance, you may never want to go to more aggressive options than /O1 /fp:source. If you are using /Qsave, instead of correcting source code, it may be dangerous to optimize. The 9.1 compiler would not use SSE code for AMD, unless you set /QxW to make a single code path. Multiple code paths may not be good in your case. Beyond this, there are too many possibilities to speculate on. You may want trial runs with /check, in case the compiler can diagnose any problems.
Thank you.
I did use /Qsave, but not /Qsafe-cray-ptr (although, in practice, the Cray pointers are safe). I removed /Qsave, and put the SAVE attribute on the additional variables that expect persistence in the Fortran 77-era integrators.
It didn't change the need to include full debugging information, and restrict to 'Maximize speed'.
I'll live with using a debug build, then.
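As an illustration of swapping /Qsave for explicit SAVE attributes, here is a minimal sketch. The routine `StepCounter` and its variables are made up for the example and are not taken from the integrators discussed here.

```fortran
! Hypothetical Fortran 77-style helper that relies on values persisting between calls.
subroutine StepCounter(h, hLast, nCalls)
  implicit none
  double precision, intent(in)  :: h       ! step size for this call
  double precision, intent(out) :: hLast   ! step size seen on the previous call
  integer, intent(out)          :: nCalls  ! number of calls so far
  ! Explicit SAVE: these keep their values between calls without /Qsave.
  double precision, save :: hPrevious = 0d0
  integer, save          :: callCount = 0

  hLast = hPrevious
  callCount = callCount + 1
  nCalls = callCount
  hPrevious = h
end subroutine StepCounter

program DemoSave
  implicit none
  double precision :: hLast
  integer :: n
  call StepCounter(0.1d0, hLast, n)
  call StepCounter(0.2d0, hLast, n)
  ! On the second call, n counts both calls and hLast holds the first step size.
  print *, n, hLast
end program DemoSave
```

Marking only the variables that need persistence leaves the compiler free to keep every other local on the stack or in registers.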
Compiling with /O1 the following code results in the compiler just hanging. I can't see what is here that would cause such a problem. (Insert code in this forum editor doesn't support Fortran!)
[plain]
subroutine Equations(t, ny, y, dy, tStart, tFinish, &
                     NumberOfModels, NumberOfProcesses, ModelList, &
                     ProcessList, NumberofStreams, Streams, &
                     nx, x, nResults, Results, nOps, Operation, &
                     nModelData, ModelData, ProcessOrder)
  USE CoreTypes
  USE ProcessID
  implicit none
  double precision :: tStart, tFinish
  integer :: NumberOfModels
  integer :: NumberofProcesses
  integer :: nx, nResults, nOps, nModelData
  integer :: NumberofStreams
  integer :: ProcessOrder(NumberofProcesses)
  double precision :: x(nx), Results(nResults)
  double precision :: Operation(nOps)
  double precision :: ModelData(nModelData)
  type (ProcessModel) :: ModelList(NumberOfModels)
  type (Process) :: ProcessList(NumberOfProcesses)
  type (Stream) :: Streams(NumberofStreams)
  integer:: ny
  double precision:: t, y(ny), dy(ny)

  dy = 0d0
  call Loop(t, y, ny)
  call Eqn(t, y, dy, ny)
  return

  entry LoopsOnly(t, ny, y, tStart, tFinish, &
                  NumberOfModels, NumberOfProcesses, ModelList, &
                  ProcessList, NumberofStreams, Streams, &
                  nx, x, nResults, Results, nOps, Operation, &
                  nModelData, ModelData, ProcessOrder)
  call Loop(t, y, ny)
  return

CONTAINS

  subroutine Eqn(t, y, dy, ny)
    implicit none
    include 'interface.f90'
    ! -- Non-standard CRAY pointer
    pointer (pe, locDLLSubDY)
    integer:: ny
    double precision:: t, y(ny), dy(ny)
    integer:: i, j, k, L
    integer:: ns, iy, iys, ix, ixs
    integer:: ir, irs, io, ios, im

    do i = 1, NumberOfProcesses
      k = ProcessOrder(i)
      j = ProcessList(k)%ModelIndex
      if (j .gt. 0) then
        do L = 1, NumberOfModels
          if (ModelList(L)%ModelID .eq. j) then
            j = L
            exit
          end if
        end do
      end if
      NS = ProcessList(k)%Stages
      iy = ProcessList(k)%y
      iys = ProcessList(k)%yStage
      ix = ProcessList(k)%x
      ixs = ProcessList(k)%xStage
      ir = ProcessList(k)%Results
      irs = ProcessList(k)%StageResults
      io = ProcessList(k)%Operation
      ios = ProcessList(k)%StageOperation
      im = ProcessList(k)%ModelData
      select case (j)
      case (INFLUENT_ID)
      case (CV_ID)
      case (MIX2_ID, MIX3_ID)
      case (SPLIT2_ID, SPLIT3_ID)
      case default
        pe = ModelList(j)%ModelPointerDiff
        call locDLLSubDY(t, NS, ProcessList(k), Streams, &
                         y(iy), y(iys), x(ix), x(ixs), &
                         dy(iy), dy(iys), Results(ir), Results(irs), &
                         Operation(io), Operation(ios), ModelData(im))
      end select
    end do
  end subroutine Eqn

  subroutine Loop(t, y, ny)
    use LoopData
    include 'interface.f90'
    ! -- Non-standard CRAY pointer
    pointer (pl, locDLLSubAssign)
    integer:: ny
    double precision:: t, y(ny)
    logical :: Converged
    double precision:: OldFlow(NumberOfStreams)
    integer :: count
    integer:: i, j, k, L, iOut, out, in
    integer:: ns, iy, iys, ix, ixs
    integer:: ir, irs, io, ios, im
    !DEC$ ATTRIBUTES ALIAS: '_NumberOfDeterminands':: NumberOfDeterminands
    integer, external:: NumberOfDeterminands
    integer :: nDet

    nDet = NumberOfDeterminands()
    count = 0
    converged = .false.
    do
      OldFlow = Streams(:)%Flow
      do i = 1, NumberOfProcesses
        k = ProcessOrder(i)
        j = ProcessList(k)%ModelIndex
        if (j .gt. 0) then
          do L = 1, NumberOfModels
            if (ModelList(L)%ModelID .eq. j) then
              j = L
              exit
            end if
          end do
        end if
        NS = ProcessList(k)%Stages
        iy = ProcessList(k)%y
        iys = ProcessList(k)%yStage
        ix = ProcessList(k)%x
        ixs = ProcessList(k)%xStage
        ir = ProcessList(k)%Results
        irs = ProcessList(k)%StageResults
        io = ProcessList(k)%Operation
        ios = ProcessList(k)%StageOperation
        im = ProcessList(k)%ModelData
        !
        ! Set all outlet COMPOSITIONS to INLET values ...
        !
        if (ProcessList(k)%InStream(1) .gt. 0) then
          do iOut = 1, MAX_OUTLETS
            out = ProcessList(k)%OutStream(iOut)
            if (out .gt. 0) then
              in = ProcessList(k)%InStream(1)
              Streams(out)%t = Streams(in)%t
              Streams(out)%pH = Streams(in)%pH
              Streams(out)%Value(1:nDet) = Streams(in)%Value(1:nDet)
            end if
          end do
        end if
        select case (j)
        case (INFLUENT_ID)
          call Influent(t, ProcessList(k), Streams, ModelData(im), Operation(io))
        case (CV_ID)
          call CV(t, ProcessList(k), Streams, ModelData(im), Operation(io))
        case (MIX2_ID)
          call Mix(ProcessList(k), Streams, 2)
        case (MIX3_ID)
          call Mix(ProcessList(k), Streams, 3)
        case (SPLIT2_ID)
          call Split(ProcessList(k), Streams, Operation(io), 2)
        case (SPLIT3_ID)
          call Split(ProcessList(k), Streams, Operation(io), 3)
        case default
          pl = ModelList(j)%ModelPointerAlloc
          call locDLLSubAssign(t, NS, ProcessList(k), Streams, &
                               y(iy), y(iys), x(ix), x(ixs), &
                               Results(ir), Results(irs), &
                               Operation(io), Operation(ios), ModelData(im))
        end select
        do iOut = 1, MAX_OUTLETS
          out = ProcessList(k)%OutStream(iOut)
          if (out .gt. 0 .and. j .ne. INFLUENT_ID) then
            if (Streams(out)%Flow .le. 0d0) then
              Streams(out)%t = 0d0
              Streams(out)%pH = 0d0
              Streams(out)%Value(1:nDet) = 0d0
            end if
          end if
        end do
      end do
      converged = all(abs(OldFlow - Streams(:)%Flow) .le. fLoopTol * abs(Streams(:)%Flow))
      count = count + 1
      if (converged) exit
      if (.not. bLoop) exit
      if (count .gt. iMaxLoop) exit
    end do
  end subroutine Loop

end subroutine Equations
[/plain]
I can see how this might be a compiler buster. The idea of requiring SAVE in combination with ENTRY has always been doubtful, and now you combine it with CONTAINS in a way which is likely not to have been tested during compiler QA. As you didn't supply your include file, we can't try it from your post. You could submit a problem report on premier.intel.com.
Thanks.
Making the CONTAINed routines separate (allowing me to get rid of the ENTRY) results in that routine compiling with optimisations and NOT hanging.
The DLL no longer crashes with access violation if I use the /O1 setting, which is a big improvement!
And I get a 33 - 50% improvement in the runtime for my test case. (33% using a Runge-Kutta integrator; 50% using an implicit Runge-Kutta integrator.) Still falls over with /O2, despite getting rid of /Qsave - but this has been a huge improvement.
I'm curious as to what properties of a code may typically prevent /O2 working.
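The restructuring described in this post, replacing one subroutine that has an ENTRY and CONTAINed helpers with separate routines, can be sketched as below. This is only an outline of the shape of the change: the argument lists are cut down to `t`, `ny`, `y` and `dy`, and the bodies of `Loop` and `Eqn` are placeholders, not the code from the thread.

```fortran
! Sketch only: trimmed argument lists and placeholder bodies.
subroutine Loop(t, y, ny)
  implicit none
  integer, intent(in)             :: ny
  double precision, intent(in)    :: t
  double precision, intent(inout) :: y(ny)
  ! Placeholder standing in for the flow-convergence loop.
  y = max(y, 0d0)
end subroutine Loop

subroutine Eqn(t, y, dy, ny)
  implicit none
  integer, intent(in)           :: ny
  double precision, intent(in)  :: t, y(ny)
  double precision, intent(out) :: dy(ny)
  ! Placeholder standing in for the calls into the user DLLs.
  dy = -y
end subroutine Eqn

! Formerly the main entry point of the combined routine.
subroutine Equations(t, ny, y, dy)
  implicit none
  integer, intent(in)             :: ny
  double precision, intent(in)    :: t
  double precision, intent(inout) :: y(ny)
  double precision, intent(out)   :: dy(ny)
  dy = 0d0
  call Loop(t, y, ny)
  call Eqn(t, y, dy, ny)
end subroutine Equations

! Formerly "entry LoopsOnly(...)"; now an ordinary subroutine.
subroutine LoopsOnly(t, ny, y)
  implicit none
  integer, intent(in)             :: ny
  double precision, intent(in)    :: t
  double precision, intent(inout) :: y(ny)
  call Loop(t, y, ny)
end subroutine LoopsOnly

program DemoSplit
  implicit none
  double precision :: y(3), dy(3)
  y = (/ 1d0, -2d0, 3d0 /)
  call Equations(0d0, 3, y, dy)
  print *, dy
  call LoopsOnly(0d0, 3, y)
  print *, y
end program DemoSplit
```

The cost of this layout is that the helpers no longer see the host's variables by host association, so everything they need has to be passed explicitly or placed in a module.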
Quoting - dudley@wrcplc.co.uk
I'm curious as to what properties of a code may typically prevent /O2 working.
It could also be a compiler bug.
