I have a DLL which appears to work well when compiled as a debug version.
When I switch to a release version the DLL fails immediately with an access violation. I don't know where, as the debugger breaks out to assembler. If I tick the option to include full debug information, the DLL runs (with line numbers only, I still get the access violation). If I then switch from 'Maximize speed' to 'Maximize speed plus higher optimisations' I get the access violation again. If I select 'Use Intel Processor extensions' the compiler aborts; I don't know if this is triggered by the development machine using an AMD Sempron.
When I timed a debug DLL against a release DLL (with full debug information, but also set to 'Maximize speed'), the debug DLL varied between running as fast as the release version and running faster. Timing might have been affected by other Windows processes, but the debug version was never slower than the release.
Any suggestions as to what might be possible causes? I can continue to provide the DLL compiled using the debug mode, when all seems to be fine. The DLL integrates functions in user-supplied DLLs, and the results have been checked against Matlab equivalents for three user DLLs so far. The user interface is written in Visual Basic 6, and array out of bound accesses within the DLL result in the VB code reporting either an overflow or divide by zero upon returning from the DLL - a problem earlier - which is not currently happening. I do use allocatable arrays, which I deallocate on first use, but with the STAT keyword to provide a 'safe' error return, and the SAVE attribute to ensure persistence between DLL calls.
The only nonstandard thing of which I am aware is the use of Cray pointers to access subroutines written in user-provided DLLs. I'm using IVF 9.1, if that matters.
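For readers following along, the allocatable-array pattern described above might look roughly like the sketch below. It is only an illustration under assumptions: the module name `work_storage`, the routine `ensure_work` and the array `work` are hypothetical and do not come from the DLL discussed here.

```fortran
module work_storage
  implicit none
  private
  public :: ensure_work, work

  ! SAVE keeps the array allocated between successive calls into the DLL.
  double precision, allocatable, save :: work(:)

contains

  ! Allocate (or resize) the persistent array; ierr is nonzero on failure.
  subroutine ensure_work(n, ierr)
    integer, intent(in)  :: n
    integer, intent(out) :: ierr

    ierr = 0
    if (allocated(work)) then
      if (size(work) == n) return        ! already the right size: reuse it
      deallocate(work, stat=ierr)
      if (ierr /= 0) return
    end if
    allocate(work(n), stat=ierr)         ! STAT= returns a code instead of aborting
    if (ierr == 0) work = 0d0
  end subroutine ensure_work

end module work_storage

program demo_work
  use work_storage
  implicit none
  integer :: ierr

  call ensure_work(10, ierr)
  if (ierr /= 0) stop 1
  work(1) = 42d0
  call ensure_work(10, ierr)             ! second "call": storage and contents persist
  if (ierr /= 0 .or. work(1) /= 42d0) stop 2
  print *, 'size =', size(work)
end program demo_work
```

Checking `allocated()` before every `allocate` or `deallocate` keeps the STAT value meaningful and avoids touching an array whose allocation status is not what the code expects.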
6 Replies
We've had to drop the /Qsafe-cray-ptr option, in case you used that. It makes unnecessarily aggressive assumptions. It doesn't mean "accept Cray pointers." In fact, if your debug build gives good performance, you may never want to go to more aggressive options than /O1 /fp:source. If you are using /Qsave, instead of correcting source code, it may be dangerous to optimize. The 9.1 compiler would not use SSE code for AMD, unless you set /QxW to make a single code path. Multiple code paths may not be good in your case. Beyond this, there are too many possibilities to speculate on. You may want trial runs with /check, in case the compiler can diagnose any problems.
Thank you.
I did use /Qsave, but not /Qsafe-cray-ptr (although, in practice, the Cray pointers are safe). I removed /Qsave, and put the SAVE attribute on the additional variables that expect persistence in the Fortran 77-era integrators.
It didn't change the need to include full debugging information, and restrict to 'Maximize speed'.
I'll live with using a debug build, then.
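As an illustration of replacing /Qsave with explicit SAVE attributes, here is a minimal sketch. The routine `next_step` and its variables are hypothetical stand-ins for the kind of state an older integrator keeps between calls; they are not taken from the actual integrators.

```fortran
! Hypothetical F77-style helper that must remember state between calls.
subroutine next_step(h, ncalls_out, hprev_out)
  implicit none
  double precision, intent(in)  :: h
  integer,          intent(out) :: ncalls_out
  double precision, intent(out) :: hprev_out
  ! Explicit SAVE: persistence no longer depends on the /Qsave switch.
  integer,          save :: ncalls = 0
  double precision, save :: hprev  = 0d0

  hprev_out  = hprev        ! step size remembered from the previous call
  ncalls     = ncalls + 1
  ncalls_out = ncalls
  hprev      = h            ! remember this step size for the next call
end subroutine next_step

program demo_save
  implicit none
  integer :: n
  double precision :: hp

  call next_step(0.1d0, n, hp)
  call next_step(0.2d0, n, hp)
  ! The second call sees the state left behind by the first one.
  if (n /= 2 .or. hp /= 0.1d0) stop 1
  print *, 'calls =', n, ' previous h =', hp
end program demo_save
```

Marking only the variables that need to persist leaves the optimiser free to keep everything else in registers or on the stack.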
Compiling with /O1 the following code results in the compiler just hanging. I can't see what is here that would cause such a problem. (Insert code in this forum editor doesn't support Fortran!)
[plain]subroutine Equations(t, ny, y, dy, tStart, tFinish, &
    NumberOfModels, NumberOfProcesses, ModelList, &
    ProcessList, NumberofStreams, Streams, &
    nx, x, nResults, Results, nOps, Operation, &
    nModelData, ModelData, ProcessOrder)
  USE CoreTypes
  USE ProcessID
  implicit none
  double precision :: tStart, tFinish
  integer :: NumberOfModels
  integer :: NumberofProcesses
  integer :: nx, nResults, nOps, nModelData
  integer :: NumberofStreams
  integer :: ProcessOrder(NumberofProcesses)
  double precision :: x(nx), Results(nResults)
  double precision :: Operation(nOps)
  double precision :: ModelData(nModelData)
  type (ProcessModel) :: ModelList(NumberOfModels)
  type (Process) :: ProcessList(NumberOfProcesses)
  type (Stream) :: Streams(NumberofStreams)
  integer:: ny
  double precision:: t, y(ny), dy(ny)
  dy = 0d0
  call Loop(t, y, ny)
  call Eqn(t, y, dy, ny)
  return
  entry LoopsOnly(t, ny, y, tStart, tFinish, &
      NumberOfModels, NumberOfProcesses, ModelList, &
      ProcessList, NumberofStreams, Streams, &
      nx, x, nResults, Results, nOps, Operation, &
      nModelData, ModelData, ProcessOrder)
  call Loop(t, y, ny)
  return
CONTAINS
subroutine Eqn(t, y, dy, ny)
  implicit none
  include 'interface.f90'
  ! -- Non-standard CRAY pointer
  pointer (pe, locDLLSubDY)
  integer:: ny
  double precision:: t, y(ny), dy(ny)
  integer:: i, j, k, L
  integer:: ns, iy, iys, ix, ixs
  integer:: ir, irs, io, ios, im
  do i = 1, NumberOfProcesses
    k = ProcessOrder(i)
    j = ProcessList(k)%ModelIndex
    if (j .gt. 0) then
      do L = 1, NumberOfModels
        if (ModelList(L)%ModelID .eq. j) then
          j = L
          exit
        end if
      end do
    end if
    NS = ProcessList(k)%Stages
    iy = ProcessList(k)%y
    iys = ProcessList(k)%yStage
    ix = ProcessList(k)%x
    ixs = ProcessList(k)%xStage
    ir = ProcessList(k)%Results
    irs = ProcessList(k)%StageResults
    io = ProcessList(k)%Operation
    ios = ProcessList(k)%StageOperation
    im = ProcessList(k)%ModelData
    select case (j)
    case (INFLUENT_ID)
    case (CV_ID)
    case (MIX2_ID, MIX3_ID)
    case (SPLIT2_ID, SPLIT3_ID)
    case default
      pe = ModelList(j)%ModelPointerDiff
      call locDLLSubDY(t, NS, ProcessList(k), Streams, &
          y(iy), y(iys), x(ix), x(ixs), &
          dy(iy), dy(iys), Results(ir), Results(irs), &
          Operation(io), Operation(ios), ModelData(im))
    end select
  end do
end subroutine Eqn
subroutine Loop(t, y, ny)
  use LoopData
  include 'interface.f90'
  ! -- Non-standard CRAY pointer
  pointer (pl, locDLLSubAssign)
  integer:: ny
  double precision:: t, y(ny)
  logical :: Converged
  double precision:: OldFlow(NumberOfStreams)
  integer :: count
  integer:: i, j, k, L, iOut, out, in
  integer:: ns, iy, iys, ix, ixs
  integer:: ir, irs, io, ios, im
  !DEC$ ATTRIBUTES ALIAS: '_NumberOfDeterminands':: NumberOfDeterminands
  integer, external:: NumberOfDeterminands
  integer :: nDet
  nDet = NumberOfDeterminands()
  count = 0
  converged = .false.
  do
    OldFlow = Streams(:)%Flow
    do i = 1, NumberOfProcesses
      k = ProcessOrder(i)
      j = ProcessList(k)%ModelIndex
      if (j .gt. 0) then
        do L = 1, NumberOfModels
          if (ModelList(L)%ModelID .eq. j) then
            j = L
            exit
          end if
        end do
      end if
      NS = ProcessList(k)%Stages
      iy = ProcessList(k)%y
      iys = ProcessList(k)%yStage
      ix = ProcessList(k)%x
      ixs = ProcessList(k)%xStage
      ir = ProcessList(k)%Results
      irs = ProcessList(k)%StageResults
      io = ProcessList(k)%Operation
      ios = ProcessList(k)%StageOperation
      im = ProcessList(k)%ModelData
      !
      ! Set all outlet COMPOSITIONS to INLET values ...
      !
      if (ProcessList(k)%InStream(1) .gt. 0) then
        do iOut = 1, MAX_OUTLETS
          out = ProcessList(k)%OutStream(iOut)
          if (out .gt. 0) then
            in = ProcessList(k)%InStream(1)
            Streams(out)%t = Streams(in)%t
            Streams(out)%pH = Streams(in)%pH
            Streams(out)%Value(1:nDet) = Streams(in)%Value(1:nDet)
          end if
        end do
      end if
      select case (j)
      case (INFLUENT_ID)
        call Influent(t, ProcessList(k), Streams, ModelData(im), Operation(io))
      case (CV_ID)
        call CV(t, ProcessList(k), Streams, ModelData(im), Operation(io))
      case (MIX2_ID)
        call Mix(ProcessList(k), Streams, 2)
      case (MIX3_ID)
        call Mix(ProcessList(k), Streams, 3)
      case (SPLIT2_ID)
        call Split(ProcessList(k), Streams, Operation(io), 2)
      case (SPLIT3_ID)
        call Split(ProcessList(k), Streams, Operation(io), 3)
      case default
        pl = ModelList(j)%ModelPointerAlloc
        call locDLLSubAssign(t, NS, ProcessList(k), Streams, &
            y(iy), y(iys), x(ix), x(ixs), &
            Results(ir), Results(irs), &
            Operation(io), Operation(ios), ModelData(im))
      end select
      do iOut = 1, MAX_OUTLETS
        out = ProcessList(k)%OutStream(iOut)
        if (out .gt. 0 .and. j .ne. INFLUENT_ID) then
          if (Streams(out)%Flow .le. 0d0) then
            Streams(out)%t = 0d0
            Streams(out)%pH = 0d0
            Streams(out)%Value(1:nDet) = 0d0
          end if
        end if
      end do
    end do
    converged = all(abs(OldFlow - Streams(:)%Flow) .le. fLoopTol * abs(Streams(:)%Flow))
    count = count + 1
    if (converged) exit
    if (.not. bLoop) exit
    if (count .gt. iMaxLoop) exit
  end do
end subroutine Loop
end subroutine Equations
[/plain]
I can see how this might be a compiler buster. The idea of requiring SAVE in combination with ENTRY has always been doubtful, and now you combine it with CONTAINS in a way which is likely not to have been tested during compiler QA. As you didn't supply your include file, we can't try it from your post. You could submit a problem report on premier.intel.com.
Thanks.
Making the CONTAINed routines separate (allowing me to get rid of the ENTRY) results in that routine compiling with optimisations and NOT hanging.
The DLL no longer crashes with access violation if I use the /O1 setting, which is a big improvement!
And I get a 33 - 50% improvement in the runtime for my test case. (33% using a Runge-Kutta integrator; 50% using an implicit Runge-Kutta integrator.) Still falls over with /O2, despite getting rid of /Qsave - but this has been a huge improvement.
I'm curious as to what properties of a code may typically prevent /O2 working.
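For anyone hitting the same hang, the restructuring described above can be sketched as follows. This is a heavily simplified illustration, not the real code: the argument lists are cut down, the bodies are placeholders, and the module name `equations_split` is hypothetical. In the real routines, everything the CONTAINed procedures picked up from the host (ProcessList, Streams and so on) would have to be passed explicitly.

```fortran
module equations_split
  implicit none
contains

  ! Formerly a CONTAINed routine; host data would now be passed as arguments.
  subroutine Loop(t, y, ny)
    integer,          intent(in)    :: ny
    double precision, intent(in)    :: t
    double precision, intent(inout) :: y(ny)
    y = y + t                      ! placeholder for the stream-convergence loop
  end subroutine Loop

  subroutine Eqn(t, y, dy, ny)
    integer,          intent(in)  :: ny
    double precision, intent(in)  :: t, y(ny)
    double precision, intent(out) :: dy(ny)
    dy = -y                        ! placeholder for the derivative evaluation
  end subroutine Eqn

  ! Full evaluation: the job of the original SUBROUTINE Equations.
  subroutine Equations(t, ny, y, dy)
    integer,          intent(in)    :: ny
    double precision, intent(in)    :: t
    double precision, intent(inout) :: y(ny)
    double precision, intent(out)   :: dy(ny)
    dy = 0d0
    call Loop(t, y, ny)
    call Eqn(t, y, dy, ny)
  end subroutine Equations

  ! The job of the former ENTRY LoopsOnly, now an ordinary subroutine.
  subroutine LoopsOnly(t, ny, y)
    integer,          intent(in)    :: ny
    double precision, intent(in)    :: t
    double precision, intent(inout) :: y(ny)
    call Loop(t, y, ny)
  end subroutine LoopsOnly

end module equations_split

program demo_split
  use equations_split
  implicit none
  double precision :: y(2), dy(2)

  y = (/ 1d0, 2d0 /)
  call LoopsOnly(1d0, 2, y)        ! y becomes (2, 3)
  call Equations(1d0, 2, y, dy)    ! y becomes (3, 4), dy becomes (-3, -4)
  if (any(y /= (/ 3d0, 4d0 /)) .or. any(dy /= (/ -3d0, -4d0 /))) stop 1
  print *, dy
end program demo_split
```

With two ordinary subroutines sharing the helpers, each entry point has its own well-defined argument list, so the compiler never has to reason about which dummy arguments are valid for which ENTRY.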
Quoting - dudley@wrcplc.co.uk
I'm curious as to what properties of a code may typically prevent /O2 working.
It could also be a compiler bug.