With IFORT 12.0.2, MKL 10.3.2, I started adapting one of the examples (../MKL/examples/solverf/source/ex_nlsqp_f.f), at first just adding a WRITE statement to monitor the first few variables. I changed the function subroutine as follows:
[fxfortran]
      SUBROUTINE EXTENDET_POWELL (M, N, X, F)
      IMPLICIT NONE
      INTEGER M, N
      DOUBLE PRECISION X (*), F (*)
      INTEGER I, ICNT
      DATA ICNT/0/
      DO I = 1, N/4
         F (4*I-3) = X(4*I - 3) + 10.D0 * X(4*I - 2)
         F (4*I-2) = 2.2360679774998D0*(X(4*I-1) - X(4*I))
         F (4*I-1) = (X(4*I-2) - 2.D0*X(4*I-1))**2
         F (4*I) = 3.1622776601684D0*(X(4*I-3) - X(4*I))**2
      END DO
      ICNT=ICNT+1
      write(*,10)ICNT,X(1),X(2),X(3),X(4)
   10 format(1x,i3,' x = ',4F10.5)
      END SUBROUTINE EXTENDET_POWELL
[/fxfortran]
Compiling the program and running it with either the 32-bit or 64-bit compiler on SUSE 11.3 as follows
[bash]
$ ifort -traceback -mkl ex_nlsqp_f.f
$ ./a.out
[/bash]
produced an unexpected abort after 42 calls to the subroutine, with the message
[bash]
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
Image PC Routine Line Source
a.out 000000000047865A Unknown Unknown Unknown
a.out 00000000004771D5 Unknown Unknown Unknown
a.out 0000000000443B86 Unknown Unknown Unknown
a.out 0000000000429A15 Unknown Unknown Unknown
a.out 000000000040A8D3 Unknown Unknown Unknown
a.out 0000000000404634 extendet_powell_ 236 ex_nlsqp_f.f
libmkl_intel_thre 00007F0EC8288423 Unknown Unknown Unknown
[/bash]
Note that line 236 is the WRITE statement. Changing the unit from '*' to a number such as 37 gives that unit number in the abort message.
The same problem occurs with the Windows versions, but the error message is worded slightly differently:
[bash]
forrtl: severe (152): unresolved contention for Intel Fortran RTL global resource
Image PC Routine Line Source
ex_nlsqp_f.exe 0044A71A Unknown Unknown Unknown
ex_nlsqp_f.exe 00410EBA Unknown Unknown Unknown
ex_nlsqp_f.exe 00407F2F Unknown Unknown Unknown
ex_nlsqp_f.exe 0040182B _EXTENDET_POWELL 232 ex_nlsqp_f.f
ex_nlsqp_f.exe 00450CE0 Unknown Unknown Unknown
libiomp5md.dll 100621F5 Unknown Unknown Unknown
libiomp5md.dll 10046BDA Unknown Unknown Unknown
libiomp5md.dll 100446C3 Unknown Unknown Unknown
libiomp5md.dll 100632C8 Unknown Unknown Unknown
kernel32.dll 77073677 Unknown Unknown Unknown
ntdll.dll 77EA9F02 Unknown Unknown Unknown
ntdll.dll 77EA9ED5 Unknown Unknown Unknown
[/bash]
Once again, the error is at the line with the WRITE statement.
7 Replies
If you comment out all the MKL function calls, do you get the same result?
"If you comment out all the MKL function calls, do you get the same result?"
Not at all, since this is an example program that does nothing much if the calls to MKL routines (and the invocations of functions DTR_NLSP_xxxx) are commented out. Since this particular solver routine is called with a Reverse Call Interface, commenting out the calls to MKL routines would also cause no calls to be made to the EXTENDET_POWELL routine, and the program would essentially do nothing.
I do not think that the WRITE statement in EXTENDET_POWELL is causing I/O problems of the type that one sees when a DLL written in a language other than Fortran calls a Fortran routine, when the runtimes of the two languages can interact in odd ways.
Thanks
ADDED 9.05 AM PDT:
The problem goes away if the environment variable OMP_NUM_THREADS is set to 1.
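The same restriction can probably be applied from inside the program. The following is a minimal sketch, not tested against the failing example: it uses the MKL service routines MKL_SET_NUM_THREADS and MKL_GET_MAX_THREADS, and the program name ONE_MKL_THREAD is hypothetical. It assumes the call is made before any solver routine is invoked and that the program is built with -mkl as above.

```fortran
      PROGRAM ONE_MKL_THREAD
      IMPLICIT NONE
C     MKL service routines; declared here because no module is used
      INTEGER MKL_GET_MAX_THREADS
      EXTERNAL MKL_SET_NUM_THREADS, MKL_GET_MAX_THREADS
C     Ask MKL to run its threaded regions on a single thread
      CALL MKL_SET_NUM_THREADS (1)
C     Report the thread count MKL will now use
      WRITE (*,*) 'MKL threads: ', MKL_GET_MAX_THREADS ()
      END PROGRAM ONE_MKL_THREAD
```

Like setting OMP_NUM_THREADS, this trades parallel speed for serial callbacks, so it is a workaround rather than a fix.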
We encounter a problem with the same diagnostics here.
The last post was more of a workaround than a solution to the observed behavior. Do you know whether a root cause has been found?
Thanks in advance for any comments
Dirk van Meeuwen
Another possible workaround, with less of a performance penalty, is to use the /Qopenmp (Windows) or -fopenmp (Linux/Mac) option.
Hi mecej4,
Just an idea: EXTENDET_POWELL is called from a parallel region when OpenMP is not disabled. So it seems that your global variable ICNT is modified and printed to the screen by different threads. I am not sure that this is the cause of the problem, since I have not reproduced the issue yet, but it could be.
With best regards,
Alexander Kalinkin
You are probably correct!
When I first ran into this problem, it never occurred to me that threading issues could cause problems.
Now I see that multiple threads could update the variable ICNT and produce unpredictable values of that variable. However, is it not a bug that, instead of incorrect values being printed, a run-time abort occurs?
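If concurrent callbacks are indeed the cause, one way to make the monitoring code safe is to serialize the counter update and the WRITE with an OpenMP critical section. The sketch below is an assumption, not a confirmed fix from this thread: the section name POWELL_LOG and the driver program are hypothetical, the driver merely imitates concurrent calls without MKL, and the directives only take effect when the file is compiled with OpenMP enabled (otherwise they are plain comments in fixed-form source).

```fortran
      SUBROUTINE EXTENDET_POWELL (M, N, X, F)
      IMPLICIT NONE
      INTEGER M, N
      DOUBLE PRECISION X (*), F (*)
      INTEGER I, ICNT
C     ICNT is shared by all threads (DATA implies SAVE)
      SAVE ICNT
      DATA ICNT/0/
      DO I = 1, N/4
         F (4*I-3) = X(4*I - 3) + 10.D0 * X(4*I - 2)
         F (4*I-2) = 2.2360679774998D0*(X(4*I-1) - X(4*I))
         F (4*I-1) = (X(4*I-2) - 2.D0*X(4*I-1))**2
         F (4*I) = 3.1622776601684D0*(X(4*I-3) - X(4*I))**2
      END DO
C     Only one thread at a time may update ICNT and do the I/O
!$OMP CRITICAL (POWELL_LOG)
      ICNT = ICNT + 1
      WRITE (*,10) ICNT, X(1), X(2), X(3), X(4)
!$OMP END CRITICAL (POWELL_LOG)
   10 FORMAT (1X, I3, ' x = ', 4F10.5)
      END SUBROUTINE EXTENDET_POWELL

C     Hypothetical driver: calls the routine from several threads
      PROGRAM DEMO
      IMPLICIT NONE
      INTEGER K
      DOUBLE PRECISION X(4), F(4)
!$OMP PARALLEL DO PRIVATE(X, F)
      DO K = 1, 8
         X(1) = 3.D0
         X(2) = -1.D0
         X(3) = 0.D0
         X(4) = 1.D0
         CALL EXTENDET_POWELL (4, 4, X, F)
      END DO
!$OMP END PARALLEL DO
      END PROGRAM DEMO
```

With the critical section in place, each call should print a distinct ICNT value, although the order in which threads reach the WRITE remains unspecified.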
Hi,
As I wrote, I have not reproduced your issue, so I cannot say anything specific about it. But when you removed the variable ICNT, did the abort in the RCI solver disappear or not? If yes, then the problem was the incorrect use of a global variable; if not, we will try to find it somewhere else :)
With best regards,
Alexander Kalinkin