Intel® Fortran Compiler

Access violation in multithreaded program

onkelhotte
New Contributor II
3,809 Views

Hello there,

I have a problem with a multithreaded program. In that program, I use some subroutines and functions all over the place (like converting a string to an integer, displaying a real number in an edit box...)

Sometimes I get access violations, I think because the function or subroutine foo is being called by two different threads at the same time.

Has anybody had the same problem, or does anyone have a hint about what I could do? Could it be that, because the argument is passed by reference, a second call of foo alters the reference used by the first call of foo?

Thanks in advance,
Markus

0 Kudos
13 Replies
TimP
Honored Contributor III
A likely cause of such problems is failure to declare a procedure RECURSIVE. If that declaration is not used, make certain that /Qauto, or an option that implies it, is set. You might also check whether SAVE is used in an incompatible way. If variables are undefined, /Qauto might expose a failure even with a single thread. Another possibility is buggy use of ENTRY.
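One easily overlooked form of SAVE is worth illustrating: in standard Fortran, a local variable that is initialized in its declaration acquires the SAVE attribute implicitly, even if the keyword never appears in the source. The sketch below is only an illustration of that rule; the program and the helper `next_id` are hypothetical names, not code from this thread.

```fortran
program implicit_save_demo
  implicit none
  integer :: first, second

  first  = next_id()
  second = next_id()

  ! The second call sees the value left behind by the first call.
  print *, first, second

contains

  ! Hypothetical helper. The initializer "= 0" gives counter the SAVE
  ! attribute, so there is a single copy shared by every call, and
  ! therefore by every thread that calls this function.
  integer function next_id()
    integer :: counter = 0
    counter = counter + 1
    next_id = counter
  end function next_id

end program implicit_save_demo
```

The two values printed are 1 and 2. In a multithreaded program, any such variable is shared state and needs the same protection as a global.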
jimdempseyatthecove
Honored Contributor III
Markus,

>>Sometimes I get access violations, I think because that the function or subroutine foo is being called by two different threads at the same time.

This situation could cause data corruption but not an access violation. Access violations occur when a program references (virtual) memory that is not mapped to the application. Most often the reference is that of an uninitialized pointer/reference, or an attempted reuse of a pointer/reference that has gone out of scope .AND. the memory location holding the pointer/reference was subsequently modified to contain what looks like a pointer/reference to an address that is not mapped.

Is your FORTRAN program calling C/C++ code where the argument(s) is (are) assumed to be NULL terminated? If so, and if the FORTRAN string does not contain a NULL, then nasty things will happen.

In the access violation report you usually see the location of the instruction causing the fault and the location of the data it was attempting to access. You may also get an opportunity to Debug. When a fault occurs, write down the two locations, then try to Debug. If you enter the debugger but receive "No source available..." then look at the call stack. Hopefully you can find the nearest level with source; set focus to that level (double click on the line in the call stack). Then try to look at the statement(s) at/preceding the return address of the call. Something may show up funny.

If nothing is obvious, then add some assert statements for the arguments (e.g. test to assure LOC(arg) is reasonable, and that NULL terminated args are in fact NULL terminated).

Jim Dempsey
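The NULL-termination check suggested above can be sketched in a few lines of standard Fortran. This is a minimal illustration only; `has_null`, `raw`, and `c_ready` are hypothetical names introduced for the example.

```fortran
program null_terminated_demo
  implicit none
  character(len=16) :: raw
  character(len=:), allocatable :: c_ready

  raw = 'hello'                     ! blank padded, contains no NUL
  c_ready = trim(raw) // char(0)    ! explicit terminator for a C callee

  ! Assert-style check before handing a string to C/C++ code.
  print *, has_null(raw), has_null(c_ready)

contains

  ! Hypothetical helper: true when the string holds a NUL character.
  logical function has_null(s)
    character(len=*), intent(in) :: s
    has_null = index(s, char(0)) > 0
  end function has_null

end program null_terminated_demo
```

The first check reports false and the second reports true, so a call site can refuse to pass `raw` to a routine that expects a terminated string.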
onkelhotte
New Contributor II
Thanks for the suggestions.

There are no RECURSIVE functions or subroutines, no SAVE and no ENTRY. It is not a mixed-language project and I didn't use /Qauto; the project setting was the default for local storage.

I just had an access violation and I made a screenshot of the debugger. The write statement is being called very often and nothing happens, but sometimes it creates an access violation. So I think something else in my project goes wrong and it shows in that line.

I also noticed that this happens when I let my project run in the background and begin to do something else (like posting here).

There are a few lines where an access violation happens, and it is always a read or write statement.

Markus

Edit: Now I had this error message:

First-chance exception at 0x006d23e8 in BCT Monitoring Tool.exe: 0xC0000005: Access violation reading location 0xfeeefefa.
HEAP[BCT Monitoring Tool.exe]: HEAP: Free Heap block 70f56d8 modified at 70f5e38 after it was freed
Windows has triggered a breakpoint in BCT Monitoring Tool.exe.
This may be due to a corruption of the heap, which indicates a bug in BCT Monitoring Tool.exe or any of the DLLs it has loaded.
This may also be due to the user pressing F12 while BCT Monitoring Tool.exe has focus.
The output window may have more diagnostic information.
TimP
Honored Contributor III
A write uses a single buffer for each open file, so if multiple threads write to the same file, there is a race condition. In fact, if one thread fills the buffer and it begins to flush to disk, very bad things can happen if another thread uses the buffer at the same time. So, if you are using a low-level threading method, you would put the writes in a critical region. Auto-parallelization or OpenMP might accomplish this automatically.
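To illustrate the critical-region idea, here is a minimal sketch that uses OpenMP threads, because they can be shown in a few self-contained lines. It is an assumption-laden stand-in: the program in this thread creates its threads with the Win32 API, where the equivalent would be a Win32 critical section or mutex. The file name `threads.log` and the region name `log_io` are hypothetical. Compile with OpenMP enabled (for example `/Qopenmp` with Intel Fortran on Windows or `-fopenmp` with gfortran).

```fortran
program critical_write_demo
  use omp_lib
  implicit none
  integer :: unit_no, i

  open(newunit=unit_no, file='threads.log', status='replace', action='write')

  !$omp parallel do
  do i = 1, 8
    ! Only one thread at a time may execute the write, so the unit's
    ! buffer is never used by two threads at once.
    !$omp critical (log_io)
    write(unit_no, '(a,i0,a,i0)') 'iteration ', i, ' from thread ', omp_get_thread_num()
    !$omp end critical (log_io)
  end do
  !$omp end parallel do

  close(unit_no)
end program critical_write_demo
```

The log receives eight complete lines; their order depends on thread scheduling.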
onkelhotte
New Contributor II
Tim,

the read and write statements are on a variable, not a file:

[fortran]
subroutine setSequenceUpdateDrives(SequenceNumber)
  use globaleVariablen
  use iflogm
  implicit none

  integer(kind=2) SequenceNumber
  logical(kind=4) l
  character*255 text

  include 'resource_online.fd'

  write(text,'(i)') SequenceNumber
  l=dlgSet(dlgTabPort20102, IDC_EDIT_StatusPort20102, trim(adjustl(text)))

  return
end subroutine setSequenceUpdateDrives
[/fortran]

The access violation occurs in the line "write(text,'(i)') SequenceNumber".
TimP
Honored Contributor III
You still have a race condition on the text variable, since you didn't declare the procedure RECURSIVE or set /Qauto. Even if you take reasonable precautions, you would also depend on the implementation of internal write being thread safe. If you are setting a shared variable here, you would need to establish atomic access to it anyway.
ZlamalJakub
New Contributor III
Do you compile and link your sources with the multithreaded runtime libraries (/threads)? Also use RECURSIVE or set /Qauto, as suggested in the previous responses.
onkelhotte
New Contributor II
Thanks again for all the suggestions. I got a little bit further, but I couldn't eliminate the access violation, although it got "better": I ran my program and the routine got called over eleven thousand times before the access violation occurred.

I set the /Qauto flag and /threads too; before, the project was only set to Debug QuickWin (/libs:qwin /dbglibs). I declared my subroutines as recursive and set /recursive in the project properties.

Basically, my program receives data via WinSock every 2 seconds. There are 4 WinSock threads listening to different ports. The data are being sent from a .NET program.

In a WinSock thread I do this (always with different structures and different subroutines for displaying):

[fortran]
! got data in EA_Telegramm_816 struct
write(port,'(a,i6,x,i4,a,5(i2,a),i4)',iostat=iError) 'Message received, #', &
    EA_Telegramm_816%DBX240, EA_Telegramm_816%DBX280, ".", &
    EA_Telegramm_816%DBX300, ".", EA_Telegramm_816%DBX320, "-", &
    EA_Telegramm_816%DBX340, ":", EA_Telegramm_816%DBX360, ":", &
    EA_Telegramm_816%DBX380, ".", EA_Telegramm_816%DBX400
flush(port)
call setSequenceUpdateComm(EA_Telegramm_816%DBX240)
! waiting again
[/fortran]

[fortran]
recursive subroutine setSequenceUpdateComm(SequenceNumber)
  use globaleVariablen
  use iflogm
  implicit none

  integer(kind=2) SequenceNumber
  logical(kind=4) l
  character*255 textComm

  include 'resource_online.fd'

  write(textComm,'(i)') SequenceNumber
  l=dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))

  return
end subroutine setSequenceUpdateComm
[/fortran]

I'm logging some data; port is a text file. Then I want to display a counter in my dialog.

Adding the flush statement helped. Removing the write statement helps very much. But there are still access violations in the subroutines setSequenceUpdate_xxx, and sometimes in other subroutines or functions where I use a write or read statement.

Another suggestion by TimP was to do some OpenMP stuff, which I haven't used before. I tried

[fortran]
!$OMP ATOMIC WRITE
write(textDrives,'(i)') SequenceNumber
l=dlgSet(dlgTabPort20102, IDC_EDIT_StatusPort20102, trim(adjustl(textDrives)))
[/fortran]

but this doesn't help. Am I using atomic correctly, or do I have to do something else or something different?

Thanks in advance,
Markus
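On the ATOMIC question: in OpenMP, `ATOMIC WRITE` applies to a single scalar assignment statement, so it cannot guard an I/O statement such as `write(...)`. The construct for protecting a block of statements is `CRITICAL`. In addition, OpenMP directives are plain comments unless the code is compiled with OpenMP enabled (`/Qopenmp` with Intel Fortran on Windows, `-fopenmp` with gfortran). The sketch below contrasts the two constructs using OpenMP threads; the variable `latest` and the region name `format_io` are hypothetical. Whether an OpenMP critical region also coordinates threads created directly with CreateThread depends on the implementation, so treat that as an open assumption rather than something to rely on.

```fortran
program atomic_vs_critical
  implicit none
  integer :: i, latest
  character(len=32) :: text

  latest = 0

  ! text is private, so every thread formats into its own buffer.
  !$omp parallel do private(text)
  do i = 1, 100
    ! ATOMIC WRITE protects exactly one scalar assignment.
    !$omp atomic write
    latest = i

    ! An I/O statement needs a critical region instead.
    !$omp critical (format_io)
    write(text, '(i0)') i
    !$omp end critical (format_io)
  end do
  !$omp end parallel do

  print *, 'a value stored by some iteration: ', latest
end program atomic_vs_critical
```

Neither construct helps if the underlying problem is elsewhere, for example in how the threads are created or in which thread touches the dialog.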
jimdempseyatthecove
Honored Contributor III
Markus,

The description of your experience leads me to suspect that a call to a system function or C/C++ routine is passing an argument incorrectly: a reference versus a value, or the address of a pointer/descriptor versus the address it points to (the array address described by the descriptor). Such incorrect argument usage can be functional for the call itself .AND. still corrupt something elsewhere in your code (inside the library function for write/read).

This can happen quite easily if you do not use the provided interface modules (or if there is an error in one of the interface declarations in said module). In other words, should you write your own interface, or use none (FORTRAN default calling parameters), you run the risk of getting something wrong.

For example, what is the interface for dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))? Is it expecting an ASCIIZ string pointer?

Jim Dempsey
onkelhotte
New Contributor II
Hi Jim,

dlgSet is a QuickWin function, and the arguments are okay (integer, integer, character). But I use system functions to create the threads... I made a mistake with the CreateThread function, and this would explain it.

I'm running a test overnight now. I'll tell you tomorrow what happened.

Thanks,
Markus
onkelhotte
New Contributor II
Getting the CreateThread call right helped a lot, but I still have to remove the write statement that logs into a text file. Tonight the program ran for 14 hours without crashing.

Here is how I implemented it now:

[fortran]
subroutine startThread

  integer(INT_PTR_KIND()) :: threadID_PortComm
  integer(INT_PTR_KIND()) ThreadHandle_PortComm
  integer(INT_PTR_KIND()), PARAMETER :: securityComm = 0
  integer(INT_PTR_KIND()), PARAMETER :: stack_sizeComm = 0
  integer(kind=2) portComm

  interface
    integer(kind=4) function WinsockComm(port)
      !DEC$ ATTRIBUTES STDCALL, ALIAS:"_winsockcomm" :: WinsockComm
      integer(kind=2), pointer :: port
    end function
  end interface

  !...

  ThreadHandle_PortComm = CreateThread(securityComm, stack_sizeComm, WinsockComm, &
      loc(portComm), CREATE_SUSPENDED, ThreadID_PortComm)
  i = SetThreadPriority(ThreadHandle_PortComm, THREAD_PRIORITY_BELOW_NORMAL)
  i = ResumeThread(ThreadHandle_PortComm)

end subroutine

integer(kind=4) function WinsockComm(port)
  !DEC$ ATTRIBUTES STDCALL, ALIAS:"_winsockcomm" :: WinsockComm
  integer(kind=2), pointer :: port

  ! ...

  ! still have to exclude this write statement
  !write(port,'(a,i6,x,i4,a,5(i2,a),i4)',iostat=iError) 'Message received, #', &
  !    EA_Telegramm_816%DBX240, EA_Telegramm_816%DBX280, ".", &
  !    EA_Telegramm_816%DBX300, ".", EA_Telegramm_816%DBX320, "-", &
  !    EA_Telegramm_816%DBX340, ":", EA_Telegramm_816%DBX360, ":", &
  !    EA_Telegramm_816%DBX380, ".", EA_Telegramm_816%DBX400
  write(textComm,'(i)') EA_Telegramm_816%DBX240
  l=dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))

  ! ...

end function WinsockComm
[/fortran]

Markus
jimdempseyatthecove
Honored Contributor III
Does your WinsockComm thread perform blocking I/O (e.g. a read that waits for data), or does it perform polling I/O (a compute loop waiting to see the arrival of data)?

It would be better to perform blocking I/O (read the socket with a wait for data or a large timeout). In that case you would not set the priority low; you could make it above normal, since the thread will be waiting almost all of the time.

Jim Dempsey
onkelhotte
New Contributor II
I'm such a fool... The problem is (or rather was) that I accessed the QuickWin dialog from another thread. Moving the dlgSet(...) call out of the Winsock thread and into the main thread solved my issue.

Markus
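The shape of that fix can be sketched as follows: worker threads only store the values they receive, and the main thread alone formats the text and touches the dialog. This is a hedged illustration under several assumptions. It uses OpenMP threads instead of the Win32 threads in the real program, it hands the data over once after the workers finish rather than continuously, and `show_in_dialog`, `latest_seq`, and the simulated sequence numbers are hypothetical stand-ins (the real program calls dlgSet).

```fortran
program ui_on_main_thread
  implicit none
  integer :: port_index
  integer :: latest_seq(4)
  character(len=32) :: text

  latest_seq = 0

  ! Worker threads: each one only stores its result in its own slot.
  ! No formatting and no dialog access happens here.
  !$omp parallel do
  do port_index = 1, 4
    latest_seq(port_index) = 1000 + port_index   ! simulated received number
  end do
  !$omp end parallel do

  ! Main thread: the only place that formats text and updates the display.
  do port_index = 1, 4
    write(text, '(i0)') latest_seq(port_index)
    call show_in_dialog(port_index, trim(text))
  end do

contains

  ! Hypothetical stand-in for the dialog update done with dlgSet.
  subroutine show_in_dialog(slot, value)
    integer, intent(in) :: slot
    character(len=*), intent(in) :: value
    print '(a,i0,a,a)', 'port slot ', slot, ': ', value
  end subroutine show_in_dialog

end program ui_on_main_thread
```

Because every worker writes to a different array element, no locking is needed in this sketch. A long-running program would need a continuous hand-off instead, with the shared values protected by a lock.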