I have encountered a situation where a READ statement behaves differently depending on whether the user intends to run in single-thread or multi-thread mode under OMP.
The Code proceeds:
Initialization - opens and reads control and parameter files, including a variable specifying how many threads to run with in the parallel portion of the program.
Data Input - opens and reads the main data file. The binary file contains some 110,000 records of 150ish fields each. The data is used to populate SECTIONMEM, a derived-type array stored in memory, not on disk.
Thread setup - calls OMP_GET_NUM_PROCS to determine how many CPUs are available (8). Uses this to limit the number of threads requested in the OMP_SET_NUM_THREADS call.
Processing loop - outside loop through time periods, inside loop through each element of SECTIONMEM. The inside loop is parallelized. This loop calls over a dozen subroutines, utilizes a number of ThreadPrivate common blocks, has system-shared data in public common blocks, and performs some 80% of the processing.
- -
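The thread setup step described above can be sketched roughly as follows. This is only an illustration, not the actual program: `requested_threads` is a hypothetical name for the value read from the control file, and it is hard-coded here so the sketch stands alone. It assumes the compiler's OpenMP option is enabled (for example `-qopenmp` or `/Qopenmp` with Intel Fortran, `-fopenmp` with gfortran).

```fortran
program thread_setup_sketch
  use omp_lib
  implicit none
  integer :: requested_threads   ! hypothetical: value read from the control file
  integer :: available_procs
  integer :: threads_to_use

  ! Placeholder value; the real program reads this during Initialization.
  requested_threads = 4

  ! Ask the OpenMP runtime how many processors it sees.
  available_procs = omp_get_num_procs()

  ! Clamp the request to the range 1..available_procs.
  threads_to_use = max(1, min(requested_threads, available_procs))
  call omp_set_num_threads(threads_to_use)

  ! Report the team size actually used by a parallel region.
  !$omp parallel
  !$omp single
  print *, 'requested:', requested_threads, ' available:', available_procs
  print *, 'threads in parallel region:', omp_get_num_threads()
  !$omp end single
  !$omp end parallel
end program thread_setup_sketch
```

Note that nothing in this step runs before the Data Input read, which is part of why the NaN is puzzling.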
So a single executable is used for either single or multi-thread execution, as determined in the Thread setup portion of the code. When the input variable is set to "1" (single thread) the program processes and completes normally. When I specify multi-threaded operation (the input variable = 4), the READ statement in the Data input sequence malfunctions by reading NaN in the fifth input field. When single threaded operation is specified, the same executable reads a valid floating point value from the same disk file.
I had thought that the target field for the READ stmt might be the culprit, as it resides in a ThreadPrivate common block (as do the other input fields, but this one is a recent addition). The declarations all seemed in order, so I changed the target field to a variable local to the Data Input subroutine. Perhaps I erred in naming the target field for my cat, who is patently not-a-number: the multi-threaded run still reads NaN while the single threaded run reads nice numbers.
The read error occurs before any OMP-related activity takes place (other than including the OMP library and reading in the input variable), and it affects only this one field.
Any ideas on what to check next? Has anyone heard of this type of problem before?
1 Reply
As you read the data into either a ThreadPrivate common block .OR. a variable local to the Data Input subroutine, do you then copy the pertinent data to an area visible to all threads? Why not read directly into the shared data area?
Does anything bugger up the I/O unit number? Or close the data file and reopen a different file on the same unit?
Jim Dempsey
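One way to follow up on the questions above is to check the unit and the values immediately after the READ, before anything is copied elsewhere. The sketch below is a guess at how that might look, not the poster's code: the file name `sketch.bin`, the five-field record of default reals, and sequential unformatted access are all assumptions made so the example is self-contained. The real file has roughly 150 fields per record, and its layout is not given in the thread.

```fortran
program read_check_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
  implicit none
  integer, parameter :: nfields = 5      ! assumption: the real records have ~150 fields
  real :: fields(nfields)
  integer :: u, ios, i
  logical :: is_open
  character(len=256) :: fname

  ! Write a one-record test file so the sketch runs on its own.
  open(newunit=u, file='sketch.bin', form='unformatted', &
       access='sequential', status='replace', action='write')
  write(u) (real(i), i = 1, nfields)
  close(u)

  open(newunit=u, file='sketch.bin', form='unformatted', &
       access='sequential', status='old', action='read')

  ! Confirm which file the unit is connected to just before the READ.
  inquire(unit=u, opened=is_open, name=fname)
  print *, 'unit', u, ' opened: ', is_open, ' file: ', trim(fname)

  ! Read one record and report any I/O error status.
  read(u, iostat=ios) fields
  if (ios /= 0) print *, 'READ failed, iostat =', ios

  ! Test each field for NaN straight after the READ.
  do i = 1, nfields
    if (ieee_is_nan(fields(i))) print *, 'field', i, ' is NaN right after the READ'
  end do

  close(u, status='delete')
end program read_check_sketch
```

If the NaN already shows up at this point in the multi-thread run, the READ itself or the unit connection is suspect. If it only appears later, the copy into shared or ThreadPrivate storage is the more likely place to look.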
