Intel® Fortran Compiler

Shared object run time error when built with OpenMP

Greg_T_

I have a Fortran shared object (*.so) that has a run time error when a routine in the shared object is called from the GUI; the shared object is built using the -qopenmp option.  Without OpenMP directives in the Fortran source code or compile options in the make file, the serial version of the Fortran shared object works correctly.  The run time error also occurs when I compile and link with the -qopenmp-stubs option.

The routines in the "big" Fortran shared object are called from a C# GUI run under Mono on Linux, which had been working until I added OpenMP directives to one subroutine.  The OpenMP directives are in a different routine, one that VTune identified as a hot spot; the routine being called does not have OpenMP directives.  The same C# GUI and Fortran source code build a *.dll with OpenMP and run correctly on Windows.

A small test program with a C# GUI successfully calls a routine in a small Fortran shared object that uses OpenMP directives and is built with the -qopenmp and -liomp5 options, so it appears this approach will work.

One difference between the "big" shared object and the small test program is that the big shared object is linked to other Fortran shared object files that are not compiled with the OpenMP option.  The test program has just one Fortran shared object.

The "big" Fortran shared object that has the run time error just gives the shared object file name when the C# exception is caught, so I didn't get a helpful message or clue there.  The Log4Net logging output from C# is written just before the call to the Fortran routine.  The logging output file I write at the very beginning of the Fortran routine is not opened, so it looks to me like the problem occurs at the call from C# to Fortran (which works correctly without OpenMP).  This seems somewhat like the "dll not found error" on Windows when a run time dependency is not being found when a routine in the DLL is called.  

In the Linux make file I'm using these options to compile all of the Fortran source code files to build the "big" shared object and test program shared object.  In the big shared object just one routine has OpenMP syntax:

    -fPIC -O2 -fpconstant -warn declarations -warn unused -fpp -qopenmp

And options in the make file for the link command for both the big shared object and test program:

    -qopenmp -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5

The variables and link command in the Linux make file to build the "big" shared object that I'll call "libMySharedObject.so":

compiler = ifort
objects = [list of the *.o files in the project...]
options = -qopenmp -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5
libs = [list of the other *.so files...]
$(compiler) -o libMySharedObject.so $(objects) $(options) $(libs)

Q1: Could linking with other *.so files that do not use OpenMP be a source of the run time problem?
Q2: Maybe I'm missing a link option in the make file or an OpenMP library on Linux?
Q3: Perhaps the -liomp5 option is not adequate to link the libiomp5.a dependency, or conflicts with -fPIC or -static-intel options?
Q4: Is linking sensitive to the order of objects, options, and other *.so shared object libraries?  I've tried rearranging the order of the options but that hasn't corrected the run time error.
Q5: Is there other diagnostic information or logging output that I could add to get a helpful clue to the run time error?

I realize this is probably insufficient information to start with, so I can add more details.  I could also post the working test program.  Since the small test program and shared object with OpenMP works but does not reproduce the run time error, I'll work to extend the test example to see if I can get a reproducing case I could post.

On the Linux Computer: Red Hat 6.8, Intel Fortran 16.0.3.210,
I confirmed that the files libiomp5.a libiomp5.dbg libiomp5.so are in the /opt/intel/lib/intel64/ directory.

Thanks for your help and advice.
Regards,
Greg T.

jimdempseyatthecove

All Fortran subroutines and functions (whether in a .o or a .so) that contain local arrays and/or user-defined types, and that are called from within a parallel region, must be compiled with -qopenmp or with another option that places local arrays on the stack (-recursive, -auto, ...). This requirement holds even when the called code has no OpenMP directives. Otherwise the local array is implicitly SAVE, which means it is shared between threads. (In C/C++, by contrast, such locals are implicitly located on the stack.)

Jim Dempsey

Greg_T_

Hi Jim,

Would the requirement to compile with -qopenmp extend to other subroutines that do not have a parallel region, or routines in a separate shared object?  My project will probably not have OpenMP directives in all subroutines, nor in all Fortran shared objects.  Could I compile just the subroutines using OpenMP directives with the -qopenmp option, and omit that option for other subroutines that do not use OpenMP?

I believe I've reproduced the problem with a small test program.  

The test program generates a list of random numbers as a proxy for calculations, and as a way to show that the values change when run again.  I've separated the program into 3 parts to test OpenMP options and shared objects: (1) main program, (2) first shared object to generate values, (3) second shared object to output values.  The main program gathers inputs for the number of values to generate and number of threads to use, and it calls the first subroutine in the first shared object.  After generating the random numbers in the parallel do loop, a second subroutine is called to output the values to a text file, which is in the second shared object library to test the extra dependency and the options: -qopenmp -liomp5 (see make files below).  

The program runs correctly on Linux if I build the second shared object with the -qopenmp and -liomp5 options, even though the second shared object library does not use OpenMP syntax.  If I omit those options for the second shared object, I get a segmentation fault and core dump from the main Fortran driver on Linux when the driver calls the first routine in the first shared object.

On Windows if I omit the /Qopenmp for the second DLL the program runs correctly.  It looks like there is different behavior for OpenMP in the two shared objects on Linux versus the DLLs on Windows, which is maybe not unexpected for two different OS.  I'd like to understand the difference in compiling and linking OpenMP on Windows versus Linux so that I can use the same source code to build and run on both OS.

If it helps to explain my question, here is the source code and make files for the test program.
The main driver source code, main.f90:

!
! Main program to test random number generator located in a *.dll Windows DLL
! or in a *.so shared object on Linux
!
	program main
!
! declare variables
!
	implicit none
	integer seed_flag			! 0=default random seed, 1=re-initialize random seed
	integer i					! loop index
	integer io					! file unit number
	integer dimTable,numTable	! array size, number of values in the array
	integer numThreads			! number of threads to use in the OpenMP region in the DLL routine
	integer, allocatable :: table(:)		! array to store random values
!
! initialize values
!
	io = 55		! file unit number
!
! read input:
! read the seed flag value: 0=default random seed, 1=re-initialize random seed,
! read the number of values to generate, use to allocate the array size
!
	write(*,*)'---- Begin Random Integer Program ----'
	write(*,*)
	write(*,*)'Enter the random number seed flag: 0=default, 1=re-initialize seed'
	read(*,*)seed_flag
	write(*,*)
	write(*,*)'Enter the number of random values to generate:'
	read(*,*)numTable
	write(*,*)
	write(*,*)'Enter the number of threads to use in the OpenMP region,'
	write(*,*)'typically 1 to 8 threads:'
	read(*,*)numThreads

	dimTable = numTable		! set array size equal to the number of values
!
! allocate an array to store the list of random values generated by the DLL routine
!
	allocate(table(dimTable))
	table = 0
!
!--parameters for accessing a Dynamic Link Library (DLL), should be ignored for a Linux shared object
!DEC$ATTRIBUTES DLLIMPORT :: random_integer_openmp
!DEC$ATTRIBUTES ALIAS : 'random_integer_openmp' :: random_integer_openmp
!
! call the routine in the DLL on Windows or in the shared object *.so on Linux
!
	write(*,*)
	write(*,*)'             seed flag =',seed_flag
	write(*,*)'      number of values =',numTable
	write(*,*)'	    number of threads =',numThreads
	write(*,*)
	write(*,*)'Generate random numbers'

	call random_integer_openmp(seed_flag,table,dimTable,numTable,numThreads)
!
! report a few values to the console
!
	write(*,*)
	write(*,*)'Writing values to file "random_values.out"'
	write(*,*)'    first random value =',table(1)
	write(*,*)'     last random value =',table(numTable)
!
! report the random values to a text file
!
	open(unit=io,file='random_values.out',status='replace')

	write(io,*)'Random integer values'
	write(io,*)'           seed flag =',seed_flag
	write(io,*)'    number of values =',numTable
	write(io,3)'case','value'
3	format(1x,a6,2x,a10)

	do i=1,numTable
		write(io,2)i,table(i)
2		format(1x,i6,2x,i10)
	end do

	close(unit=io)
!
! deallocate array and exit
!
	if(allocated(table))deallocate(table)

	write(*,*)'---- End Random Integer Program ----'

	end

The main driver Linux make file:

#!/bin/bash
#
# makefile to build the main program and link to the shared object library
# to run on Linux,
# expecting the Intel Fortran compiler environment variables to be initialized
#
# define some variables,
# use "\" as a continuation line
#
objects = main.o
# on Linux the main program may look for the shared object library in
# RandomNumbersOpenmp.so or libRandomNumbersOpenmp.so
shared = libRandomNumbersOpenmp.so
compiler = ifort
options = -L. -static-intel -Wl,-rpath,.
#
# the -L. option looks for shared libraries in the current directory "."
# the -Wl option passes the -rpath option to the linker
# the linker -rpath,. option looks for files in the current directory "."
#
# compiler options for Linux (similar to Windows),
# the -fPIC for "position independent code", option is needed for each subroutine
# when building the shared object,
# O2 for optimization for speed,
# fpconstant to evaluate single-precision constants as double precision,
# warn:declarations to check for undeclared names,
# warn:unused to check for declared variables that are never used
#
opt1 = -O2 -fpconstant -warn declarations -warn unused
#
# compile and link the main.run executable
#
main.run: $(objects)
	$(compiler) -o main.run $(objects) $(options) $(shared)

main.o: main.f90
	$(compiler) -c main.f90 $(opt1)

The main driver Windows make file (or build in a VS solution):

#!/bin/bash
#
# makefile to build the main program and link to the DLL on Windows,
# open the initialized Intel Parallel Studio console (CMD) window so that
# the Fortran environment variables are set.
#
# use "nmake" command on Windows: nmake -f make_file_name
#
# define some variables,
# some of the options obtained from the Visual Studio project properties,
# use "\" as a continuation line
# on Windows specify the export lib file from the DLL
#
objects = main.obj
shared = RandomNumbersOpenmp.lib
compiler = ifort
options = /nologo /threads
# compiler options for Windows (similar to Linux),
# O2 for optimization for speed,
# fpconstant to evaluate single-precision constants as double precision
# warn:declarations to check for undeclared names,
# warn:unused to check for declared variables that are never used
#
opt1 = /O2 /warn:declarations /warn:unused /fpconstant
#
# compile and link the main.exe executable program,
# use the /exe:filename option to set the name of the executable file
#
main.run: $(objects)
	$(compiler) /exe:main.exe $(objects) $(options) $(shared)

main.obj: main.f90
	$(compiler) -c main.f90 $(opt1)

The first shared object source code, random_integer_openmp.f90:

!
! Return the array of random integer values,
! set numValues to the desired number of random numbers to generate;
! set seed_flag = 1 to reinitialize the random seed, 0 keeps the current seed,
! pass in the number of threads to use in the OpenMP parallel region and loop
!
	subroutine random_integer_openmp(seed_flag,values,dimValues,numValues,numThreadsUse)
!
! Declare Variables
!
	implicit none
	integer, intent (in) :: seed_flag				! 0=default seed, 1=re-initialize the random number seed
	integer, intent (in) :: dimValues				! array size
	integer, intent (in) :: numValues				! number of values to return, numValues <= dimValues
	integer, intent (in) :: numThreadsUse			! number of threads to use in the OpenMP parallel region
	integer, intent (inout) :: values(dimValues)	! array to return random values
!
!--parameters for compiling as a Dynamic Link Library (DLL), should be ignored for a Linux shared object
!DEC$ATTRIBUTES DLLEXPORT :: random_integer_openmp
!DEC$ATTRIBUTES ALIAS : 'random_integer_openmp' :: random_integer_openmp
!--additional attributes for use with Visual Basic or C#
! xx DEC$ATTRIBUTES STDCALL :: random_integer_openmp
!DEC$ATTRIBUTES REFERENCE :: seed_flag,values,dimValues,numValues,numThreadsUse
!
! Local Variables
!
	integer debug_level,i
	integer logFlag,logUnit,mflag
	integer max_threads,num_threads
	real x
	real timeStart,timeFinish
	character(LEN=512) logFile,message
!
! declare functions,
! declare OpenMP run time functions
!
	integer OMP_GET_MAX_THREADS,OMP_GET_NUM_THREADS,OMP_GET_THREAD_NUM
	real(8) OMP_GET_WTIME
!
! initialize log file name, file unit number, and local debug flag
!
	logFlag = 10
	logFile = 'random_integer_openmp.log'
	debug_level = 10
	logUnit = 51
	mflag = 0
	message = ' '

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,status='unknown')
		write(logUnit,*)'----- Begin random_integer_openmp -----'
		write(logUnit,*)'                seed flag =',seed_flag
		write(logUnit,*)'               array size =',dimValues
		write(logUnit,*)'         number of values =',numValues
		write(logUnit,*)'    use number of threads =',numThreadsUse
		close(unit=logUnit)
	end if
!
! Error checking
!
	if(numValues.le.0)then
		mflag = 2
		message = 'ERROR: the number of requested trials = 0 (random_integer_openmp).'
	else if(numValues.gt.dimValues)then
		mflag = 2
		message = 'ERROR: too many requested trials; the number of requested trials > dimValues (random_integer_openmp).'
	end if
	if(mflag.ge.1 .and. logFlag.ge.1)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)TRIM(message)
		write(logUnit,*)'          mflag =',mflag
		write(logUnit,*)'      numValues =',numValues
		write(logUnit,*)'      dimValues =',dimValues
		write(logUnit,*)
		close(unit=logUnit)
	end if
	if(mflag.ge.2)goto 900		! jump to exit
!
! Initialize Values
!
	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)'initialize values() array to zero'
		close(unit=logUnit)
	end if

	values = 0

	if(seed_flag.eq.1)then
		if(logFlag.ge.debug_level)then	! log file output
			open(unit=logUnit,file=logFile,position='append')
			write(logUnit,*)
			write(logUnit,*)'initialize random seed'
			close(unit=logUnit)
		end if

		call RANDOM_SEED
	end if
!
! generate the random number values and save as an integer value to the values() array
!
	timeStart = OMP_GET_WTIME()
!
! set the number of threads to use in the OpenMP parallel region below,
! check if more than max available threads
!
	max_threads = OMP_GET_MAX_THREADS()

	if(numThreadsUse.lt.max_threads)then
		call OMP_SET_NUM_THREADS(numThreadsUse)
	end if

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)'      max_threads =',max_threads
		write(logUnit,*)'    numThreadsUse =',numThreadsUse
		close(unit=logUnit)
	end if
!
! open the log file before the parallel region
!
	if(logFlag.ge.debug_level)then
		open(unit=logUnit,file=logFile,position='append')
	end if
!
! include the values() array in the OMP SHARED clause so the values are available
! after the parallel region
!
!! !$OMP PARALLEL SHARED(num_threads,max_threads,values) NUM_THREADS(numThreadsUse)
!$OMP PARALLEL SHARED(num_threads,max_threads,values)
!$OMP MASTER
	num_threads = OMP_GET_NUM_THREADS()

	if(logFlag.ge.debug_level)then	! log file opened before parallel region
		write(logUnit,*)
		write(logUnit,*)'      num_threads =',num_threads
		write(logUnit,*)
	end if
!$OMP END MASTER

!$OMP DO PRIVATE(i,x)
	do i=1,numValues
		call RANDOM_NUMBER(x)			! use random number in place of a calculation
		values(i) = nint(1000.0*x)
!
!			debugging output,
!			write value and thread ID to the log file to check multi threading
!			TODO: comment out critical section when debugging output is not needed
!
!$OMP CRITICAL
		if(logFlag.ge.debug_level)then	! log file opened before parallel region
			write(logUnit,*)'  index =',i,'  value =',values(i),'  thread ID =',OMP_GET_THREAD_NUM()
		end if
!$OMP END CRITICAL
	end do
!$OMP END DO

!$OMP MASTER
	num_threads = OMP_GET_NUM_THREADS()	! get the number of threads used

	if(logFlag.ge.debug_level)then	! log file output
		write(logUnit,*)
		write(logUnit,*)'    num threads used =',num_threads
		write(logUnit,*)
	end if
!$OMP END MASTER
!$OMP END PARALLEL
!
! close the log file after the parallel region
!
	if(logFlag.ge.debug_level)then
		close(unit=logUnit)
	end if

	timeFinish = OMP_GET_WTIME()

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)'    time duration =',timeFinish - timeStart
		write(logUnit,*)
		write(logUnit,*)'random integer values:'
		do i=1,numValues
			write(logUnit,*)i,values(i)
		end do
		close(unit=logUnit)
	end if
!
! call a routine in another DLL or shared object to output the random numbers,
! add an external dependency to this file for testing
!
!--parameters for accessing a Dynamic Link Library (DLL), should be ignored for a Linux shared object
!DEC$ATTRIBUTES DLLIMPORT :: output_values
!DEC$ATTRIBUTES ALIAS : 'output_values' :: output_values
!
	call output_values(values,dimValues,numValues,logFlag)
!
! Jump here on an error
!
900	continue

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)'----- End random_integer_openmp -----'
		close(unit=logUnit)
	end if

	return
	end subroutine random_integer_openmp

The first shared object Linux make file:

#!/bin/bash
#
# makefile to build the shared object library *.so file on Linux,
# expecting the Intel Fortran compiler environment variables to be initialized
#
# use command: make -f make_file_name
# for this make file use command:
#    make -f makefile_so_library.mak
#
# define some variables for source file and options,
# use "\" as a continuation line if needed
#
objects = random_integer_openmp.o
compiler = ifort
options = -qopenmp -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5
#
# dependency libraries for other DLLs or shared objects called from this library
# on Linux:
# put the dependency shared object libraries in the same directory as the source code
#
libs = libOutputValues.so
#
# compiler options for Linux (similar to Windows),
# -fPIC for "position independent code", option is needed for each subroutine
# when building the shared object, specific to Linux,
# -shared to produce a dynamic shared object (*.so) instead of an executable,
# -static-intel to link Intel libraries statically to the compiled code,
# -O2 for optimization for speed,
# -fpconstant to evaluate single-precision constants as double precision,
# -warn declarations to check for undeclared names,
# -warn unused to check for declared variables that are never used,
# -liomp5 for the -l link option to include reference to the libiomp5.a library
# usually located in the /opt/intel/lib/intel64/ directory,
# -qopenmp enables multi-threaded code using OpenMP directives in the source code,
# include in both the compiling "opt1" and linking "options" makefile variables
#
# to search for other shared object files include the -L. option
# and the -Wl,-rpath,. option,
# the -L. option looks for shared libraries in the current directory "."
# the -Wl option passes the -rpath option to the linker
# the linker -rpath,. option looks for files in the current directory "."
#
opt1 = -qopenmp -fPIC -O2 -fpconstant -warn declarations -warn unused
#
# compile and link the source files to get the shared object library RandomNumbersOpenmp.so,
# add the "lib" prefix so that Linux can find the shared object library
#
RandomNumbersOpenmp.so: $(objects)
	$(compiler) -o libRandomNumbersOpenmp.so $(objects) $(options) $(libs)

random_integer_openmp.o: random_integer_openmp.f90
	$(compiler) -c random_integer_openmp.f90 $(opt1)

Windows make file:

#!/bin/bash
#
# makefile to build the dynamic link library *.dll file on Windows,
# open the initialized Intel Parallel Studio console (CMD) window so that
# the Fortran environment variables are set.
#
# use "nmake" command on Windows: nmake -f make_file_name
#
# define some variables,
# some of the options obtained from the Visual Studio project properties,
# use "\" as a continuation line
#
objects = random_integer_openmp.obj
compiler = ifort
options = /nologo /libs:dll /threads
#
# compiler options for Windows (similar to Linux),
# O2 for optimization for speed,
# fpconstant to evaluate single-precision constants as double precision
# warn:declarations to check for undeclared names,
# warn:unused to check for declared variables that are never used
# /Qopenmp for OpenMP multi-threading support
#
opt1 = /O2 /warn:declarations /warn:unused /fpconstant /Qopenmp
#
# compile and link the source files to build the dll "RandomNumbersOpenmp.dll",
# use the /dll option to get a DLL instead of an EXE,
# use the /exe:filename option to give the name of the DLL file
#
RandomNumbersOpenmp.dll: $(objects)
	$(compiler) /dll /exe:RandomNumbersOpenmp.dll $(objects) $(opt1) $(options)

random_integer_openmp.obj: random_integer_openmp.f90
	$(compiler) /c random_integer_openmp.f90 $(opt1)

The second shared object source code, output_values.f90:

!
! Output the array of integer values to a text file;
! use with the random numbers example program as an additional dependency for testing.
!
	subroutine output_values(values,dimValues,numValues,logFlag)
!
! Declare Variables
!
	implicit none
	integer, intent (in) :: dimValues		! values() array row size
	integer, intent (in) :: numValues		! number of values to write, expect numValues <= dimValues
	integer, intent (in) :: logFlag			! logFlag > 0 to activate the debug log file
	integer, intent (in) :: values(dimValues)		! array of values to write to text file
!
!--parameters for compiling as a Dynamic Link Library (DLL), should be ignored for a Linux shared object
!DEC$ATTRIBUTES DLLEXPORT :: output_values
!DEC$ATTRIBUTES ALIAS : 'output_values' :: output_values
!--additional attributes for use with Visual Basic or C#
! xx DEC$ATTRIBUTES STDCALL :: output_values
!DEC$ATTRIBUTES REFERENCE :: values,dimValues,numValues,logFlag
!
! Local Variables
!
	integer debug_level,row
	integer logUnit,mflag
	integer io
	character(len=512) logFile,message,outfile
!
! Initialize Values
!
	logFile = 'output_values_debug.log'		! log file name
	debug_level = 1							! write to debug log file when logFlag >= debug_level
	logUnit = 52							! log file unit number
	mflag = 0								! mflag=1=warning, 2=error, check message string
	message = ' '							! error or warning message

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,status='unknown')
		write(logUnit,*)'----- Begin output_values -----'
		write(logUnit,*)'      number of values =',numValues
		write(logUnit,*)'        maximum values =',dimValues
		close(unit=logUnit)
	end if
!
! Error checking
!
	if(numValues.gt.dimValues)then
		mflag = 2
		message = 'ERROR: too many values to output; numValues > dimValues (output_values).'
	end if

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)trim(message)
		write(logUnit,*)'                 mflag =',mflag
		write(logUnit,*)'      number of values =',numValues
		write(logUnit,*)'        maximum values =',dimValues
		write(logUnit,*)
		close(unit=logUnit)
	end if
	if(mflag.ge.2)goto 900
!
! open the output file and write the table of values
!
	io = 53
	outfile = 'output_value_table.txt'

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)
		write(logUnit,*)'      output file unit =',io
		write(logUnit,*)'      output file name =',trim(outfile)
		close(unit=logUnit)
	end if

	open(unit=io,file=outfile,status='unknown')

	write(io,*)'---- table of integer values ----'
	write(io,*)
	write(io,*)'     number of values =',numValues
	write(io,*)
	write(io,1)'index','values'
1	format(a12,a16)

	do row=1,numValues
		write(io,2)row,values(row)
	end do
2	format(i12,i16)

	close(unit=io)
!
! Jump here on an error
!
900	continue

	if(logFlag.ge.debug_level)then	! log file output
		open(unit=logUnit,file=logFile,position='append')
		write(logUnit,*)'----- End output_values -----'
		close(unit=logUnit)
	end if

	return
	end subroutine output_values

The Linux make file:

#!/bin/bash
#
# makefile to build the shared object library *.so file on Linux,
# expecting the Intel Fortran compiler environment variables to be initialized
#
# use command: make -f make_file_name
# for this make file use command:
#    make -f makefile_so_library.mak
#
# define some variables for source file and options,
# use "\" as a continuation line if needed
#
objects = output_values.o
compiler = ifort
# test 1, omit the -qopenmp and -liomp5 options for this library
#options = -shared -fPIC -static-intel -L. -Wl,-rpath,.
# test 2, omit -qopenmp and include -liomp5 options for this library
#options = -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5
# test 3, include -qopenmp and include -liomp5 options for this library (source file does not use OpenMP directives)
options = -qopenmp -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5
# 
#options = -qopenmp -shared -fPIC -static-intel -L. -Wl,-rpath,. -liomp5
#
# compiler options for Linux (similar to Windows),
# -fPIC for "position independent code", option is needed for each subroutine
# when building the shared object, specific to Linux,
# -shared to produce a dynamic shared object (*.so) instead of an executable,
# -static-intel to link Intel libraries statically to the compiled code,
# -O2 for optimization for speed,
# -fpconstant to evaluate single-precision constants as double precision,
# -warn declarations to check for undeclared names,
# -warn unused to check for declared variables that are never used,
#
# omit these options for this library:
# -liomp5 for the -l link option to include reference to the libiomp5.a library
# usually located in the /opt/intel/lib/intel64/ directory,
# -qopenmp enables multi-threaded code using OpenMP directives in the source code,
# include in both the compiling "opt1" and linking "options" makefile variables
#
# to search for other shared object files include the -L. option
# and the -Wl,-rpath,. option,
# the -L. option looks for shared libraries in the current directory "."
# the -Wl option passes the -rpath option to the linker
# the linker -rpath,. option looks for files in the current directory "."
#
# test 1, test 2, omit the -qopenmp option for this library
#opt1 = -fPIC -O2 -fpconstant -warn declarations -warn unused
# test 3, include the -qopenmp option for this library
opt1 = -qopenmp -fPIC -O2 -fpconstant -warn declarations -warn unused
#
#opt1 = -qopenmp -fPIC -O2 -fpconstant -warn declarations -warn unused
#
# compile and link the source files to get the shared object library OutputValues.so,
# add the "lib" prefix so that Linux can find the shared object library
#
OutputValues.so: $(objects)
	$(compiler) -o libOutputValues.so $(objects) $(options)

output_values.o: output_values.f90
	$(compiler) -c output_values.f90 $(opt1)

The Windows make file:

#!/bin/bash
#
# makefile to build the dynamic link library *.dll file on Windows,
# open the initialized Intel Parallel Studio console (CMD) window so that
# the Fortran environment variables are set.
#
# use "nmake" command on Windows: nmake -f make_file_name
# for the make file use:
#    nmake -f makefile_dll_library_win.mak
#
# define some variables,
# some of the options obtained from the Visual Studio project properties,
# use "\" as a continuation line
#
objects = output_values.obj
compiler = ifort
options = /nologo /libs:dll /threads
#
# compiler options for Windows (similar to Linux),
# O2 for optimization for speed,
# fpconstant to evaluate single-precision constants as double precision
# warn:declarations to check for undeclared names,
# warn:unused to check for declared variables that are never used
# /Qopenmp for OpenMP multi-threading support, omit for this library
#
opt1 = /O2 /warn:declarations /warn:unused /fpconstant
#
# compile and link the source files to build the dll "OutputValues.dll",
# use the /dll option to get a DLL instead of an EXE,
# use the /exe:filename option to give the name of the DLL file
#
OutputValues.dll: $(objects)
	$(compiler) /dll /exe:OutputValues.dll $(objects) $(opt1) $(options)

output_values.obj: output_values.f90
	$(compiler) /c output_values.f90 $(opt1)

 

Thanks for your help.

Regards,

Greg T.

jimdempseyatthecove

>>Would the requirement to compile with -qopenmp extend to other subroutines that do not have a parallel region

Yes. I thought I made that clear in my post.

subroutine foo
real :: array(1234)

When the above is compiled without any of -qopenmp, -recursive, -auto=..., or a few other similar switches,

*** array is SAVE - meaning it is shared

When compiled with (one or more of) those options, the array is on stack (private to thread), or optionally allocated from heap (private to thread).

For threaded programs, you will generally want those arrays (and user defined types) located on the stack (or heap).
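
A minimal sketch of the point above, for illustration only. It assumes Intel Fortran's documented behavior that local variables of a RECURSIVE procedure are automatic (stack) by default; compiling the file with -qopenmp is expected to have the same effect without any source change. The names scratch_mod and demo_auto_locals are hypothetical and are not part of the project discussed in this thread.

```fortran
module scratch_mod
  implicit none
contains
  ! Assumption: RECURSIVE makes the locals automatic, so each calling
  ! thread gets its own copy of array() on its own stack.
  recursive subroutine foo(n, total)
    integer, intent(in)  :: n
    real,    intent(out) :: total
    real    :: array(1234)   ! per this thread, static (shared) under default options
    integer :: k
    do k = 1, size(array)
      array(k) = real(n)
    end do
    total = sum(array)
  end subroutine foo
end module scratch_mod

program demo_auto_locals
  use scratch_mod
  implicit none
  integer :: i
  real    :: totals(8)
  ! foo contains no OpenMP directives, but it runs inside a parallel
  ! region, so its local array must not be shared between threads.
  !$omp parallel do private(i) shared(totals)
  do i = 1, 8
    call foo(i, totals(i))
  end do
  !$omp end parallel do
  write(*,*) totals
end program demo_auto_locals
```

If array() were static, several threads could fill it at the same time and the sums could be wrong without any crash, which fits the "completed by chance" case.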

>>On Windows if I omit the /Qopenmp for the second DLL the program runs correctly

Your program may have completed by chance. To quote "Dirty Harry": Do you feel lucky?

RE: C# and OpenMP

You must be careful when programming a C# application that calls OpenMP code. C# applications tend to create dozens, hundreds, or thousands of threads. That is not much of a problem when you stay inside C#. However, each different thread (C# or other created thread) that calls a library (Fortran, C, or C++) containing OpenMP will instantiate its own OpenMP thread pool. If your system has 8 logical processors, and C# generated 100 different threads calling your library, then you will have 100 * 8 threads in 100 different OpenMP thread pools.

The fix is to code your C# such that only one C# thread calls the OpenMP code.

Jim Dempsey

jimdempseyatthecove

Greg,

Nothing in your sample code stands out as an error, except that in random_integer_openmp.f90, line 143 should have !$OMP BARRIER. Without it, another thread can enter the do loop and reach the critical section while the master thread is still writing, and now you have two threads writing to the log file. This should not have seg faulted the program, but the outputs could have mingled. You should also be aware that the random numbers written to the array will not be deterministic, meaning they can vary from run to run.
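
The barrier placement can be sketched as follows. This is a stripped-down stand-in for the posted routine, not the routine itself: the program name and the use of console output instead of a log file are assumptions for illustration. It needs to be compiled with -qopenmp (/Qopenmp on Windows).

```fortran
program demo_master_barrier
  use omp_lib
  implicit none
  integer, parameter :: n = 16
  integer :: i, values(n)

  !$omp parallel shared(values)
  !$omp master
  write(*,*) 'num_threads =', omp_get_num_threads()
  !$omp end master
  ! END MASTER has no implied barrier. Without the explicit barrier
  ! below, other threads can start the loop and write output while
  ! the master thread is still writing.
  !$omp barrier

  !$omp do private(i)
  do i = 1, n
    values(i) = i * i
    !$omp critical
    write(*,*) 'index =', i, ' thread =', omp_get_thread_num()
    !$omp end critical
  end do
  !$omp end do
  !$omp end parallel
end program demo_master_barrier
```

The critical section only serializes the writes inside the loop; it does not protect the write in the master block, which is why the barrier is still needed.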

Have you run this with full debug runtime diagnostics?

Jim Dempsey

Greg_T_
Valued Contributor I

Hi Jim,

Thank you for the additional information and confirmation.  It is good to know that compiling the second DLL on Windows without the /Qopenmp option may have just been lucky to complete.  Using the test program on Linux has helped reveal the compiling issue that I need to address.  I'll pursue building the other "big" libraries with /Qopenmp to see if that will allow the big program to run on both OS.

I'll add the !$OMP BARRIER directive after the !$OMP END MASTER and before !$OMP DO PRIVATE and run it on both OS.  Usually I would not put the log file output within the parallel loop, just for testing, and would remove or comment out the !$OMP CRITICAL and file output for release.

I'll compile with more run time diagnostics and test to see if that will provide more information when the second routine is not compiled with openmp.  I'll use /traceback and /check:all with optimization turned off /O0.  Are there other run time diagnostics that would help?

Regards,
Greg T.

jimdempseyatthecove
Honored Contributor III

Hi Greg,

Something to keep in mind.

!$OMP DO ... !$OMP END DO has implicit barrier at END DO, but not at !$OMP DO
!$OMP DO ... !$OMP END DO NOWAIT does not have implicit barrier at END DO, and not at !$OMP DO
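A minimal sketch of the pattern discussed above (set-up in a MASTER block, then an explicit barrier, then the worksharing loop). The program and variable names are hypothetical and stand in for the log-file set-up in the real code.

```fortran
! Sketch only: barrier_demo, offset and values are hypothetical names.
program barrier_demo
  implicit none
  integer, parameter :: n = 8
  integer :: i, offset
  integer :: values(n)

  offset = 0

  !$omp parallel default(shared) private(i)

  !$omp master
  offset = 100            ! set-up performed by the master thread only
  !$omp end master

  ! MASTER has no implied barrier, and !$OMP DO has none at its start,
  ! so without this the other threads could enter the loop before the
  ! master has finished the set-up.
  !$omp barrier

  !$omp do
  do i = 1, n
    values(i) = offset + i
  end do
  !$omp end do            ! implicit barrier here (no NOWAIT)

  !$omp end parallel

  print *, values
end program barrier_demo
```

Compiled without OpenMP, the directives are just comments and the program runs serially with the same result.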

>>I'll compile with more run time diagnostics and test to see if that will provide more information when the second routine is not compiled with openmp.

Any errors reported by doing the above are likely to be misleading if they do not also show up with /Qopenmp.

Jim

 

 

Greg_T_
Valued Contributor I

Hi Jim,

Adding the !$OMP BARRIER runs well on both Windows and Linux.  I found I could sometimes get I/O errors in the log file that you described on some test runs if the barrier wasn't included before the loop.  In reading about OpenMP directives, I don't remember seeing the clarification that there is not an implied barrier at the start of !$OMP DO, so that is very good to know.

I was curious whether adding the run time diagnostic options -traceback -check all to the make file used to build both libraries would list any more information, such as a call stack, for the segmentation fault error on Linux.  No additional information was written for the test case that causes the crash (second shared object library not compiled with -qopenmp or -liomp5).  It was useful to have a with and without OpenMP test case to narrow down the problem.

I feel I have improved understanding now and will work on recompiling all the shared object libraries in the "big" program to see if I can get it to run on Linux with OpenMP.

Thanks for your help.

Regards, Greg T.

jimdempseyatthecove
Honored Contributor III

OpenMP API - Version 4.5 November 2015, Page 58 statement 24:

There is an implicit barrier at the end of a loop construct unless a nowait clause is specified.

The above does not state anything about the start of the loop. This makes it implementation dependent. As a practical implementation detail, if (when) I were to use the nowait clause, I would also want the start of (any) next loop in the same region to begin immediately (without a start barrier).
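A minimal sketch of that nowait case, with hypothetical names. The two loops touch different arrays, so a thread that finishes its share of the first loop can safely start on the second without waiting.

```fortran
! Sketch only: nowait_demo, a and b are hypothetical names.
program nowait_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  real    :: a(n), b(n)

  !$omp parallel default(shared) private(i)

  !$omp do
  do i = 1, n
    a(i) = real(i)
  end do
  !$omp end do nowait     ! no barrier: threads move straight on

  !$omp do
  do i = 1, n
    b(i) = 2.0 * real(i)  ! must not read a(:), it may not be complete yet
  end do
  !$omp end do            ! implicit barrier at the end of this loop

  !$omp end parallel

  print *, sum(a), sum(b)
end program nowait_demo
```

If the second loop did depend on a(:), the NOWAIT would have to be removed or an explicit !$OMP BARRIER placed between the two loops.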

Good luck on your "big" program.

Jim Dempsey

Greg_T_
Valued Contributor I

Problem solved.  Recompiling the other Fortran shared object (*.so) libraries with the options -qopenmp -liomp5 did solve the run time segmentation fault on Linux for the "big" program.  With all four Fortran shared object libraries compiled with the same OpenMP options it runs correctly.  I found that I needed to put the libiomp5.so library in the directory with the executable and *.so libraries.

I also recompiled the DLLs on Windows (same Fortran source code) with /Qopenmp and it runs correctly there too.  Having a single set of source code that compiles and runs on both Linux and Windows is still working with OpenMP added.

Regards, Greg T.

jimdempseyatthecove
Honored Contributor III

>> I found that I needed to put the libiomp5.so library in the directory with the executable and *.so libraries

You can leave the .so libraries where they are and use the source ... that you use for development .OR.

Append the path to these libraries onto LD_LIBRARY_PATH (for Linux; DYLD_LIBRARY_PATH on Mac) or PATH for Windows

Note, if the wrong version is in LD_LIBRARY_PATH then you have an installation issue.

Jim Dempsey

Greg_T_
Valued Contributor I

Updating the LD_LIBRARY_PATH on my computer would work.  The libiomp5.so library is in the /opt/intel/lib/intel64/ directory.  When I distribute the program to the clients, I am not expecting them to have Intel Fortran installed, so would updating the library path still work to provide the libiomp5.so file on other Linux computers?  To distribute our program on Windows I can use the Intel redistributable installer to provide the dependencies.  I've been looking for a similar redistributable package for Linux, but haven't found it yet in the Fortran downloads. I feel that I'm looking in the wrong places or am missing an important detail.

I have used the -shared-intel option to include the Intel libraries when building the shared object libraries.  That worked well to distribute the program to the clients without needing to add the Intel dependency libraries.  Now that OpenMP is added, is there a reason the libiomp5 library does not get included when using the -shared-intel option?  Perhaps the libiomp5 library doesn't have a static version, or I've missed seeing a warning message?

Thanks for your help.

Regards, Greg T.

Kevin_D_Intel
Employee

The redistributables are here, https://software.intel.com/en-us/articles/intelr-composer-redistributable-libraries-by-version

Yes, there is no static OpenMP library. There was one several releases ago.

Steve_Lionel
Honored Contributor III

I thought that they kept the static OpenMP libraries on Linux. Indeed they went away on Windows several releases ago. 

Kevin_D_Intel
Employee

Beg your pardon. Steve is correct. We do still provide the static libiomp for Linux.

From what I'm seeing with 17.0 update 4, libiomp is linked dynamically by default and also with -shared-intel, and statically with -static-intel; however, there's an error when using the static library with a shared lib.

Building a shared library with -shared-intel still links against the shared version of libiomp5:

$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.4.196 Build 20170411
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

$ ifort -qopenmp -fpic -shared -o a.out -shared-intel sample_omp.f90

$ file a.out
a.out: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x2f454599fa8337a7fd705b25c5990a90e46d3620, not stripped

$ ldd a.out
        linux-vdso.so.1 =>  (0x00007fffcafb9000)
        libifport.so.5 => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libifport.so.5 (0x00007fbf30321000)
        libifcoremt.so.5 => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libifcoremt.so.5 (0x00007fbf2ff90000)
        libimf.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libimf.so (0x00007fbf2faa3000)
        libsvml.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libsvml.so (0x00007fbf2eb8a000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fbf2e82b000)
        libiomp5.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libiomp5.so (0x00007fbf2e487000)
        libintlc.so.5 => /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fbf2e21c000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbf2dfff000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fbf2dc3e000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fbf2da28000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fbf2d823000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fbf30754000)

And while I can link to the static version for an executable:

$ ifort -qopenmp -o a.out -static-intel sample_omp.f90
$ ldd a.out
        linux-vdso.so.1 =>  (0x00007ffff9dfe000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f500912a000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5008f0d000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f5008b4c000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f5009489000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f5008936000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f5008731000)

It fails when doing so with a shared library for reasons I haven't found an explanation for:

$ ifort -qopenmp -fpic -shared -o a.out -static-intel sample_omp.f90
ld: a.out: version node not found for symbol omp_get_proc_bind_@@VERSION
ld: failed to set dynamic section sizes: Bad value

 

Greg_T_
Valued Contributor I

Hi Kevin,

Thank you for the link to the redistributable packages, and for checking on the possibility of statically linking libiomp5 to a shared object. It would be convenient to statically link to libiomp5 for my shared object libraries, but I can use the libiomp5.so from the redistributable for now.  Knowing that you found an issue with static linking into shared objects helps explain why I need the libiomp5.so library.  Most of my Fortran projects are compiled as shared object libraries and using -static-intel to have all the Intel dependencies included has been very convenient.

Regards, Greg T.

Kevin_D_Intel
Employee

I found some Fortran cases related to symbols generated for submodule routines (here and here) where the use of @ was problematic and led to similar link issues, so it is possible we uncovered an issue with our static OpenMP library. I submitted my test case and findings to Development for them to investigate further.

(Internal tracking id: CMPLRS-43381)
