WRITE and string compares acquiring locks and degrading multicore performance!
I am using OpenMP with FORTRAN to parallelize an algorithm.
I was getting poor performance at higher core counts (> 3). Some analysis revealed that the culprit was a WRITE statement I used in the parallel section to transfer data. Apparently WRITE acquires a lock and thus degrades performance. I replaced the WRITE statement with my own routine to transfer the data and the problem went away.
I also noticed a similar problem with string compares in FORTRAN, which appeared to acquire a lock and degrade performance at higher core counts.
Has anyone seen a similar issue? Is this behavior expected of WRITE and string compares in FORTRAN? (I could not find anything to this effect in the FORTRAN reference manual.)
WRITEs to the same unit (including *) would be serialized. If each thread WRITEs to a different unit, serialization may depend on which OpenMP version is involved, and may be affected by other options, such as Intel Fortran buffered_io. Serialization of string compares ought to be avoidable, but there are certainly a number of other ways in which thread scaling can break down. Did you check your code with a tool such as Intel Thread Checker, and did you take advantage of the affinity tools of your platform?
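One way to keep the I/O lock out of the parallel section, if the output can wait until the threads are done, is to store results in an array inside the loop and WRITE them afterwards from a single thread. The following is only a minimal sketch of that idea, not the original poster's routine; the program name deferred_write, the array results, and the size n are hypothetical and chosen for illustration.

```fortran
program deferred_write
  implicit none
  integer, parameter :: n = 8      ! hypothetical problem size
  real :: results(n)               ! hypothetical per-iteration results
  integer :: i

  ! Compute in parallel. There are no I/O statements inside the loop,
  ! so threads do not contend for the runtime library's I/O lock here.
  !$omp parallel do
  do i = 1, n
     results(i) = real(i) ** 2
  end do
  !$omp end parallel do

  ! Output happens once, from a single thread, after the parallel loop.
  do i = 1, n
     write (*, '(I4, F10.2)') i, results(i)
  end do
end program deferred_write
```

This needs the compiler's OpenMP option to run in parallel (for example -fopenmp with gfortran); without it the directives are treated as comments and the program runs serially. Whether it helps depends on how much data has to be held until the end of the parallel region.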
Yes. I did check the code with Intel Thread Checker (and earlier with VTune), and that's how I pinpointed the problem. I would have imagined this (the acquisition of the lock, that is) to be possible for WRITE but not for _for_cpstr.
Is anyone aware of a list of FORTRAN functions (like WRITE) that result in the acquisition of a lock?