In a large, old code base that I have no real control over, I ran into the above error message.
I traced the string to libmpi, but I am interested in what situations trigger the error and what can be done to work around it.
For this particular application I can make the code run to completion by compiling it with -O0; the message then does not occur until the very end of the program. In this case it appears to be non-fatal, because the code appears to end normally. I cannot tell whether the error occurs during the final I/O phase or after the end of the program.
If I compile the code with -O1 and some other optimization restrictions, the error occurs after a few iterations, in what appears to be a time-history data output phase. It is unclear whether the NaNs reported in the output occur before the warning message or after it.
To be fully transparent, this was from executions on Intel MPI 2019.5, but the behavior appears to be consistent across 2015 and 2016 up to this one. I will revalidate with the more recent releases now.
Thanks for posting in Intel Communities.
Could you please provide us with the OS details and a sample reproducer, along with the steps to reproduce the issue, so that we can investigate further from our end?
And also, could you please provide us with a screenshot of the error?
Could you please provide us with the debug logs, using the command below?
I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n <no-of-processes> ./a.out
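If it helps with the question of whether the NaNs appear before or after the message, the output of that run could be redirected to a file and searched. The following is only a rough sketch: the file name run.log and the helper name first_match are made up for illustration, and the search strings are placeholders for whatever the application actually prints.

```shell
# Capture stdout and stderr of the debug run in one file
# (run.log is a hypothetical name; fill in the process count yourself):
#   I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n <no-of-processes> ./a.out > run.log 2>&1

# Print the line number of the first line in a log that contains a string.
# $1 = fixed string to look for (matched case-insensitively), $2 = log file.
# Prints nothing if the string is not found.
first_match() {
    grep -n -i -F -- "$1" "$2" | head -n 1 | cut -d: -f1
}
```

Calling first_match "nan" run.log and first_match with the text of the error message then gives two line numbers to compare. Output from different ranks can be buffered and interleaved, so the order in the file is only an indication, and a plain search for "nan" may also hit unrelated words.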
>>I will revalidate the more recent releases now.
Could you please let us know if it works with the latest Intel MPI (2021.5)?
Thanks & Regards,