Intel® MPI Library

Wrong results from Intel MPI 4.0.3.006?

jackyjngwn
Beginner

Hi

I am using Intel MPI 4.0.3.006 to run my application, and I found that the generated outputs differ from those produced with an older MPI version. I also tried MPICH2-1.4 and got the same wrong outputs. Has anyone seen a similar problem, and how can I fix it?

Thanks.
James_T_Intel
Moderator
Hi Jacky,

Can you please give some more information about which outputs changed? What previous version are you using for comparison? Are you using the same compiler? What operating system are you using? Did you really mean 4.0.3.006?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
jackyjngwn
Beginner
I am using Red Hat Linux 5, and I am comparing the results produced by MPICH2-1.0.4 and Intel MPI 4.0.3.006 (the one with the Hydra process manager). I used the same compiler, gcc 4.2.4. I also tried gcc 4.6.1 and the Intel compiler 12.1.0 but got the same incorrect outputs. Thanks.
James_T_Intel
Moderator
Hi Jacky,

You should be using 4.0.3.008 rather than 4.0.3.006. What application are you using? Do you have a small reproducer code for this problem?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
jackyjngwn
Beginner
James,

I think I got this version of Intel MPI as a test package before 4.0.3.008 was released, but I got the same results using 4.0.2.003. The application I am using is quite complicated, so it's almost impossible for me to reproduce the problem in a small code.

Looking at the output, it seems as if some input data were not propagated to certain nodes, or some nodes did not send their results back to the master node. However, I did not get any MPI error message. Have you ever seen or heard of a similar problem? Or could you suggest a way for me to pinpoint the cause? Someone suggested it might be caused by a confusion between big-endian and little-endian byte order. How can I check that?

Thanks.
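For reference, a minimal sketch of such a check (the test value 42 and the build/run commands in the comments are illustrative placeholders, not anything from this thread): each rank prints its own byte order, and a broadcast/gather round trip verifies that data actually reaches every rank and comes back to rank 0.

    /* build: mpicc check_comm.c -o check_comm
       run:   mpirun -n 4 ./check_comm */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        unsigned int probe = 1;
        int value = 0;
        int *gathered = NULL;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Byte-order check: the first byte of the integer 1 is 1 only on
           a little-endian machine. */
        printf("rank %d: %s\n", rank,
               (*(unsigned char *)&probe == 1) ? "little-endian" : "big-endian");

        /* Round-trip check: broadcast a value from rank 0, gather it back,
           and verify that every rank returned it unchanged. */
        if (rank == 0)
            value = 42;  /* arbitrary test value */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0)
            gathered = malloc(size * sizeof(int));
        MPI_Gather(&value, 1, MPI_INT, gathered, 1, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            for (i = 0; i < size; i++)
                if (gathered[i] != 42)
                    printf("rank %d returned %d (expected 42)\n", i, gathered[i]);
            free(gathered);
        }

        MPI_Finalize();
        return 0;
    }

If the ranks report different byte orders, the cluster is mixed-endian; if any gathered value is wrong, basic communication is failing independently of the application.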
James_T_Intel
Moderator
Hi Jacky,

I would actually recommend trying the latest version of the Intel® MPI Library, Version 4.1. Have you tried using the message checking library? Use -check_mpi with mpirun to enable it. Are you using a homogeneous cluster?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
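As a usage sketch (the executable name ./myapp is a placeholder, and this assumes the Intel Trace Collector correctness-checking library is available on the system), the run line would look something like:

    mpirun -check_mpi -n 4 ./myapp

The checking library is intended to flag runtime correctness problems such as datatype mismatches or deadlocks, which could help pinpoint where the data stops propagating.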