I am asking this question because I have been stuck on a strange problem for days. The code in question is parallelized using OpenMP directives and runs well on my PC, a Linux box that has only one CPU, even though I have set the environment variable OMP_NUM_THREADS to 4 (or more). When I compile the same code on an SGI Altix 3000, the results are drastically different and incorrect. Both my PC and the SGI machine use ifort as the compiler.
Does the fact that a code works on a single-CPU machine with multiple threads specified guarantee correct behavior on a multiprocessor machine? Or is there no relevance at all?
Any suggestions would be appreciated.