Intel® Fortran Compiler

Difference in performance of a COARRAY Fortran example on two similar PCs

avinashs
New Contributor I

I ran a simple COARRAY Fortran example on two Windows 10 machines with the same project files (so exactly the same compiler settings). However, the output differs between the two machines, as shown below. Is this expected behavior? Any help will be appreciated, especially with settings that need to be changed.

Computer 1: Intel i7 4770K, 16 GB RAM, Cores = 4, Threads = 8, special order through an engineering software provider
Computer 2: Intel i7 6820HQ, 32 GB RAM, Cores = 4, Threads = 8, special order directly from Dell

The program code is:

program main
  ! Test COARRAY Fortran 2008
  if (this_image() == 1) then
     write(*,'(1x,a,1x,i0,1x,a)') 'Coarray Fortran program running with', num_images(), 'images'
  end if
  sync all
  write(*,'(1x,a,1x,i0)') 'Hello from image', this_image()
  if (this_image() == 1) read *
1 continue
end program main

The output on Computer 1 is as advertised in the tutorial:

         Coarray Fortran program running with 8 images
         Hello from image 1
         Hello from image 5
         Hello from image 2
         Hello from image 6
         Hello from image 3
         Hello from image 4
         Hello from image 7
         Hello from image 8

However, the output on Computer 2 is different, as seen below: the program appears to run 8 times, and each run reports that only 1 image is used.

         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         Coarray Fortran program running with 1 images
         Hello from image 1
         
I will add that Computer 2 in general runs slower than Computer 1 on all Fortran applications although it is a newer and potentially superior computer.
 

Lorri_M_Intel
Employee

OK. So it looks like 32-bit on Computer 2 is the "bad" combination. Correct?

Let me give you an interesting set of commands to try.

Start up one of the Parallel Studio command windows, the one labeled "IA-32".

In that window issue the command:
   mpiexec -V

What was the result?

Also in that window, issue this command:
         set FOR_COARRAY_DEBUG_STARTUP=TRUE

and then run your executable.   Please post the output of that; I really want to see the 'mpiexec' line, and I want to watch it generate the wrong behavior (8 x 1 image)

In that same window, compile your program like this:
ifort /Qcoarray=single myprogram.f90

Now run your program. I expect you will get one executable running one image (that's what the "single" keyword does). I don't need to see that output if that's what happens.

Final step, and I'll want to see the output please.

The 'mpiexec' line you saw above -- apply that to that "single" executable you just created.   What happened?  Did it generate one executable with 8 images, or 8 executables with only one image?

       Thanks -

                                   --Lorri

avinashs
New Contributor I

Thanks, Lorri. I am posting the output of the three tests you requested above. The last test did not work.

avinashs
New Contributor I
Copyright (C) 1985-2018 Intel Corporation. All rights reserved.
Intel(R) Compiler 19.0 Update 1 (package 144)

**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.8.4
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x86'

C:\Program Files (x86)\IntelSWTools>mpiexec -V
Intel(R) MPI Library for Windows* OS, Version 2019 Update 1 Build 20181016 (id: 1f6a76f43)
Copyright 2003-2018, Intel Corporation.

C:\Program Files (x86)\IntelSWTools>set FOR_COARRAY_DEBUG_STARTUP=TRUE

C:\Temp>ifort /Qcoarray:shared -o mcpica32 mcpi_coarray_final.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on IA-32, Version 19.0.1.144 Build 20181018
Copyright (C) 1985-2018 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.15.26729.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:mcpica32.exe
-subsystem:console
mcpi_coarray_final.obj

C:\Temp>mcpica32
Generated MPI command line is 'mpiexec.exe -localonly -n 8 mcpica32 '.
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computed value of pi is 3.1415999, Relative Error: .232E-05
Elapsed time is 122. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415592, Relative Error: .107E-04
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415798, Relative Error: .410E-05
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1416628, Relative Error: .223E-04
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415916, Relative Error: .324E-06
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415592, Relative Error: .107E-04
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415771, Relative Error: .494E-05
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415621, Relative Error: .973E-05
Elapsed time is 123. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.

 

avinashs
New Contributor I
C:\Temp>ifort /Qcoarray=single -o mcpica32_single mcpi_coarray_final.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on IA-32, Version 19.0.1.144 Build 20181018
Copyright (C) 1985-2018 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.15.26729.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:mcpica32_single.exe
-subsystem:console
mcpi_coarray_final.obj
Fortran Pause - Enter command<CR> or <CR> to continue.

C:\Temp>mcpica32_single

C:\Temp>Computing pi using 1800000000 trials across 1 images
Computed value of pi is 3.1415347, Relative Error: .185E-04
Elapsed time is 74.5 seconds
Fortran Pause - Enter command<CR> or <CR> to continue.

Fortran Pause - Enter command<CR> or <CR> to continue.

 

avinashs
New Contributor I
C:\Temp\mpiexec.exe -localonly -n 8 mcpica32_single

No result after 15 min

Lorri_M_Intel
Employee

I'm going to assume this isn't an actual cut-n-paste from your window, because it says "C:\TEMP\" instead of "C:\TEMP>"

Something else I'd like you to try, please.

While you're in C:\Temp, issue the "mpiexec -V" command again.   Does it match what you saw in #24?

If it did match, again, while in C:\Temp, change the command to:
                mpiexec.exe -v -localonly -n 8 mcpica32_single

(That's a lower-case "v", which tells mpiexec to be verbose, and yes, it's verbose.)

  Thank you for your patience; we'll figure this out yet!

                           --Lorri

 

avinashs
New Contributor I

Lorri Menard (Intel) wrote:

I'm going to assume this isn't an actual cut-n-paste from your window, because it says "C:\TEMP\" instead of "C:\TEMP>"

It is an actual cut-and-paste of the window, but the path to the project file contains my name and that of my company, so I replaced it with C:\Temp.

 

avinashs
New Contributor I

Issuing command mpiexec -V from C:\temp

C:\temp>mpiexec -V
Intel(R) MPI Library for Windows* OS, Version 2019 Update 1 Build 20181016 (id: 1f6a76f43)
Copyright 2003-2018, Intel Corporation.

It appears to be the same as in #24.

 

avinashs
New Contributor I

Issuing the command: mpiexec.exe -v -localonly -n 8 mcpica32_single

1. CPU utilization jumped to 100%.

2. Output was produced after about 2 minutes, but the program did not terminate, i.e. the C:\temp prompt did not come back. However, CPU usage dropped to 0.

3. After I pressed the Enter key, a few messages appeared. Then CPU utilization went back to 100%.

4. I tried to close the window forcibly with Task Manager, but that failed with an error. I finally had to shut down the computer.

5. The output is below (note: ComputerName has replaced the real computer name).

avinashs
New Contributor I
C:\temp>mpiexec.exe -v -localonly -n 8 mcpica32_single
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computing pi using 1800000000 trials across 1 images
Computed value of pi is 3.1414695, Relative Error: .392E-04
Elapsed time is 124. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1416288, Relative Error: .115E-04
Elapsed time is 124. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415333, Relative Error: .189E-04
Elapsed time is 124. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1416081, Relative Error: .491E-05
Elapsed time is 125. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415741, Relative Error: .589E-05
Elapsed time is 124. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1416599, Relative Error: .214E-04
Elapsed time is 124. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1416305, Relative Error: .120E-04
Elapsed time is 125. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.
Computed value of pi is 3.1415867, Relative Error: .188E-05
Elapsed time is 125. seconds
Fortran Pause - Enter command<CR> or <CR> to continue.


[proxy:0:0@ComputerName] ..\windows\src\hydra_sock.c (379): write error (errno = 0)
[proxy:0:0@ComputerName] proxy_cb.c (256): error writing data
[proxy:0:0@ComputerName] ..\windows\src\hydra_demux.c (203): callback returned error
[proxy:0:0@ComputerName] proxy.c (989): error waiting for event

 

jimdempseyatthecove
Honored Contributor III

On my system, when a program hangs, the system hangs, and Task Manager appears to fail to launch, the culprit is almost always insufficient RAM on the system running the application. In other words, the combined virtual memory (VM) requirements of the compute-bound processes exceed the physical RAM. The Windows scheduler is (IMHO) somewhat brain-dead in this regard: all of the pageable RAM can be consumed by a single user application, with none held in reserve for other processes.

Jim Dempsey

avinashs
New Contributor I

I am wondering whether there has been any update on this problem. Meanwhile, having installed IVF 2019 Update 2 (19.2.190) today, I find that Computer 1 now behaves exactly like Computer 2, i.e. neither computer runs the coarray Fortran examples in the prescribed way.

Steve_Lionel
Honored Contributor III

And for me, Update 2 gets an access violation in the mcpi example; Update 1 works. Sigh. Another issue will be submitted.

Devorah_H_Intel
Moderator

Since this topic was mentioned in this thread, I am posting the results for the coarray sample built with the 19.0.4 Intel Fortran Compiler.
 

1>Deleting intermediate files and output files for project 'coarray_samples', configuration 'Debug|x64'.
1>Compiling with Intel(R) Visual Fortran Compiler 19.0.4.228 [Intel(R) 64]...
1>mcpi_coarray_final.f90
1>Linking...
1>Embedding manifest...
1>
1>Build log written to  "file://...\compiler_f\coarray_samples\msvs\x64\Debug\BuildLog.htm"
1>coarray_samples - 0 error(s), 0 warning(s)
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========


Computing pi using 1800000000 trials across 16 images
Computed value of pi is 3.1415616, Relative Error: .989E-05
Elapsed time is 3.79 seconds
Press any key to continue . . .

System info: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, 2295 MHz, 18 Core(s), 36 Logical Processor(s)

avinashs
New Contributor I

I have still not migrated to 19.4. However, coarray Fortran has completely stopped working on both machines with 19.2: CPU usage goes to 100%, and the processes have to be shut down with Task Manager or the machine rebooted. I have been trying to run the simple examples quoted above. I want to temporarily install IVF on another machine that is not in use, to see whether this is a universal problem with CAF on all my machines.

I have now also received version 19.4 with my annual upgrade so I will try that out next.

Lorri_M_Intel
Employee

Refresh my memory please -- are you creating a 32-bit target executable, or 64-bit?

     thanks --

 

avinashs
New Contributor I

Lorri Menard (Intel) wrote:

Refresh my memory please -- are you creating a 32-bit target executable, or 64-bit?

     thanks --

 

I have been using both versions - Win32 and x64. At the current time, I am only trying to study IVF CAF for future use so the tests are with the sample problems.

Lorri_M_Intel
Employee

Thanks for getting back to me.

First - please don't use 32-bit coarrays. We've discovered serious problems in that configuration (which of course you've seen), and yes, I realize that the default configuration in Visual Studio is 32-bit. We'll be deprecating its use in a future release.

In the 64-bit environment - how many images are you using?  Did you set a number, or did you let the system pick a number?

If you let the system pick, do you get a happy result?
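One way to answer the image-count question is to have image 1 report both the requested and the actual count. The following is a minimal sketch, not one of the Intel samples. It assumes that a run-time count, if any, was set through the FOR_COARRAY_NUM_IMAGES environment variable (a count fixed at compile time with /Qcoarray-num-images would not show up there). The program name check_images is made up for illustration.

```fortran
program check_images
  ! Hypothetical diagnostic: report the requested and the actual image count.
  implicit none
  character(len=32) :: requested
  integer :: status

  if (this_image() == 1) then
     ! status is 0 when the variable exists and fits in the buffer.
     call get_environment_variable('FOR_COARRAY_NUM_IMAGES', requested, status=status)
     if (status == 0 .and. len_trim(requested) > 0) then
        write(*,'(2a)') 'FOR_COARRAY_NUM_IMAGES = ', trim(requested)
     else
        write(*,'(a)') 'FOR_COARRAY_NUM_IMAGES is not set; the runtime picks the count'
     end if
     write(*,'(a,i0)') 'num_images() reports ', num_images()
  end if
end program check_images
```

Built with /Qcoarray:shared, a healthy run should print these lines once. If they appear several times, each showing num_images() as 1, the processes were launched but did not join a single coarray job, which is the symptom reported earlier in this thread.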

 

avinashs
New Contributor I

Lorri Menard (Intel) wrote:

Thanks for getting back to me.

First - please don't use 32-bit coarrays. We've discovered serious problems in that configuration (which of course you've seen), and yes, I realize that the default configuration in Visual Studio is 32-bit. We'll be deprecating its use in a future release.

In the 64-bit environment - how many images are you using?  Did you set a number, or did you let the system pick a number?

If you let the system pick, do you get a happy result?

 

I reran the tests and here are the results:

1. The test program fails when run from MSVS 2017 (for the record, both Win32 and x64 fail, in both Release and Debug modes). CPU usage jumps to 100%.

2. From the x64 command line, the 64-bit version runs successfully (all 8 images report correctly).

3. From the IA32 command line, the program runs. However, image 1 reports 8 times instead of 8 images each reporting once.

The command line is 

ifort /nologo /O2 /Qcoarray:shared /libs:dll /threads /dbglibs coarray1.f90

 

Lorri_M_Intel
Employee

This is actually encouraging news!

It means that your "only" problem is running coarrays within your Visual Studio environment.  

I believe that the problem is that the "executable path" found *inside* Visual Studio does not list Intel MPI early enough, so another MPI on your system is used instead. The "executable path" is initially derived from the system PATH variable (plus other VS-specific directories), and adjustments made in other command windows will not affect it.

       What do you see when you expand Project->Properties->Configuration->VC++ Directories and look at the Executable Path item? Is Intel MPI on that list?

It is possible to modify the list, and move directories up-and-down --- maybe experiment with that?   In the 64-bit environment, of course.
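To see which mpiexec.exe a process would pick up first, the PATH it inherits can be scanned in order. The sketch below is an illustration under stated assumptions, not an Intel tool: it assumes Windows conventions (';' as the separator, '\' in paths), ignores quoted PATH entries, and the program name find_mpiexec is made up. It also assumes that a program started from Visual Studio inherits the "executable path" as its PATH. If so, running it once from a plain command window and once from inside Visual Studio should show whether the two environments order the MPI directories differently.

```fortran
program find_mpiexec
  ! Hypothetical helper: list, in search order, every PATH directory
  ! that contains an mpiexec.exe.
  implicit none
  character(len=:), allocatable :: path_value, dir
  integer :: path_len, status, start, sep, hits
  logical :: found

  ! First call: ask only for the length so the buffer can be sized exactly.
  call get_environment_variable('PATH', length=path_len, status=status)
  if (status /= 0 .or. path_len == 0) then
     write(*,'(a)') 'PATH is not set or is empty'
     stop
  end if
  allocate(character(len=path_len) :: path_value)
  call get_environment_variable('PATH', value=path_value)

  hits = 0
  start = 1
  do while (start <= path_len)
     ! Cut out the next ';'-separated entry.
     sep = index(path_value(start:), ';')
     if (sep == 0) then
        dir = path_value(start:)
        start = path_len + 1
     else
        dir = path_value(start:start+sep-2)
        start = start + sep
     end if
     if (len_trim(dir) == 0) cycle
     inquire(file=trim(dir)//'\mpiexec.exe', exist=found)
     if (found) then
        hits = hits + 1
        write(*,'(i0,2a)') hits, ': ', trim(dir)
     end if
  end do
  if (hits == 0) write(*,'(a)') 'No mpiexec.exe found on PATH'
end program find_mpiexec
```

The first directory printed is the one whose mpiexec.exe would be used. From a command window, the built-in "where mpiexec" command gives a similar listing without compiling anything.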

 

 

 

 

avinashs
New Contributor I

Lorri Menard (Intel) wrote:

I believe that the problem is that the "executable path" found *inside* Visual Studio does not list Intel MPI early enough, so another MPI on your system is used instead. The "executable path" is initially derived from the system PATH variable (plus other VS-specific directories), and adjustments made in other command windows will not affect it.

       What do you see when you expand Project->Properties->Configuration->VC++ Directories and look at the Executable Path item? Is Intel MPI on that list?

It is possible to modify the list, and move directories up-and-down --- maybe experiment with that?   In the 64-bit environment, of course.

 

 

 

 

There is no option for VC++ Directories under Project->Properties->Configuration Properties. This option does appear for C++ projects but not for IVF projects.
