Intel® MPI Library

Seeking examples of hybrid Intel MPI and OpenMP codes

dingjun_chencmgl_ca
2,379 Views
Hi, could someone tell me where I can find some examples of hybrid Intel MPI and OpenMP codes? I want to study them. Thanks in advance.

Dingjun
0 Kudos
13 Replies
James_T_Intel
Moderator

Hi Dingjun,

We have a beginner level tutorial available at http://software.intel.com/en-us/articles/beginning-hybrid-mpiopenmp-development/. Additional resources can be found by a standard internet search.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

dingjun_chencmgl_ca

Hi, James,

Thank you very much for your reply.

I downloaded it and built it with MS Visual Studio 2010, but I get the following errors:

Error 1 error LNK2019: unresolved external symbol OMP_SET_NUM_THREADS referenced in function MAIN__ hybrid_hello.obj

Error 2 error LNK2019: unresolved external symbol OMP_GET_THREAD_NUM referenced in function MAIN__ hybrid_hello.obj

Error 3 error LNK2019: unresolved external symbol OMP_GET_NUM_THREADS referenced in function MAIN__ hybrid_hello.obj

Error 4 fatal error LNK1120: 3 unresolved externals x64\Debug\IntelhybridMPIOpenMPHello.exe



What is going wrong? Please let me know what you think.

Dingjun
James_T_Intel
Moderator
Hi Dingjun,

Do you have OpenMP* enabled in your project?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
dingjun_chencmgl_ca


Thanks, James. I enabled OpenMP directive processing in the Fortran Language options, and the link errors are gone.
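For readers hitting the same unresolved externals: a likely cause is that the source calls OpenMP runtime routines directly, so the link fails whenever OpenMP processing is switched off (with the Intel compiler on Windows this is usually the /Qopenmp option). The free-form sketch below is not the tutorial's code; the program and variable names are made up for illustration. It guards the OpenMP calls with the `!$` conditional compilation sentinel, so the same source builds either way and reports which mode it was built in.

```fortran
program openmp_enabled_check
!$ use omp_lib            ! compiled only when OpenMP processing is enabled
  implicit none
  logical :: openmp_on
  integer :: nthreads

  ! Defaults used when the compiler ignores the "!$" lines
  openmp_on = .false.
  nthreads  = 1

  ! These two statements are ordinary comments without OpenMP,
  ! and real statements with it.
!$ openmp_on = .true.
!$ nthreads  = omp_get_max_threads()

  print *, 'OpenMP enabled:    ', openmp_on
  print *, 'threads available: ', nthreads
end program openmp_enabled_check
```

Built without OpenMP, this links cleanly and reports one thread; built with OpenMP, it reports the runtime's maximum thread count.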

Is there any difference between the following two MPI initialization calls? Which one is better?

call MPI_INIT_THREAD(required, provided, ierr)

call MPI_INIT( ierr )

I look forward to hearing from you again.

Dingjun
James_T_Intel
Moderator
Hi Dingjun,

I would recommend using MPI_INIT_THREAD, as this verifies linkage with the thread-safe MPI library. Set required to the level you need, and check that provided is at least as high as that level. The levels are:

MPI_THREAD_SINGLE - Only one thread
MPI_THREAD_FUNNELED - The process may be multi-threaded, but only the main thread makes MPI calls
MPI_THREAD_SERIALIZED - The process may be multi-threaded and multiple threads may make MPI calls, but only one at a time.
MPI_THREAD_MULTIPLE - Multiple threads may call MPI simultaneously.
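A minimal free-form sketch of that check, assuming the `mpi` module is available (older codes use `include 'mpif.h'` instead) and that MPI_THREAD_FUNNELED is the level needed. The program name, loop size, and variable names are illustrative only. The comparison works because the MPI standard defines the four thread-level constants as monotonically increasing in the order listed above.

```fortran
program hybrid_funneled_sketch
  use mpi
  implicit none
  integer, parameter :: n = 1000000     ! illustrative work size
  integer :: required, provided, ierr, myid, i
  double precision :: local_sum, total

  ! Ask for FUNNELED: threads exist, but only the main thread calls MPI
  required = MPI_THREAD_FUNNELED
  call MPI_INIT_THREAD(required, provided, ierr)

  ! Thread levels are ordered SINGLE < FUNNELED < SERIALIZED < MULTIPLE
  if (provided < required) then
    print *, 'thread level requested: ', required, ' provided: ', provided
    call MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
  end if

  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

  ! OpenMP threads share the loop inside each MPI process
  local_sum = 0.0d0
  !$omp parallel do reduction(+:local_sum)
  do i = 1, n
    local_sum = local_sum + 1.0d0
  end do
  !$omp end parallel do

  ! Back on the main thread: combine the per-process sums with MPI
  call MPI_REDUCE(local_sum, total, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                  0, MPI_COMM_WORLD, ierr)
  if (myid == 0) print *, 'total = ', total

  call MPI_FINALIZE(ierr)
end program hybrid_funneled_sketch
```

Keeping the MPI_REDUCE call outside the parallel region is what makes FUNNELED sufficient here; MPI calls made from inside a parallel region would need a higher level.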

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
TimP
Honored Contributor III
The MPI standard requires MPI_init_thread in place of MPI_init for any case but MPI_THREAD_SINGLE, but customers expect MPI_init to work for the MPI_THREAD_FUNNELED case as well. As James said, it's better form to use MPI_init_thread and verify that the MPI implementation claims to support the model you require.
dingjun_chencmgl_ca


Thanks again for your reply.


Could you tell me how to pass command-line parameters to an application executable when launching it with MPIEXEC?

For example:

Imex.exe was originally an OpenMP parallel application executable. Before I introduced MPI to make it a hybrid MPI/OpenMP application, it was run like this:

Imex.exe -f cputest_rb_samg_best_the1stFAST2.dat -log -fgmres

But now I need to launch it with MPIEXEC. For example, I am using Intel MPI and plan to run it on two host nodes, dingjunc and sim-west. I need to enter the following:

mpiexec -hosts 2 dingjunc 2 sim-west 2 imex.exe

How can I pass the following parameters into imex.exe?

-f cputest_rb_samg_best_the1stFAST2.dat -log -fgmres


Imex.exe cannot run properly without the above parameters.

I look forward to your suggestions.


Dingjun Chen

James_T_Intel
Moderator
Hi Dingjun,

When running with mpiexec, simply pass the program arguments after the program command, as you usually would:

mpiexec -hosts 2 dingjunc 2 sim-west 2 imex.exe -f cputest_rb_samg_best_the1stFAST2.dat -log -fgmres

The arguments to mpiexec should appear before the executable name, and the program arguments should appear after it.
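On the program side, the arguments can be read the same way as in a non-MPI run. The free-form sketch below uses the Fortran 2003 intrinsics `command_argument_count` and `get_command_argument`; the program name `show_args` and the 256-character buffer are assumptions for illustration, not part of imex.exe. With mpiexec launchers the arguments are normally visible to every process, but it is safest to treat that as implementation behavior rather than a guarantee.

```fortran
program show_args
  use mpi
  implicit none
  integer :: ierr, myid, nargs, i
  character(len=256) :: arg      ! assumed long enough for each argument

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

  ! Everything after the executable name on the mpiexec line arrives here
  nargs = command_argument_count()

  ! Print from rank 0 only, to avoid one copy of the list per process
  if (myid == 0) then
    do i = 1, nargs
      call get_command_argument(i, arg)
      print *, 'argument ', i, ': ', trim(arg)
    end do
  end if

  call MPI_FINALIZE(ierr)
end program show_args
```

Launched as `mpiexec -n 2 show_args.exe -f cputest_rb_samg_best_the1stFAST2.dat -log -fgmres`, rank 0 should list the four program arguments, while `-n 2` is consumed by mpiexec.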

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
dingjun_chencmgl_ca
Hi, James,

Thanks indeed.

I have another question for you.

I am testing Intel MPI. I added the following statements to my Fortran test code before the MPI initialization call:

      ......
      integer test1, test2, test3
      test1=1
      test2=2
      test3=3
      print *, " test1 = ",test1,
     & " test2= ",test2,
     & " test3= ",test3
! Initialize MPI with threading
      call MPI_INIT_THREAD(required, provided, ierr)
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
      call MPI_GET_PROCESSOR_NAME(HostName,namelen,ierr)
      ......

and also added some statements after the MPI_FINALIZE call:

      ......
 30   call MPI_FINALIZE(rc)
      print *, "after MPI_Finalize call test1 = ",test1,
     & "after MPI_Finalize call test2= ",test2,
     & "after MPI_Finalize call test3= ",test3
      stop
      end

All of the Fortran code I added is non-MPI code placed before the MPI_INIT_THREAD() call and after the MPI_FINALIZE() call. The results surprised me: that code is still executed under MPI. Please see the output below, produced by the following command:

mpiexec -hosts 2 dingjunc 12 sim-west 12 fpi_mpionly.exe

(Note that both dingjunc and sim-west have 12 cores.)

The output is as follows:


test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3
test1 = 1 test2= 2 test3= 3

PI Process 15 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 21 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 12 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 13 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 20 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 23 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 19 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 18 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 17 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 22 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 16 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 14 of 24 is alive on sim-west.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 9 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 2 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 8 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 1 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 6 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 4 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 10 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 0 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 3 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 11 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 5 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
PI Process 7 of 24 is alive on dingjunc.cgy.cmgl.ca running Intel MPI version 2.1
rocess 0 of 24 finalResult = 2400000.10 Wallclock time = 1.105
pi is approximately: 3.1415926535898273 Error is: 0.0000000000000342


after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3
after MPI_Finalize call test1 = 1 after MPI_Finalize call test2=
2 after MPI_Finalize call test3= 3


Why can non-MPI Fortran code run before the MPI_INIT_THREAD() call and after the MPI_FINALIZE() call?

I look forward to hearing from you. Many thanks.

Dingjun



James_T_Intel
Moderator
Hi Dingjun,

Everything looks fine to me. You are allowed to use non-MPI calls (as well as MPI_GET_VERSION, MPI_INITIALIZED, and MPI_FINALIZED) freely outside of the MPI region. Can you please clarify where you see a problem?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
dingjun_chencmgl_ca


Hi, James,


Thanks again. I think the following code should not be executed under MPI because it is outside of the MPI code region. In general, the MPI code region only includes the code between the MPI_INIT_THREAD call and the MPI_FINALIZE call, and only code falling within that region can be executed by multiple MPI processes. I am surprised to see that the following code was executed in 24 MPI processes before MPI_INIT_THREAD() was called.


      integer test1, test2, test3
      test1=1
      test2=2
      test3=3
      print *, " test1 = ",test1,
     & " test2= ",test2,
     & " test3= ",test3


Your reply is highly appreciated.

Dingjun

James_T_Intel
Moderator
Hi Dingjun,

OK, I understand your question better now. Here's what is happening. When you use mpiexec, you are launching a set of programs that are linked together in a common environment. In your case, you have 24 copies of your program running simultaneously within one MPI environment. At launch, each program has no knowledge of that environment; it simply begins running as normal. When you call MPI_INIT or MPI_INIT_THREAD, each copy of the program initializes its information about the MPI environment. This enables it to communicate with the other programs in the environment. Any MPI calls (other than the exceptions I previously mentioned) made before this will fail.

At this point, the programs can communicate via MPI calls with each other, query the MPI environment for more information, etc. Once the need for the environment is finished, MPI_FINALIZE will ensure that all MPI calls are completed. Other calls can still be made, as the program is still running until it finishes.

Try running the following command:

mpiexec -n 2 notepad.exe
You'll notice that two copies of notepad.exe are opened. Since this is not an MPI program, each copy will run independently. However, mpiexec will not finish until both copies of notepad are closed, as they are both being run under mpiexec. Using mpiexec does not restrict a program to only running what is between MPI_INIT and MPI_FINALIZE.
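The same behavior can be seen from inside a Fortran program. The free-form sketch below is an illustration rather than the original test code; the program name is made up. It uses MPI_INITIALIZED and MPI_FINALIZED, two of the calls that are allowed outside the MPI region, to show that every process runs the statements before MPI_INIT and after MPI_FINALIZE.

```fortran
program outside_mpi_region
  use mpi
  implicit none
  integer :: ierr, myid
  logical :: flag

  ! Runs in every copy started by mpiexec; MPI is not yet initialized
  call MPI_INITIALIZED(flag, ierr)
  print *, 'before MPI_INIT: initialized = ', flag

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
  print *, 'inside MPI region: rank ', myid

  call MPI_FINALIZE(ierr)

  ! Still runs in every copy; only MPI communication is over
  call MPI_FINALIZED(flag, ierr)
  print *, 'after MPI_FINALIZE: finalized = ', flag
end program outside_mpi_region
```

Run under `mpiexec -n 4`, each of the three messages should appear four times, once per process. Only the middle one can report a rank, because the rank is known only between MPI_INIT and MPI_FINALIZE.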

Does that help to clarify the program behavior?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
dingjun_chencmgl_ca
Yes. Thank you very much!

Dingjun