I am trying to understand how a Coarray Fortran DLL can be called from Python. Consider the following sample Fortran module file `example_mod.f90`, which is to be called from Python later:

    module example_mod
        use iso_c_binding
        implicit none
    #ifdef COARRAY_ENABLED
        integer :: co_int

with the subroutine's implementation given in the submodule file `example_mod@sub_smod.f90` :

    submodule (example_mod) sub_smod
        implicit none
    contains
        module procedure sqr_2d_arr
            use mpi
            integer :: rank, size, ierr
            integer :: i, j
            call MPI_Comm_size(comm, size, ierr)
            call MPI_Comm_rank(comm, rank, ierr)
            write(*,"(*(g0,:,' '))") "Hello from Fortran MPI! I am process", rank, "of", size, ', comm:', comm
            write(*,"(*(g0,:,' '))") "Hello from Fortran COARRAY! I am image ", this_image(), " out of", num_images(), "images."
            sync all
            do j = 1, nd
                do i = 1, nd
                    val(i, j) = (val(i, j) + val(j, i)) ** 2
                enddo
            enddo
        end procedure sqr_2d_arr
    end submodule sub_smod

The subroutine also contains calls to the MPI library for the sake of comparison with Coarray. I compile this code with the following ifort flags:

    mpiifort /Qcoarray=distributed /Od /debug:full /fpp -c example_mod.f90
    mpiifort /Qcoarray=distributed /Od /debug:full /fpp -c example_mod@sub_smod.f90
    mpiifort /Qcoarray=distributed /Od /debug:full /fpp /dll /libs:dll /threads example_mod.obj example_mod@sub_smod.obj

Now, I have the following Python2 script which calls the generated DLL above:

    #!/usr/bin/env python
    from __future__ import print_function
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    fcomm = MPI.COMM_WORLD.py2f()
    print("Hello from Python! I'm rank %d from %d running in total..." % (comm.rank, comm.size))
    comm.Barrier()   # wait for everybody to synchronize _here_

    ######################

    import ctypes as ct
    import numpy as np

    # import the dll
    fortlib = ct.CDLL('example_mod.dll')

    # setup the data
    N = 2
    nd = ct.pointer( ct.c_int(N) )          # setup the pointer
    pyarr = np.arange(0, N, dtype=int) * 5  # setup the N-long
    for i in range(1, N):                   # concatenate columns until it is N x N
        pyarr = np.c_[pyarr, np.arange(0, N, dtype=int) * 5]

    # call the function by passing the ctypes pointer using the numpy function:
    fcomm_pt = ct.pointer( ct.c_int(fcomm) )
    _ = fortlib.sqr_2d_arr(nd, np.ctypeslib.as_ctypes(pyarr),fcomm_pt)

    print(pyarr)

Running this script with the following command:
mpiexec -np 4 python main.py
yields this output:

    Hello from Fortran MPI! I am process 1 of 4 , comm: 1140850688
    Hello from Fortran MPI! I am process 3 of 4 , comm: 1140850688
    Hello from Fortran COARRAY! I am image 1 out of 0 images.
    Hello from Fortran MPI! I am process 0 of 4 , comm: 1140850688
    Hello from Fortran COARRAY! I am image 1 out of 0 images.
    Hello from Fortran MPI! I am process 2 of 4 , comm: 1140850688
    Hello from Fortran COARRAY! I am image 1 out of 0 images.
    Hello from Fortran COARRAY! I am image 1 out of 0 images.
    Hello from Python! I'm rank 3 from 4 running in total...
    [[  0  25]
     [900 100]]
    Hello from Python! I'm rank 0 from 4 running in total...
    [[  0  25]
     [900 100]]
    Hello from Python! I'm rank 1 from 4 running in total...
    [[  0  25]
     [900 100]]
    Hello from Python! I'm rank 2 from 4 running in total...
    [[  0  25]
     [900 100]]

The computations performed in this code are not important or relevant to the discussion here. However, I cannot understand why the MPI ranks are output correctly, while the Coarray num_images() is zero for all processes. As a broader question, what is the best strategy for writing a Coarray Fortran application that can be called from other languages such as Python?
I strongly suspect that a "coarray DLL" is not workable with the Intel implementation of coarrays.
During the startup of a coarray program (i.e. something compiled from source with a PROGRAM statement) various library routines are invoked to set up the environment for the subsequent multi-image execution. That set up won't occur if you are just invoking procedures compiled into a DLL.
Compile your code as a program proper, and have python invoke that program as a separate process.
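To illustrate the separate-process approach, here is a minimal Python sketch. It assumes the Fortran code has been built as a standalone coarray executable (the name `example_prog.exe` in the comment is hypothetical) and that the number of images is selected through Intel's `FOR_COARRAY_NUM_IMAGES` environment variable. So that the snippet runs without the Fortran executable, a small Python child process stands in for it and simply echoes that variable.

```python
import os
import subprocess
import sys


def run_coarray_program(command, num_images):
    """Run `command` as a child process and return its standard output as text."""
    env = dict(os.environ)
    # Assumption: the Intel coarray runtime reads the image count from this variable.
    env["FOR_COARRAY_NUM_IMAGES"] = str(num_images)
    output = subprocess.check_output(command, env=env)
    return output.decode()


# Stand-in for the real executable: a child Python process that prints the
# variable it received. In real use this would be something like
# ["example_prog.exe"] (hypothetical name).
stand_in = [
    sys.executable,
    "-c",
    "import os; print(os.environ['FOR_COARRAY_NUM_IMAGES'])",
]

print(run_coarray_program(stand_in, 4))
```

Any data exchange between Python and the Fortran program would then have to go through files, pipes, or command-line arguments rather than through shared memory.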
Try adding once-only code to your dll that calls the intel for_rtl_init_ function on the dll load, and for_rtl_finish_ on the dll unload.
Note, these calls may be required when the main program is .NOT. a Fortran PROGRAM.
Ian, thanks. That would be a viable option. However, my Fortran application has a Python callback. I found out that there is an application, "forpy", that lets you call Python from within Fortran. But that does not work for me because, apparently, when Python is called from Fortran, forpy initializes a new instance of Python, which is likely independent of the original main Python environment. If you know of any Fortran/Python callback method, I'd appreciate your sharing it with me here.
Jim, thanks. Your and Ian's comments are always very helpful on this forum. It seems like "for_rtl_init_" is a function that can be called from a C main file:
But I do not know where I should call this routine: inside the Fortran DLL, or outside in the Python interpreter? Simply calling it from inside the exported subroutine does not work, as ifort gives the following error:
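One callback mechanism that stays inside the original interpreter is a `ctypes` function pointer. The sketch below is only an illustration under assumptions: the callback signature (`double f(double)`) is made up for the example, and the DLL routine `apply_callback` mentioned in the comment is hypothetical. On the Fortran side, such a routine would typically be declared `bind(C)`, receive the pointer as `type(c_funptr), value`, and convert it with `c_f_procpointer`.

```python
import ctypes as ct

# Assumed C signature of the callback: double f(double)
CALLBACK = ct.CFUNCTYPE(ct.c_double, ct.c_double)


def py_square(x):
    """Plain Python function that foreign code should be able to call."""
    return x * x


# Keep a reference to the wrapper for as long as foreign code may call it;
# otherwise it can be garbage-collected while still in use.
c_square = CALLBACK(py_square)

# Calling the wrapper goes through the C function pointer, which is the same
# path a DLL would take when it invokes the callback.
result = c_square(3.0)
print(result)

# Hypothetical use with a DLL routine that accepts the pointer:
#   fortlib.apply_callback.argtypes = [CALLBACK]
#   fortlib.apply_callback(c_square)
```

Because the callback runs in the calling Python process, it sees the same modules and objects as the main script, unlike a freshly initialized interpreter.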

    mpiifort /Qcoarray=distributed /Od /debug:full /fpp /dll /libs:dll /threads example_mod.obj example_mod@sub_smod.obj
    mpifc.bat for the Intel(R) MPI Library 2019 for Windows*
    Copyright 2007-2018 Intel Corporation.
    Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 184.108.40.206 Build 20180804
    Copyright (C) 1985-2018 Intel Corporation. All rights reserved.
    Microsoft (R) Incremental Linker Version 14.15.26732.1
    Copyright (C) Microsoft Corporation. All rights reserved.
    -out:example_mod.dll
    -debug
    -pdb:example_mod.pdb
    -dll
    -implib:example_mod.lib
    "/LIBPATH:C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mpi\intel64\bin\..\..\intel64\lib\debug"
    "/LIBPATH:C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mpi\intel64\bin\..\..\intel64\lib"
    impi.lib
    example_mod.obj
    example_mod@sub_smod.obj
    Creating library example_mod.lib and object example_mod.exp
    example_mod@sub_smod.obj : error LNK2019: unresolved external symbol FOR_RTL_INIT referenced in function sqr_2d_arr
    example_mod.dll : fatal error LNK1120: 1 unresolved externals
    ERROR in the compiling/linking

Do you have any suggestions on how to call it from within the submodule's subroutine?
The problem is that coarray program initialization happens within the main program, including the "mpirun". I don't think the implementation is yet up to having coarray code in a DLL (without a Fortran main program.) FOR_RTL_INIT isn't going to help here. At a minimum you'd need to use /Qcoarray:single and use a separate MPI command to start the image (Python program) across the desired number of processes.
I see there is a routine for_rtl_ICAF_COINIT which obviously does some sort of initialization, but it is not documented for users calling it directly.
Thanks, Steve. The /Qcoarray flag does not seem to have any effect on the output. I tried SINGLE, SHARED and DISTRIBUTED. The Fortran processes are all generated properly, and so long as there is no message passing between the images, it runs fine. But that severely limits the usability of coarrays in mixed-language programming (basically, there is no coarray parallelization under this scenario). A large body of the software's users live in other language islands, among them Python. It would be good if the Intel team could come up with a solution or a set of guidelines about coarray mixed-language programming, perhaps as a blog post. In any case, we very much look forward to seeing a more comprehensive implementation and support of Coarray Fortran by Intel.
Does the Python program (iow the Python code) need to be distributed?
IOW, do the distribution requirements belong only to the Fortran code?
If so, then consider writing a wrapper Fortran DLL that performs a SYSTEM or SYSTEMQQ call to run your mpirun of an executable that performs the work currently in your DLL.
Just a thought: might an alternative be to create a Fortran main program that invokes the Python script via forpy, instead of directly via a Python interpreter program? That way the coarray environment is taken care of and there is only one Python interpreter running.
Thank you, Jim, Steve, Arjen. Your work-around solutions theoretically work, but I am afraid such an approach would severely limit the development of programs in Python. In essence, I am looking for a way to bridge Python's MPI4PY package and Intel's coarray Fortran implementation (which is implicit MPI).
I just stumbled upon this ifort option "-nofor-main", and I wonder if this could be of any help at the link time: https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-nofor-main
I could add this flag and rerun my tests to get an answer. But I thought there could be more to this flag that I could learn from you than by simply rerunning my own tests. Does "nofor-main" have any effect on the Coarray functionality and the initializations that are required and that occur implicitly in the presence of a main Fortran program?
-nofor-main has no effect on Windows and will not help you. As I wrote above, coarray code currently cannot be used when the main program is not Fortran. This may be something available in the future when the Teams feature of F2018 is fully supported.