I have a Python script that spawns two instances of an app using mpi4py (MPI.COMM_SELF.Spawn_multiple()). The app is coded in Fortran. Next, I set up a graph for neighborhood communication between the two spawned processes. I'm getting an access violation in the Fortran child apps on the call to MPI_Dist_graph_create().
I'm linking the Fortran app against Intel MPI and use the Intel Distribution for Python on Windows 10. I also tried a standard Python distribution with mpi4py built manually against the Intel MPI library, with the same result.
A minimal example is attached; the error message is included below. This example runs fine with MSMPI.
Note that I ran into a different problem spawning the Fortran apps from Python, described in a different post. I solved this by creating a symbolic link to the appropriate directory in the Python installation directory.
Thanks,
Maarten
[proxy:1:0@T0147953] main (proxy.c:954): error launching_processes
[mpiexec@T0147953] Sending Ctrl-C to processes as requested
[mpiexec@T0147953] Press Ctrl-C again to force abort
[mpiexec@T0147953] HYD_sock_write (..\windows\src\hydra_sock.c:382): write error (errno = 2)
[mpiexec@T0147953] wmain (mpiexec.c:2096): assert (exitcodes != NULL) failed
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
impi.dll 00007FFCB6A691D8 Unknown Unknown Unknown
KERNELBASE.dll 00007FFD32A856FD Unknown Unknown Unknown
KERNEL32.DLL 00007FFD36084034 Unknown Unknown Unknown
ntdll.dll 00007FFD363A3691 Unknown Unknown Unknown
(base) c:\intelpython3\symlink>python python_parent.py
forrtl: severe (157): Program Exception - access violation
Image PC Routine Line Source
impi.dll 00007FFCB6303A43 Unknown Unknown Unknown
impi.dll 00007FFCB62C8981 Unknown Unknown Unknown
impi.dll 00007FFCB6A194ED Unknown Unknown Unknown
fortran_child.exe 00007FF6B03A1531 MAIN__ 27 fortran_child.f90
fortran_child.exe 00007FF6B03A16C2 Unknown Unknown Unknown
fortran_child.exe 00007FF6B03A4184 Unknown Unknown Unknown
fortran_child.exe 00007FF6B03A40AE Unknown Unknown Unknown
fortran_child.exe 00007FF6B03A3F6E Unknown Unknown Unknown
fortran_child.exe 00007FF6B03A41F9 Unknown Unknown Unknown
KERNEL32.DLL 00007FFD36084034 Unknown Unknown Unknown
ntdll.dll 00007FFD363A3691 Unknown Unknown Unknown
Hi Maarten,
Could you please provide us the logs after setting I_MPI_DEBUG=10?
set I_MPI_DEBUG=10
Also, could you share the command you used to create the symbolic link?
Regards
Prasanth
The debug output is given below. I omitted the Fortran stack trace since it contains nothing new, as far as I can see.
I solved the problem with spawning the Fortran processes (solution posted in the thread of the post I referenced), so making a symbolic link is no longer necessary. It was a matter of passing a "path" parameter to MPI_Spawn_multiple(). The updated python_shell code is given below.
cheers,
Maarten
python_shell.py:
from mpi4py import MPI
import numpy as np
import os

# Pass the current directory via the "path" info key so MPI can find
# the child executable (the fix described above).
info = MPI.Info.Create()
info.Set('path', os.getcwd())

# Spawn two instances of the Fortran child, then merge parent and
# children into a single intracommunicator.
sub_comm = MPI.COMM_SELF.Spawn_multiple(['fortran_child.exe'] * 2, info=info)
common_comm = sub_comm.Merge(False)

# The parent declares no edges in the distributed graph topology.
topocomm = common_comm.Create_dist_graph([0], [0], np.array([], dtype=int), MPI.UNWEIGHTED)

common_comm.Disconnect()
sub_comm.Disconnect()
topocomm.Disconnect()
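For reference, the flattened (sources, degrees, destinations) arguments of Create_dist_graph describe graph edges: each sources[i] contributes degrees[i] consecutive entries of destinations as its outgoing edges. A small illustrative helper (expand_dist_graph is a hypothetical name, not part of mpi4py) that expands these arrays into an explicit edge list:

```python
def expand_dist_graph(sources, degrees, destinations):
    # Each sources[i] contributes degrees[i] consecutive entries of
    # `destinations` as outgoing (src, dst) edges.
    edges, pos = [], 0
    for src, deg in zip(sources, degrees):
        for dst in destinations[pos:pos + deg]:
            edges.append((src, dst))
        pos += deg
    return edges

# The parent in python_shell.py passes sources=[0], degrees=[0],
# destinations=[], i.e. rank 0 declares zero edges:
print(expand_dist_graph([0], [0], []))  # → []
```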
Debug output:
[0] MPI startup(): libfabric version: 1.7.1a1-impi
[0] MPI startup(): libfabric provider: sockets
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 22224 T0147953 {0,1,2,3,4,5,6,7}
[0] MPI startup(): I_MPI_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2020.2.254\windows\mpi
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_DEBUG=10
[0] MPI startup(): libfabric version: 1.7.1a1-impi
[0] MPI startup(): libfabric provider: sockets
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 25772 T0147953 {0,1,2,3}
[0] MPI startup(): 1 26700 T0147953 {4,5,6,7}
[0] MPI startup(): I_MPI_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2020.2.254\windows\mpi
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_DEBUG=10
Somebody incorrectly marked my post as the solution, presumably because I wrote that I had solved the problem from the other post.
@PrasanthD_intel: please note that this problem is *not* yet solved.
thanks,
Maarten
Hi Maarten,
We tried your code; it ran perfectly on Linux but gives an "access violation" error when executed on Windows, just as you reported.
We don't know the exact reason, so we are escalating your query to the Subject Matter Experts.
Thanks
Prasanth
Thanks for that. I hope it gets solved soon.
Best,
Maarten
I've reproduced this internally and have provided it to our development team for analysis and a fix.
The error you are encountering is actually the result of multiple internal issues:
- An incorrect interface is being selected on your system. This can happen for multiple reasons, including the use of a VPN. You can set FI_TCP_IFACE=eth0 to work around this issue.
- An error in the path handling for spawned images. We are working to resolve this; there is currently no workaround.
- An error in MPI_Probe indexing. We are working to resolve this.
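For completeness, the interface workaround above can be applied from the Windows command prompt before launching the parent script. This is a sketch only: eth0 is the example interface name from the note above, and the correct name may differ on your system.

```shell
rem Work around the interface-selection issue (interface name may differ)
set FI_TCP_IFACE=eth0
rem Keep verbose startup logging enabled while diagnosing
set I_MPI_DEBUG=10
python python_parent.py
```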
I have received information from our development team that the internal issues are fixed for the next release, Intel® MPI Library 2021.2. Please watch for this release as part of the next update to Intel® oneAPI HPC Toolkit.
Great! Hope the new version will be released soon.
This issue has been resolved and we will no longer monitor this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.