Hello,
We develop a scientific application using Python and C++ that relies heavily on MPI communication.
The current version works fine with IMPI 2018, but after upgrading to oneAPI's IMPI 2021.03 we ran into issues on Windows.
We use a C++ extension module for Python that exposes an initialization function, which effectively calls MPI_Init(). This initialization function also uses std::atexit to register a finalization function, which effectively calls MPI_Finalize() at program exit.
Using IMPI 2018 this works correctly, but switching to IMPI 2021.03 we end up with a BAD TERMINATION exit status.
I attached the code of a minimal example that exhibits this behavior. The example can be used to build a very simple CPython extension module, which implements the module functions "initialize" and "testmpi". The "initialize" function calls MPI_Init and registers an MPI_Finalize call at program exit using std::atexit. The "testmpi" function needs MPI to be initialized and simply prints some MPI ranks in a Hello-World fashion. The README.md details the steps to reproduce the included logs (e.g. test-impi2018.log).
I'd be glad for any hints on how to mitigate this, and please let me know if anything needs clarification (it's my first post, after all :).
Thanks,
Benjamin
Hi,
Thanks for reaching out to us.
We were able to reproduce your issue on our end. We are working on it and will get back to you soon.
Thanks & Regards,
Santosh
I have replicated the issue as well, but I am having difficulty setting up one of our analysis tools to help identify the root cause. How would I modify your example to correctly insert a library before impi.lib? I am trying to add VTmc.lib from Intel® Trace Analyzer and Collector, and when I recompile and relink with it, I get the following error at runtime:
import impiatexit
ImportError: DLL load failed while importing impiatexit: The specified module could not be found.
If you want to link other libraries with the extension module, you can specify them in the setup-win-impi20201-atexit.py distutils script. I assume you did that already?
Regarding the runtime DLL load failure, maybe it's this: if you are using a Python version >= 3.8 on Windows, you need to tell Python where it is allowed to load DLLs from using os.add_dll_directory(), or place the DLLs next to the extension module. Python >= 3.8 no longer searches %PATH% when resolving the DLL dependencies of an extension module.
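As a rough sketch of the os.add_dll_directory() approach (the function exists on Windows from Python 3.8), something like the following could run before the import. The helper name add_dll_search_dirs and the directory list are assumptions for illustration only. I_MPI_ROOT is normally set by the Intel MPI environment scripts, but the exact subdirectories holding impi.dll, libfabric and VTmc depend on the installation and should be checked.

```python
import os


def add_dll_search_dirs(candidates):
    """Register each existing directory with the Windows DLL loader.

    Returns the list of directories that were actually registered.
    On platforms without os.add_dll_directory the list is empty.
    """
    added = []
    # os.add_dll_directory only exists on Windows (Python 3.8 and later).
    if not hasattr(os, "add_dll_directory"):
        return added
    for path in candidates:
        path = os.path.abspath(path)  # the API requires an absolute path
        if os.path.isdir(path):
            os.add_dll_directory(path)
            added.append(path)
    return added


# Assumed locations, adjust to the actual install layout.
mpi_root = os.environ.get("I_MPI_ROOT", "")
dll_dirs = [
    os.path.join(mpi_root, "bin", "release"),  # assumption: impi.dll lives here
    os.path.join(mpi_root, "libfabric", "bin"),  # assumption: libfabric.dll lives here
]
registered = add_dll_search_dirs(dll_dirs)
print("DLL directories registered:", registered)

# After this, the extension module from the attached example can be imported:
# import impiatexit
```

If the printed list is empty on Windows, none of the assumed directories exist and the import will most likely still fail with the same ImportError.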
I hope that helps, ask away if I can supply more info.
I apologize for the delayed response. I have escalated this to our development team for investigation and resolution.
Our development team has investigated and identified the issue. You have registered a dependency on MPI_Finalize in atexit. This leads to a dependency on the libfabric DLL in atexit. During the exit process, the libfabric DLL is unloaded before this call is made, which leads to the error in this case. Per Microsoft's documentation for atexit, there should not be a dependency on any DLL in atexit (see https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/atexit?view=msvc-160). As such, this is an expected scenario, and you will need to update your code to call MPI_Finalize explicitly before the process starts running its atexit handlers, rather than from an atexit handler, in order to avoid this error.
I am closing the associated Intel support case with this. Any further replies on this thread will be considered community-only.
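One possible way to follow this advice from the Python side is to register the finalization with Python's own atexit module instead of std::atexit. Python's atexit handlers run during interpreter shutdown, which normally happens before the C runtime processes its atexit list, though whether that is early enough for libfabric here has not been verified. The sketch assumes the extension module gains a "finalize" function wrapping MPI_Finalize() (the attached example only has "initialize" and "testmpi"), and assumes "initialize" no longer calls std::atexit itself. A stub class stands in for the module so the snippet runs without MPI.

```python
import atexit


class StubMpiModule:
    """Stand-in for the extension module so the sketch runs without MPI."""

    def __init__(self):
        self.calls = []

    def initialize(self):
        # The real module would call MPI_Init() here.
        self.calls.append("initialize")

    def finalize(self):
        # Hypothetical function: the real module would call MPI_Finalize() here.
        self.calls.append("finalize")


def initialize_with_python_atexit(mpi_module):
    """Initialize MPI and schedule finalization via Python's atexit.

    Returns the registered handler so callers can unregister it or
    finalize earlier by hand.
    """
    mpi_module.initialize()
    handler = mpi_module.finalize
    atexit.register(handler)
    return handler


# Usage with the stub; with the real extension this would be
# `import impiatexit` followed by initialize_with_python_atexit(impiatexit).
mpi = StubMpiModule()
finalize_handler = initialize_with_python_atexit(mpi)
print(mpi.calls)
```

Calling the returned handler explicitly at the end of the main script, after atexit.unregister(finalize_handler), is the most conservative variant, because MPI_Finalize then runs well before any exit processing starts.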