Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

IMPI 2021 vs 2018: BAD TERMINATION when using std::atexit in Python extension module

BenjaminAurich
Beginner
3,064 Views

Hello,

we develop a scientific application using Python & C++ and heavily rely on MPI communication.
The current version works fine with IMPI 2018, but after upgrading to oneAPI's IMPI 2021.03 we ran into issues on Windows.

We use a C++ extension module to Python, which exposes an initialization function which effectively calls MPI_Init(). This initialization function also registers a finalization function at program exit with std::atexit, which effectively calls MPI_Finalize().

Using IMPI 2018 this works correctly, but switching to IMPI 2021.03 we end up with a BAD TERMINATION exit status.

I attached the code of a minimal example that exhibits this behavior. The example can be used to build a very simple CPython extension module, which implements the module functions "initialize" and "testmpi". The "initialize" function calls MPI_Init and registers an MPI_Finalize call at program exit using std::atexit. The "testmpi" function needs MPI to be initialized and simply prints some MPI ranks in a Hello-World fashion. The README.md details the steps to reproduce the included logs (e.g. test-impi2018.log).

I'd be glad for any hints on how to mitigate this, and please let me know if anything needs clarification (it's my first post after all :).

Thanks,
Benjamin

Labels (1)
  • MPI

1 Solution
James_T_Intel
Moderator
2,748 Views

Our development team has investigated and identified the issue. Your code calls MPI_Finalize from a function registered with atexit, which makes that atexit handler depend on the libfabric DLL. During process exit, the libfabric DLL is unloaded before the handler runs, which causes the error in this case. Per Microsoft's documentation for atexit, the code in an atexit function should not depend on any DLL that could already have been unloaded when the function is called (see https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/atexit?view=msvc-160). As such, this is expected behavior, and you will need to update your code to call MPI_Finalize explicitly before the process begins exiting, rather than from an atexit handler, in order to avoid this error.
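
One way to follow this advice from the Python side is sketched below. It is only an illustration under assumptions, not the code from the attached example: `MpiSession`, `init_fn`, and `finalize_fn` are hypothetical names, and the extension module in this thread only exposes "initialize" and "testmpi", so a separate function wrapping MPI_Finalize would have to be added to it. The idea is to register the finalization with Python's own `atexit` module, whose handlers run during interpreter shutdown, at which point the DLLs the extension depends on should still be loaded.

```python
import atexit


class MpiSession:
    """Pairs one initialization call with at most one finalization call.

    `init_fn` and `finalize_fn` are hypothetical callables. With the
    extension module from this thread they would wrap MPI_Init and
    MPI_Finalize; a finalize wrapper is an assumption, since the posted
    module does not expose one.
    """

    def __init__(self, init_fn, finalize_fn):
        self._init_fn = init_fn
        self._finalize_fn = finalize_fn
        self._active = False

    def start(self):
        """Initialize once and schedule finalization at interpreter shutdown."""
        if self._active:
            return
        self._init_fn()
        self._active = True
        # Python-level atexit handlers run while the interpreter shuts down,
        # i.e. before the C runtime's own atexit processing for the DLL.
        atexit.register(self.stop)

    def stop(self):
        """Finalize once; later calls (including the atexit one) do nothing."""
        if not self._active:
            return
        self._active = False
        self._finalize_fn()
```

Assuming the module gained a `finalize` function, usage would look like `session = MpiSession(impiatexit.initialize, impiatexit.finalize)` followed by `session.start()`. Calling `session.stop()` explicitly at the end of the main script is the most direct form of the fix; the `atexit` registration only acts as a fallback if that call is skipped.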


I am closing the associated Intel support case with this. Any further replies on this thread will be considered community-only.


5 Replies
SantoshY_Intel
Moderator
3,033 Views

Hi,


Thanks for reaching out to us.


We were able to reproduce your issue on our end. We are working on it and will get back to you soon.


Thanks & Regards,

Santosh


James_T_Intel
Moderator
2,969 Views

I have replicated the issue as well, but I am having difficulty setting up one of our analysis tools to help identify the root cause. How would I modify your build to correctly insert a library before impi.lib? I am trying to add VTmc.lib from Intel® Trace Analyzer and Collector, and when I recompile/relink with it, I get the following error at runtime:


  import impiatexit

ImportError: DLL load failed while importing impiatexit: The specified module could not be found.


BenjaminAurich
Beginner
2,931 Views

If you want to link other libraries with the extension module, you can specify them in the setup-win-impi20201-atexit.py distutils script. I assume you did that already?
Regarding the runtime DLL load failure, maybe it's this: if you are using Python 3.8 or later, you need to specify where Python is allowed to load DLLs from using os.add_dll_directory(), or place the DLLs next to the extension module. Python 3.8 and later no longer search %PATH% for the DLL dependencies of extension modules.
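
A minimal sketch of the os.add_dll_directory() approach follows. The helper name `register_dll_dirs` and the directories passed to it are hypothetical; the actual locations of the Intel MPI, libfabric, and VTmc DLLs depend on the installation. The helper does nothing on platforms where os.add_dll_directory() is unavailable (non-Windows systems, or Python older than 3.8).

```python
import os


def register_dll_dirs(candidates):
    """Add each existing directory in `candidates` to the DLL search path.

    Returns the list of directories that were actually registered.
    `register_dll_dirs` is a hypothetical helper, not part of any library.
    """
    add_dir = getattr(os, "add_dll_directory", None)
    registered = []
    if add_dir is None:
        # API not available: non-Windows platform or Python < 3.8.
        return registered
    for path in candidates:
        if path and os.path.isdir(path):
            # Pass an absolute path to the underlying Windows API.
            add_dir(os.path.abspath(path))
            registered.append(path)
    return registered
```

It would be called before the import, for example `register_dll_dirs([r"C:\path\to\mpi\dlls"])` followed by `import impiatexit`, where the path is a placeholder for wherever the dependent DLLs actually live.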

I hope that helps, ask away if I can supply more info.

James_T_Intel
Moderator
2,815 Views

I apologize for the delayed response. I have escalated this to our development team for investigation and resolution.

