In short, calling Py_Initialize() within an MPI program while using the Intel MPI 2019 or later runtime on an AMD machine results in the error "Python error: <stdin> is a directory, cannot continue". This does not occur when using Intel MPI 2018.3 on AMD, or when using an Intel machine with IMPI 2019.9. It does not matter where the executables were built (AMD or Intel); the behavior at runtime is the same.
The attached proof of concept contains a C source file with a simple MPI initialization, hello world, Python initialization, and printing of the system time from Python. It also contains a sequential version.
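For reference, here is a minimal sketch of what the MPI version of the POC does. It is not the attached file itself; the file name and exact strings are illustrative, and the attached source remains the authoritative version.

/* poc_mpi.c -- minimal sketch of the attached MPI + embedded-Python POC.
 * Names and strings here are illustrative, not the original source. */
#include <Python.h>   /* Python.h should come before standard headers */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           name, rank, size);

    /* This is the call that fails under the IMPI 2019+ runtime on AMD. */
    Py_Initialize();
    PyRun_SimpleString("import time\n"
                       "print('Today is ' + time.ctime())\n");
    Py_Finalize();

    MPI_Finalize();
    return 0;
}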
"bash build_poc.gsm" will build a sequential and MPI version of the POC.
"bash run_poc.gsm" will run the sequential version, the MPI version with the 2018.3 runtime, and the MPI version again with the 2019.9 runtime.
To run these scripts, please update your environment to point to your local compiler, MPI runtime, and Python installation. You may also need to update the compilation/linking flags based on the output of your local "python3-config --cflags" and "python3-config --ldflags".
We used an EPYC machine with Linux kernel version 2.6.32-754.el6.x86_64 and a Skylake-X machine with Linux kernel version 3.10.0-1062.el7.x86_64. Our colleagues have been able to reproduce this issue on AMD machines with EL7, other Linux distributions entirely, and newer EPYC chips than our test machine, so this does not seem specific to one version of Linux, or a specific generation of hardware.
Here is the output from the Intel machine:
[gsm@bruser018 POC]$ bash run_poc.gsm
Running sequential
Today is Tue May 11 23:03:18 2021
Running MPI 2018.3
Hello world from processor bruser018.esi-internal.esi-group.com, rank 0 out of 2 processors
Hello world from processor bruser018.esi-internal.esi-group.com, rank 1 out of 2 processors
Today is Tue May 11 23:03:19 2021
Today is Tue May 11 23:03:19 2021
Running MPI 2019.9
Hello world from processor bruser018.esi-internal.esi-group.com, rank 0 out of 2 processors
Hello world from processor bruser018.esi-internal.esi-group.com, rank 1 out of 2 processors
Today is Tue May 11 23:03:20 2021
Today is Tue May 11 23:03:20 2021
Here is the output from the AMD machine:
[gsm@bruser033 POC]$ bash run_poc.gsm
Running sequential
Today is Tue May 11 23:03:06 2021
Running MPI 2018.3
Hello world from processor bruser033.esi-internal.esi-group.com, rank 0 out of 2 processors
Hello world from processor bruser033.esi-internal.esi-group.com, rank 1 out of 2 processors
Today is Tue May 11 23:03:07 2021
Today is Tue May 11 23:03:07 2021
Running MPI 2019.9
Hello world from processor bruser033.esi-internal.esi-group.com, rank 1 out of 2 processors
Hello world from processor bruser033.esi-internal.esi-group.com, rank 0 out of 2 processors
Python error: <stdin> is a directory, cannot continue
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 63349 RUNNING AT bruser033.esi-internal.esi-group.com
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
As you can see, all the tests succeed on Intel. The sequential and IMPI 2018.3 tests succeed on AMD, but the 2019.9 test fails.
We would greatly appreciate knowledge of any potential workarounds for this issue. Thanks!
Hi,
Thanks for providing the source code. We tested it on an Intel platform and it worked fine. We are working on the issue and will get back to you soon.
Thanks & Regards
Shivani
We were able to find a workaround: redirecting stdin to all procs by passing "-s all" to mpirun.
Thank you for your detailed report. It is great that you found a workaround. The engineering team is working on addressing this issue in a more systematic way.
Our engineering team confirmed that this is a known issue. Unfortunately, the fix is missing from 2021.2 and all 2019 versions, but 2021.3 should include it.
The current workaround is still ‘-s all’. The issue is not related to non-Intel CPUs but to gcc; it is assumed that you used icc on IA.
Our engineering team saw the same behavior on IA with gfortran. The issue is that mpiexec closes fd 0 by default for all ranks except rank 0, and the gcc runtime does not always handle this correctly.
‘-s all’ keeps fd 0 open even though it is not needed. IMPI 2021.3 will include a fix: it will no longer close fd 0 by default.
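To illustrate the failure mode, an application can also guard against a closed fd 0 on its own side before calling Py_Initialize(). This is only a sketch of the idea under the assumption that the closed descriptor is the sole problem; it is not part of Intel MPI and the ‘-s all’ workaround remains the recommended approach until 2021.3.

/* Sketch: if the launcher closed fd 0, reopen it on /dev/null before
 * Py_Initialize() so the interpreter never sees an invalid stdin. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

static void ensure_stdin_open(void)
{
    /* fcntl() on a closed descriptor fails with EBADF. */
    if (fcntl(STDIN_FILENO, F_GETFD) == -1 && errno == EBADF) {
        /* open() returns the lowest free descriptor, i.e. 0 here. */
        open("/dev/null", O_RDONLY);
    }
}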
Hello,
Did we answer your question? Is there anything else we can help you with?
Hello,
This question will no longer be handled by Intel support due to inactivity.
