Intel® Fortran Compiler

f2py-wrapped Python extensions hang on import when compiled using -ax flags

brelje__ben

Operating system and version: centos:7.6.1810 Docker image; also a CentOS 7-based HPC system

Compiler version: 2018.2

CPU: Intel® Core™ i7-6700K CPU @ 4.00GHz × 8; also Haswell and Skylake nodes on the TACC Stampede2 cluster

I am trying to build a high-performance Docker container to run a scientific computing workload using Singularity on both AVX2-capable desktop machines and AVX512-capable HPC machines (such as the machine I mainly use, TACC's Stampede2).

The workload is my lab's ADflow flow solver (https://github.com/mdolab/adflow), which is an f2py-wrapped Fortran extension to Python. I have a Dockerfile that successfully builds ADflow and passes its regression tests using the Intel 2018.2 compilers. To add the option of using AVX512 instructions, I followed the recommended instructions at https://portal.tacc.utexas.edu/user-guides/stampede2 to build a fat binary, using the following flags in ifort and icc:

-xCORE-AVX2 -axCORE-AVX512
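
For reference, which of these two targets a given node can execute can be checked from the CPU feature flags that the Linux kernel reports. The following is a minimal sketch and not part of our build. The function names are hypothetical, and the flag-to-target mapping is an assumption that checks only the headline features (`avx2` for CORE-AVX2; `avx512f`, `avx512cd`, `avx512bw`, `avx512dq`, and `avx512vl` for CORE-AVX512).

```python
# Hypothetical helper names; the mapping below is a simplification.
AVX512_CORE_FLAGS = {"avx512f", "avx512cd", "avx512bw", "avx512dq", "avx512vl"}


def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def runnable_targets(flags):
    """Return which of the two -x/-ax targets the given flags appear to cover."""
    targets = []
    if "avx2" in flags:
        targets.append("CORE-AVX2")
    # Subset test: every assumed AVX-512 flag must be present.
    if AVX512_CORE_FLAGS <= flags:
        targets.append("CORE-AVX512")
    return targets
```

On a Linux node, passing the contents of /proc/cpuinfo to `cpu_flags` and the result to `runnable_targets` should list both targets on a Skylake server node and only CORE-AVX2 on the i7-6700K desktop.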

First, I tried adding the above -x and -ax flags to the existing compiler flags in our makefile: "-fPIC -r8 -O2 -g". The program compiles and links as expected, but when I run it, it hangs as soon as the shared library (libadflow.so) is imported into Python. GDB shows the following trace (gdb "where"):

#0  0x00007fffcc801e70 in __intel_cpu_features_init_body () from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libirc.so
#1  0x00007fffcc801e2b in __intel_cpu_features_init () from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libirc.so
#2  0x00007fffc7fa2a0a in fortran_setattr () from /tmp/tmp0_y01fj8/libadflow.so
#3  0x00007fffd6d79240 in ?? ()
#4  0x000055555563c057 in PyObject_SetAttr (v=0x7fffd6efd3c0, name=<optimized out>, value=0x7fffd581c590) at /tmp/build/80754af9/python_1565725737370/work/Objects/object.c:1024
#5  0x00005555557201ba in _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:2015
#6  0x0000555555668539 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:3930
#7  0x0000555555669860 in _PyFunction_FastCallDict () at /tmp/build/80754af9/python_1565725737370/work/Objects/call.c:376
#8  0x0000555555687e53 in _PyObject_Call_Prepend () at /tmp/build/80754af9/python_1565725737370/work/Objects/call.c:908
#9  0x00005555556bf97a in slot_tp_init () at /tmp/build/80754af9/python_1565725737370/work/Objects/typeobject.c:6636
#10 0x00005555556c0588 in type_call (kwds=0x7fffd6f11b40, args=0x7ffff7e20050, type=<optimized out>) at /tmp/build/80754af9/python_1565725737370/work/Objects/typeobject.c:971
#11 _PyObject_FastCallKeywords () at /tmp/build/80754af9/python_1565725737370/work/Objects/call.c:199
#12 0x0000555555724a8f in call_function (kwnames=0x7ffff6d97550, oparg=<optimized out>, pp_stack=<synthetic pointer>) at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:4619
#13 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:3139
#14 0x0000555555668539 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:3930
#15 0x0000555555669424 in PyEval_EvalCodeEx () at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:3959
#16 0x000055555566944c in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:524
#17 0x000055555577eb74 in run_mod () at /tmp/build/80754af9/python_1565725737370/work/Python/pythonrun.c:1035
#18 0x0000555555788eb1 in PyRun_FileExFlags () at /tmp/build/80754af9/python_1565725737370/work/Python/pythonrun.c:988
#19 0x00005555557890a3 in PyRun_SimpleFileExFlags () at /tmp/build/80754af9/python_1565725737370/work/Python/pythonrun.c:429
#20 0x000055555578a195 in pymain_run_file (p_cf=0x7fffffffd340, filename=0x5555558c03c0 L"runScript.py", fp=0x5555558e3fd0) at /tmp/build/80754af9/python_1565725737370/work/Modules/main.c:433
#21 pymain_run_filename (cf=0x7fffffffd340, pymain=0x7fffffffd450) at /tmp/build/80754af9/python_1565725737370/work/Modules/main.c:1612
#22 pymain_run_python (pymain=0x7fffffffd450) at /tmp/build/80754af9/python_1565725737370/work/Modules/main.c:2873
#23 pymain_main () at /tmp/build/80754af9/python_1565725737370/work/Modules/main.c:3413
#24 0x000055555578a2bc in _Py_UnixMain () at /tmp/build/80754af9/python_1565725737370/work/Modules/main.c:3448
#25 0x00007ffff7813505 in __libc_start_main () from /lib64/libc.so.6
#26 0x000055555572f062 in _start () at ../sysdeps/x86_64/elf/start.S:103

After removing the -ax flag, cleaning, and recompiling, the program no longer hangs and works as expected.
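
To compare flag combinations without watching each run by hand, the import can be attempted in a child interpreter with a timeout. This is a minimal sketch, assuming Python 3.5 or newer for `subprocess.run`; `import_hangs` is a hypothetical helper, and the 60-second default is an arbitrary choice.

```python
import subprocess
import sys


def import_hangs(module_name, timeout_s=60.0, python=sys.executable):
    """Return True if `import module_name` does not finish within timeout_s.

    The import runs in a separate interpreter so that a hang cannot block
    the caller; subprocess.run kills the child when the timeout expires.
    """
    try:
        subprocess.run(
            [python, "-c", "import " + module_name],
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return True
    # The child exited on its own (successfully or with an ImportError).
    return False
```

For example, `import_hangs("numpy")` would be expected to return False for a working build, and the same call with the extension's module name would return True for a build that shows the hang. The `python` argument lets the same check be pointed at a different interpreter.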

I also tried backing up a step and compiling numpy 1.16.1 from source as a fat binary. I edited the numpy distutils scripts to add the compiler flags -xCORE-AVX2 -axCORE-AVX512 for icc and ifort, then built numpy. Numpy imports into Python without issue (when imported by itself) and passes its tests. However, when I try to import my f2py-wrapped extension (compiled WITHOUT -ax flags), it hangs again. Tracing in gdb, it hangs in the same function but in a different Intel library:

(gdb) where
#0  0x00007fffea0a1421 in __intel_cpu_features_init_body () from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libintlc.so.5
#1  0x00007fffea0a126b in __intel_cpu_features_init () from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libintlc.so.5
#2  0x00007fffe768da0c in PyUFunc_FromFuncAndDataAndSignatureAndIdentity () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#3  0x00007fffe7b4c508 in trunc_functions () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#4  0x00007fffe740bb50 in PyUFunc_O_O () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#5  0x00007fffe740b270 in PyUFunc_dd_d () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#6  0x00000000ffffffff in ?? ()
#7  0x00007fffe737ffb0 in nc_tanl () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#8  0x00007fffe73970d3 in InitOperators () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#9  0x00007fffe739371e in init_multiarray_umath () from /usr/lib64/python2.7/site-packages/numpy-1.16.1-py2.7-linux-x86_64.egg/numpy/core/_multiarray_umath.so
#10 0x00007ffff7b08db9 in _PyImport_LoadDynamicModule () from /lib64/libpython2.7.so.1.0
#11 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#12 0x00007ffff7b070dd in load_next () from /lib64/libpython2.7.so.1.0
#13 0x00007ffff7b07af8 in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#14 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#15 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#16 0x00007ffff7aec947 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#17 0x00007ffff7af1605 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#18 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#19 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#20 0x00007ffff7b05f7c in PyImport_ExecCodeModuleEx () from /lib64/libpython2.7.so.1.0
#21 0x00007ffff7b061f8 in load_source_module () from /lib64/libpython2.7.so.1.0
#22 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#23 0x00007ffff7b0738f in ensure_fromlist () from /lib64/libpython2.7.so.1.0
#24 0x00007ffff7b07bca in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#25 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#26 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#27 0x00007ffff7aec947 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#28 0x00007ffff7af1605 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#29 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#30 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#31 0x00007ffff7b05f7c in PyImport_ExecCodeModuleEx () from /lib64/libpython2.7.so.1.0
#32 0x00007ffff7b061f8 in load_source_module () from /lib64/libpython2.7.so.1.0
#33 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#34 0x00007ffff7b0738f in ensure_fromlist () from /lib64/libpython2.7.so.1.0
#35 0x00007ffff7b07bca in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#36 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#37 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#38 0x00007ffff7aec947 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#39 0x00007ffff7af1605 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#40 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#41 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#42 0x00007ffff7b05f7c in PyImport_ExecCodeModuleEx () from /lib64/libpython2.7.so.1.0
#43 0x00007ffff7b061f8 in load_source_module () from /lib64/libpython2.7.so.1.0
#44 0x00007ffff7b0768a in load_package () from /lib64/libpython2.7.so.1.0
#45 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#46 0x00007ffff7b0738f in ensure_fromlist () from /lib64/libpython2.7.so.1.0
#47 0x00007ffff7b07bca in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#48 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#49 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#50 0x00007ffff7aec947 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#51 0x00007ffff7af1605 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#52 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#53 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#54 0x00007ffff7b05f7c in PyImport_ExecCodeModuleEx () from /lib64/libpython2.7.so.1.0
#55 0x00007ffff7b061f8 in load_source_module () from /lib64/libpython2.7.so.1.0
#56 0x00007ffff7b0768a in load_package () from /lib64/libpython2.7.so.1.0
#57 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#58 0x00007ffff7b070dd in load_next () from /lib64/libpython2.7.so.1.0
#59 0x00007ffff7b07abe in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#60 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#61 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#62 0x00007ffff7a5ab95 in call_function_tail () from /lib64/libpython2.7.so.1.0
#63 0x00007ffff7a5ac7e in PyObject_CallFunction () from /lib64/libpython2.7.so.1.0
#64 0x00007ffff7b08562 in PyImport_Import () from /lib64/libpython2.7.so.1.0
#65 0x00007ffff7b086da in PyImport_ImportModule () from /lib64/libpython2.7.so.1.0
#66 0x00007ffff6900a5f in initidwarp () from /home/mdolabuser/repos/idwarp/src/f2py/idwarp.so
#67 0x00007ffff7b08db9 in _PyImport_LoadDynamicModule () from /lib64/libpython2.7.so.1.0
#68 0x00007ffff7b06e91 in import_submodule () from /lib64/libpython2.7.so.1.0
#69 0x00007ffff7b070dd in load_next () from /lib64/libpython2.7.so.1.0
#70 0x00007ffff7b07abe in PyImport_ImportModuleLevel () from /lib64/libpython2.7.so.1.0
#71 0x00007ffff7aead6f in builtin___import__ () from /lib64/libpython2.7.so.1.0
#72 0x00007ffff7a5aab3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#73 0x00007ffff7aec947 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0
#74 0x00007ffff7af1605 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#75 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#76 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#77 0x00007ffff7b0f5cf in run_mod () from /lib64/libpython2.7.so.1.0
#78 0x00007ffff7b10445 in PyRun_StringFlags () from /lib64/libpython2.7.so.1.0
#79 0x00007ffff7aef685 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#80 0x00007ffff7af608d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#81 0x00007ffff7af6192 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#82 0x00007ffff7b0f5cf in run_mod () from /lib64/libpython2.7.so.1.0
#83 0x00007ffff7b1079e in PyRun_FileExFlags () from /lib64/libpython2.7.so.1.0
#84 0x00007ffff7b11a29 in PyRun_SimpleFileExFlags () from /lib64/libpython2.7.so.1.0
#85 0x00007ffff7b22bdf in Py_Main () from /lib64/libpython2.7.so.1.0
#86 0x00007ffff6d3e505 in __libc_start_main () from /lib64/libc.so.6
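
Both traces stop in `__intel_cpu_features_init_body`, but one is inside libirc.so and the other inside libintlc.so.5, so it may help to list which Intel runtime libraries are mapped into the hung process. The following is a minimal sketch that parses text in the format of /proc/<pid>/maps on Linux. The function name is hypothetical, and the prefix tuple covers only the two libraries named in the traces (it would need extending for others).

```python
import os

# Only the two libraries that appear in the traces; extend as needed.
INTEL_RUNTIME_PREFIXES = ("libirc", "libintlc")


def intel_runtime_libs(maps_text):
    """Return the sorted, de-duplicated paths of Intel runtime libraries
    found in /proc/<pid>/maps-style text."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no pathname
        path = fields[5].strip()
        if os.path.basename(path).startswith(INTEL_RUNTIME_PREFIXES):
            libs.add(path)
    return sorted(libs)
```

The input would be the contents of /proc/<pid>/maps, with <pid> being the hung Python process. Seeing both libraries in the output would only confirm that two copies of the CPU-dispatch initialization code are present in one process; it would not by itself establish the cause of the hang.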

I have also tried a different compiler version (2019.5) with the same result. I have tried the system Python, the Intel Python distribution, and Anaconda Python, all with the same result. The *only* configuration that has worked is TACC Stampede2's home-grown Python and Numpy distribution on their cluster. When I use a different Python and Numpy (Anaconda Python, conda numpy with MKL) on a Stampede2 Skylake node, bare metal, it hangs.

Any ideas?

Thanks,

Ben Brelje

University of Michigan MDOLab

brelje__ben

This is the GDB trace from a hung run on Stampede2 (bare metal). For this run I used Anaconda Python instead of the system Python; the system Python works fine. So this is not a Docker-specific issue.

#0  0x00002aaacfa15e70 in __intel_cpu_features_init_body ()
   from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libirc.so
#1  0x00002aaacfa15e2b in __intel_cpu_features_init ()
   from /opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin/libirc.so
#2  0x00002aaad16905aa in fortran_setattr () from /tmp/tmpVKFd0I/libadflow.so
#3  0x00002aaacae86674 in ?? ()
#4  0x00002aaaaab591c2 in PyObject_SetAttr () from /home1/06381/tg856804/miniconda/bin/../lib/libpython2.7.so.1.0
#5  0x00002aaaaabb2de1 in PyEval_EvalFrameEx () from /home1/06381/tg856804/miniconda/bin/../lib/libpython2.7.so.1.0
#6  0x00002aaaaabb8a99 in PyEval_EvalCodeEx () from /home1/06381/tg856804/miniconda/bin/../lib/libpython2.7.so.1.0
#7  0x00002aaaaab417c7 in function_call () from /home1/06381/tg856804/miniconda/bin/../lib/libpython2.7.so.1.0
#8  0x00002aaaaab1cb73 in PyObject_Call () from /home1/06381/tg856804/miniconda/bin/../lib/libpython2.7.so.1.0
 
