Jason_W_1
Beginner
189 Views

Python SciPy Compilation Errors (still) with MKL

Hello,

This has been an issue for years. Are there any plans to address the ifort bug described here?

https://github.com/scipy/scipy/issues/5621

The only workaround to date has been to compile with less aggressive optimization (-O1 instead of -O3). At the very least, can we get an idea of the performance hit (if any) we take by using the -O1 flag? Thanks!

5 Replies
Gennady_F_Intel
Moderator
189 Views

From a performance point of view, MKL code does not depend on the compiler flags you use when compiling your own program.

TimP
Black Belt
189 Views

Nor does MKL use -O1 to set conditional numerical reproducibility. If you have numerical issues with aggressive vectorization in icc, you should consider setting options more consistent with what you would use for gcc, such as -fp-model source.

Jason_W_1
Beginner
189 Views

Gennady,

Thanks for the quick reply! That makes sense; however, I have a couple of follow-up questions, if you would oblige me (I'm a data scientist, not a software engineer, so I'm trying to understand the full scope of what I'm dealing with here). From strictly a performance point of view, let's say we have two distinct installations of MKL + NumPy + SciPy. The first has NumPy and SciPy compiled with -O1 optimization and the second with -O3 optimization. If we run the same program on both setups, will the -O3-compiled installation 1) execute our program more efficiently, resulting in faster performance, and/or 2) give greater numeric precision?

If so, what kind of difference are we talking here for both speed and precision?

Thanks again for your help!

Jason_W_1
Beginner
189 Views

Tim,

Thank you also for your reply. I did use -fp-model strict, but I'll try your suggestion of using source instead.

189 Views

Dear Jason, 

As Tim has said, it is best to identify the exact optimization causing the test failures in ODR and address that specific issue directly, rather than lowering the optimization level across the board, which disables a whole battery of optimizations and code transformations.

Unfortunately we have not made progress in identifying this specific Intel Fortran Compiler optimization/code-transformation step yet.

However, while building SciPy for Intel (R) Distribution for Python*, we were able to use `-O3` instead of `-O1` for the entire SciPy, while lowering optimization level only for the odr module.

To see the details, please download the SciPy conda tarball from the Intel channel, https://anaconda.org/intel/scipy/files . The archive contains an info/recipe folder, which includes our patches. In particular, in scipy/odr/setup.py, we added 

diff --git a/scipy/odr/setup.py b/scipy/odr/setup.py
index 9974dfa..aad4efe 100644
--- a/scipy/odr/setup.py
+++ b/scipy/odr/setup.py
@@ -22,7 +22,7 @@ def configuration(parent_package='', top_path=None):
         libodr_files.append('d_lpkbls.f')
 
     odrpack_src = [join('odrpack', x) for x in libodr_files]
-    config.add_library('odrpack', sources=odrpack_src)
+    config.add_library('odrpack', sources=odrpack_src, extra_f77_compile_args=['-O1'])
 
     sources = ['__odrpack.c']
     libraries = ['odrpack'] + blas_info.pop('libraries', [])

while replacing `-O1` with `-O3` in NumPy's distutils settings for the Intel (R) Fortran compiler.

This made the tests pass on 64-bit Linux, but on macOS and Windows further reductions of the optimization level were necessary. Specifically, the `_iterative` extension in scipy/sparse/linalg/isolve was lowered to `-O1`, and the `_fblas` extension in `scipy/linalg/setup.py` was lowered to `-O2`.
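
To summarize the per-module settings described above, here is a minimal Python sketch of the lookup. It is only an illustration: the function name `opt_flag`, the dictionary names, and the "package:target" labels are hypothetical and are not part of NumPy's distutils or of the Intel recipe. The flag values are the ones quoted in this post.

```python
import sys

# Level used for everything that is not listed below.
DEFAULT_OPT = "-O3"

# Lowered on every platform (the odrpack library patched in scipy/odr/setup.py).
# The "package:target" keys are illustrative labels only, not build-system names.
OVERRIDES_ALL = {"scipy.odr:odrpack": "-O1"}

# Additionally lowered on macOS and Windows, as described in this post.
OVERRIDES_MAC_WIN = {
    "scipy.sparse.linalg.isolve:_iterative": "-O1",
    "scipy.linalg:_fblas": "-O2",
}


def opt_flag(target, platform=sys.platform):
    """Return the optimization flag for a build target on a given platform."""
    overrides = dict(OVERRIDES_ALL)
    # sys.platform is "darwin" on macOS and "win32" on Windows.
    if platform.startswith(("darwin", "win")):
        overrides.update(OVERRIDES_MAC_WIN)
    return overrides.get(target, DEFAULT_OPT)
```

The returned value could then be passed the same way as in the patch, for example as `extra_f77_compile_args=[opt_flag("scipy.odr:odrpack")]`.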

Although the tests no longer fail with `-O3`, vectorization is still inhibited by the use of `-fp-model strict`.
