Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Two different FPE with Data Fitting library

Towie__Ewan
New Contributor I

Hi there,

I've encountered two different floating-point exceptions (FPEs) with the MKL Data Fitting library, and they have been driving me a little insane!

I previously mentioned the first FPE in another thread here (https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/840833), but at the time I couldn't produce a small reproducer to demonstrate the problem. This FPE occurs during construction of the Akima spline when the data provided for the spline contains multiple repeated y-values. It can be reproduced consistently using the attached reproducer and the compile/link lines below.
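One plausible mechanism for this first FPE, offered as an assumption rather than a diagnosis of MKL's internals: in the textbook Akima formula the derivative at a node is a weighted average of the neighbouring secant slopes, and the weights sum to zero when consecutive slopes are equal, which is what runs of repeated y-values produce. The sketch below is plain C++ with no MKL calls, and all function names in it are hypothetical. It shows the resulting 0/0 raising the invalid-operation flag, next to a guarded variant.

```cpp
#include <cfenv>
#include <cmath>

// Textbook Akima node derivative from four neighbouring secant slopes
// m0..m3 (m1 and m2 are the slopes either side of the node).
// Deliberately unguarded: a zero denominator is NOT handled.
// 'volatile' keeps the arithmetic at run time so the FP status flags
// reflect what the division actually did.
double akima_derivative_unguarded(double m0, double m1, double m2, double m3) {
    volatile double w1 = std::fabs(m3 - m2);
    volatile double w2 = std::fabs(m1 - m0);
    volatile double num = w1 * m1 + w2 * m2;
    volatile double den = w1 + w2;
    return num / den;  // 0/0 when m0 == m1 and m2 == m3
}

// Guarded variant: when both weights vanish, fall back to the mean of
// the two middle slopes instead of dividing by zero.
double akima_derivative_guarded(double m0, double m1, double m2, double m3) {
    const double w1 = std::fabs(m3 - m2);
    const double w2 = std::fabs(m1 - m0);
    const double den = w1 + w2;
    if (den == 0.0) {
        return 0.5 * (m1 + m2);
    }
    return (w1 * m1 + w2 * m2) / den;
}

// Returns true if the unguarded formula raises FE_INVALID for these slopes.
// This only inspects the status flag; it does not enable trapping.
bool raises_invalid(double m0, double m1, double m2, double m3) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double t = akima_derivative_unguarded(m0, m1, m2, m3);
    (void)t;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Four flat segments (all slopes zero) make the unguarded version compute 0/0, while the guarded version returns the mean of the two middle slopes, which is the tie-break Akima's original method prescribes. If floating-point traps are enabled in the application (for example via glibc's feenableexcept, which is an assumption about the setup here), the same 0/0 would surface as SIGFPE rather than a quiet NaN.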

The second FPE occurs during construction of the Natural spline and seems to occur only on certain machines within our cluster. I'll provide the kernel release and CPU info for one of these machines below, but multiple machines reproduce the FPE. Enabling the MKL CNR (Conditional Numerical Reproducibility) compatibility mode appears to work around this FPE for now. Again, the attached reproducer will consistently produce this FPE on the affected machines.
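For anyone wanting to try the same workaround, here is a minimal sketch of requesting CNR compatibility mode from code. It uses MKL's `mkl_cbwr_set` with `MKL_CBWR_COMPATIBLE`; the wrapper function, the enum, and the `SKETCH_HAVE_MKL` macro are hypothetical names added for illustration, and the `__has_include` guard is only there so the snippet still compiles on a machine without the MKL headers. As far as I recall from the documentation, the call needs to happen before any other MKL function is used.

```cpp
// Pull in MKL only when its header is visible to the compiler.
#if defined(__has_include)
#  if __has_include(<mkl.h>)
#    include <mkl.h>
#    define SKETCH_HAVE_MKL 1
#  endif
#endif

// Outcome of asking MKL for CNR compatibility mode (hypothetical type).
enum class CnrRequest { Enabled, Rejected, MklUnavailable };

// Intended to be called once, at start-up, before any other MKL routine.
CnrRequest request_cnr_compatible() {
#ifdef SKETCH_HAVE_MKL
    // mkl_cbwr_set returns MKL_CBWR_SUCCESS when the branch was accepted.
    return mkl_cbwr_set(MKL_CBWR_COMPATIBLE) == MKL_CBWR_SUCCESS
               ? CnrRequest::Enabled
               : CnrRequest::Rejected;
#else
    // Built without MKL headers: nothing to configure.
    return CnrRequest::MklUnavailable;
#endif
}
```

The same mode can also be selected without recompiling by setting the environment variable `MKL_CBWR=COMPATIBLE` before launching the program. Either way this is a workaround that changes which code path MKL dispatches to, not a fix for the underlying exception.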

I compile using GCC 7.3 and use MKL 2019u5. I'm compiling on a CentOS 7 box, and running on other CentOS 7 machines.

After sourcing the Intel PSXE script 'psxevars.sh', I compile and link the reproducer using the following commands:

g++ -m64 -I${MKLROOT}/include -c CubicSpline_MKL.cc
g++ -m64 -I${MKLROOT}/include -c test_MKLSpline.cc
g++ -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl CubicSpline_MKL.o test_MKLSpline.o -o test_MKLSpline

If you have any further questions or want more specific information, I'm happy to help. It has taken me a while to get this reproducer, so I'm very keen to try and resolve these issues :)

Thanks,

Ewan


Details of a machine that exhibits the second FPE with the attached testcase:

  •     uname -a:
    • Linux haggis41 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  •     summarised cat /proc/cpuinfo
    •         processor       : 23
    •         vendor_id       : GenuineIntel
    •         cpu family      : 6
    •         model           : 85
    •         model name      : Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
    •         stepping        : 4
    •         microcode       : 0x200005e
    •         cpu MHz         : 2301.000
    •         cache size      : 16896 KB
    •         physical id     : 1
    •         siblings        : 12
    •         core id         : 13
    •         cpu cores       : 12
    •         apicid          : 58
    •         initial apicid  : 58
    •         fpu             : yes
    •         fpu_exception   : yes
    •         cpuid level     : 22
    •         wp              : yes
    •         flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_pt spec_ctrl ibpb_support tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
    •         bogomips        : 4606.57
    •         clflush size    : 64
    •         cache_alignment : 64
    •         address sizes   : 46 bits physical, 48 bits virtual
    •         power management:
7 Replies
Gennady_F_Intel
Moderator

Thanks for the reproducer, Ewan. We will investigate the case.

Gennady_F_Intel
Moderator

Edward, we reproduced both of these issues and will keep this thread updated with our progress.

Towie__Ewan
New Contributor I

I'm not sure who Edward is, but I'm glad the reproducer works on your systems too.

Ewan

Towie__Ewan
New Contributor I

Has there been any update on this issue?

Gennady_F_Intel
Moderator

Ewan, there is no update on this issue so far. Do you have a deadline by which you need this fix?

Towie__Ewan
New Contributor I

On my end, I've missed the boat for our upcoming release. However, I'm hoping I can get a fix for these issues within our next cycle.

That would mean ideally a tested fix by June/July. Is this a reasonable expectation?

Gennady_F_Intel
Moderator

Yes, we are planning to do that, but these plans are subject to change and we cannot guarantee that the fix will be in the next update (Update 2). We will keep you informed of our progress.

Gennady

 
