Hi there,
I've encountered two different floating-point exceptions (FPEs) with the MKL Data Fitting library, which have been driving me a little insane!
I previously mentioned the first FPE in another thread here (https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/840833), but at the time I couldn't produce a small reproducer to demonstrate the problem. This FPE occurs during construction of the Akima spline when multiple repeated y-values are provided for a spline. It can be consistently reproduced using the attached reproducer and the compile/link lines below.
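To illustrate how repeated y-values can lead to a floating-point exception at all, here is a minimal sketch. It uses the textbook Akima slope formula written naively, which is an assumption for illustration only and not MKL's actual implementation; the function names `akima_slope_naive` and `raises_invalid` are hypothetical. When neighbouring secant slopes are all equal (as with flat data), both Akima weights are zero and the formula evaluates 0/0, which raises FE_INVALID.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <vector>

// Textbook Akima slope at an interior node, written naively with no guard
// for the degenerate case. m holds the secant slopes of the four
// neighbouring intervals: m[0]=m_{i-2}, m[1]=m_{i-1}, m[2]=m_i, m[3]=m_{i+1}.
// (Illustrative only; this is NOT taken from MKL.)
double akima_slope_naive(const double m[4]) {
    // volatile keeps the compiler from folding the division at compile time
    volatile double w1 = std::fabs(m[3] - m[2]);
    volatile double w2 = std::fabs(m[1] - m[0]);
    return (w1 * m[1] + w2 * m[2]) / (w1 + w2);  // 0/0 if both weights vanish
}

// Returns true if evaluating the naive slope at the middle node of five
// points sets the FE_INVALID floating-point flag.
bool raises_invalid(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() >= 5 && y.size() >= 5);
    double m[4];
    for (int k = 0; k < 4; ++k) {
        m[k] = (y[k + 1] - y[k]) / (x[k + 1] - x[k]);  // secant slopes
    }
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double slope = akima_slope_naive(m);
    (void)slope;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling `raises_invalid` with five equal y-values reports the invalid-operation flag, while strictly curved data such as y = x^2 does not. If floating-point traps are enabled in the application, that same flag is delivered as a SIGFPE instead of being silently recorded.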
The second FPE occurs during construction of the Natural spline and seems to occur only on certain machines within our cluster. I'll provide the kernel release and CPU info for one of those machines below, but there are multiple machines that reproduce the FPE. This FPE seems to be worked around, at least temporarily, by enabling the MKL CNR (Conditional Numerical Reproducibility) compatibility mode. Again, the attached reproducer will consistently produce this FPE on the affected machines.
I compile using GCC 7.3 and use MKL 2019u5. I'm compiling on a CentOS 7 box, and running on other CentOS 7 machines.
After sourcing the Intel PSXE script 'psxevars.sh', I compile and link the reproducer using the following commands:
g++ -m64 -I${MKLROOT}/include -c CubicSpline_MKL.cc
g++ -m64 -I${MKLROOT}/include -c test_MKLSpline.cc
g++ -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl CubicSpline_MKL.o test_MKLSpline.o -o test_MKLSpline
If there are any further questions, or you want some more specific info then I'm happy to help. It has taken me a while to get this reproducer, so I'm very keen to try and resolve these issues :)
Thanks,
Ewan
Details of a machine that exhibits the second FPE with the attached testcase:
- uname -a:
- Linux haggis41 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- summarised cat /proc/cpuinfo:
- processor : 23
- vendor_id : GenuineIntel
- cpu family : 6
- model : 85
- model name : Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
- stepping : 4
- microcode : 0x200005e
- cpu MHz : 2301.000
- cache size : 16896 KB
- physical id : 1
- siblings : 12
- core id : 13
- cpu cores : 12
- apicid : 58
- initial apicid : 58
- fpu : yes
- fpu_exception : yes
- cpuid level : 22
- wp : yes
- flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_pt spec_ctrl ibpb_support tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
- bogomips : 4606.57
- clflush size : 64
- cache_alignment : 64
- address sizes : 46 bits physical, 48 bits virtual
- power management:
Thanks for the reproducer, Ewan. We will investigate the case.
Edward, we reproduced both of these issues and will keep this thread updated with our progress.
I'm not sure who Edward is, but I'm glad the reproducer works on your systems too.
Ewan
Has there been any update on this issue?
Ewan, there is no update on this issue so far. Do you have a deadline by which you need this fix?
On my end, I've missed the boat for our upcoming release. However, I'm hoping I can get a fix for these issues within our next cycle.
That would mean ideally a tested fix by June/July. Is this a reasonable expectation?
Yes, we are planning to do that, but these plans are subject to change and we cannot guarantee that the fix will be included in the next update (Update 2). We will keep you informed of our progress.
Gennady