Hi,
I am seeing performance issues with the function dgesvd when running in 64-bit with AVX2 (MKL_CBWR=AVX2).
For some matrix sizes, the SVD takes up to 25 times longer in 64-bit than in 32-bit!
You can reproduce this with the attached test. On my side, I get these durations for one SVD of an m x n matrix:
- 101x63 : 32bit = 2ms, 64bit = 1.4ms;
- 101x64 : 32bit = 2ms, 64bit = 20ms;
- 102x64 : 32bit = 2ms, 64bit = 1.4ms;
- 103x103 : 32bit = 4ms, 64bit = 100ms;
There is no problem with MKL_CBWR=AVX.
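To make the size dependence easier to see, the timings listed above can be turned into 64-bit/32-bit slowdown ratios. This is only a sketch: the numbers are copied from the list above, and the names `timings_ms` and `slowdown` are illustrative, not part of MKL or of the attached test.

```python
# Timings reported above, in milliseconds: (rows, cols) -> (32-bit, 64-bit).
timings_ms = {
    (101, 63): (2.0, 1.4),
    (101, 64): (2.0, 20.0),
    (102, 64): (2.0, 1.4),
    (103, 103): (4.0, 100.0),
}


def slowdown(t32_ms, t64_ms):
    """Return how many times slower the 64-bit run is than the 32-bit run."""
    return t64_ms / t32_ms


ratios = {size: slowdown(*t) for size, t in timings_ms.items()}

for (m, n), ratio in ratios.items():
    # The 2.0 threshold is an arbitrary choice for flagging outliers.
    flag = "slow" if ratio > 2.0 else "ok"
    print(f"{m}x{n}: 64-bit/32-bit = {ratio:.1f}x ({flag})")
```

With these numbers, only 101x64 and 103x103 are flagged; the neighbouring sizes 101x63 and 102x64 are actually faster in 64-bit.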
Could you please have a look?
My configuration:
- Composer 2019 Update 4 (same behaviour with 2018 Update 4)
- BasePlatformToolSet : vc12
- Win 10 Enterprise 64bit
- CPU: i7-6820HQ
Regards,
Guillaume A.
Is that in threaded mode? Could you add the verbose-mode output for the 103x103 case?
Hello Gennady,
Indeed, I forgot to specify that I am using sequential mode.
Here are the outputs with MKL_VERBOSE=1 for one SVD:
64-bit:
MKL_VERBOSE Intel(R) MKL 2019.0 Update 4 Product build 20190411 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Win 2.70GHz cdecl sequential
MKL_VERBOSE DGESVD(A,A,103,103,0000000000FE5E40,103,0000000000FFAA40,0000000000FFAE00,103,000000000100FA00,103,0000000000CFF520,-1,0) 147.89us CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
MKL_VERBOSE DGESVD(A,A,103,103,0000000001055300,103,0000000000FFAA40,0000000001069F80,103,000000000107EB80,103,000000000104E200,3605,0) 112.26ms CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
32-bit:
MKL_VERBOSE Intel(R) MKL 2019.0 Update 4 Product build 20190411 for 32-bit Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Win 2.70GHz sequential
MKL_VERBOSE DGESVD(A,A,103,103,01045D40,103,0105A900,0105ACC0,103,0106F8C0,103,004FF604,-1,0) 116.49us CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
MKL_VERBOSE DGESVD(A,A,103,103,010B5180,103,0105A900,010C9D80,103,010DE980,103,010AE000,3605,0) 4.64ms CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
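In each log above, the first DGESVD call passes lwork = -1, which is the LAPACK workspace query, and the second call (lwork = 3605) performs the actual decomposition. The sketch below pulls the decomposition time out of such logs and compares the two builds. The regular expression is an assumption based only on the lines shown in this thread, not a documented format, and `parse_calls` and `compute_time_ms` are hypothetical helper names.

```python
import re

# Assumed shape of a call record: "MKL_VERBOSE NAME(args) <time><unit> ...".
CALL_RE = re.compile(r"MKL_VERBOSE\s+(\w+)\(([^)]*)\)\s+([\d.]+)(us|ms|s)\b")
TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}

LOG_64 = """\
MKL_VERBOSE DGESVD(A,A,103,103,0000000000FE5E40,103,0000000000FFAA40,0000000000FFAE00,103,000000000100FA00,103,0000000000CFF520,-1,0) 147.89us CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
MKL_VERBOSE DGESVD(A,A,103,103,0000000001055300,103,0000000000FFAA40,0000000001069F80,103,000000000107EB80,103,000000000104E200,3605,0) 112.26ms CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
"""

LOG_32 = """\
MKL_VERBOSE DGESVD(A,A,103,103,01045D40,103,0105A900,0105ACC0,103,0106F8C0,103,004FF604,-1,0) 116.49us CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
MKL_VERBOSE DGESVD(A,A,103,103,010B5180,103,0105A900,010C9D80,103,010DE980,103,010AE000,3605,0) 4.64ms CNR:AVX2 Dyn:1 FastMM:1 TID:0 NThr:1
"""


def parse_calls(log_text):
    """Return (routine, argument list, duration in ms) for each call record."""
    calls = []
    for name, args, value, unit in CALL_RE.findall(log_text):
        calls.append((name, args.split(","), float(value) * TO_MS[unit]))
    return calls


def compute_time_ms(log_text):
    """Duration of the first DGESVD call that is not a workspace query."""
    for name, args, ms in parse_calls(log_text):
        # lwork is the second-to-last DGESVD argument; -1 means "query only".
        if name == "DGESVD" and args[-2] != "-1":
            return ms
    raise ValueError("no non-query DGESVD call found")


ratio = compute_time_ms(LOG_64) / compute_time_ms(LOG_32)
print(f"64-bit/32-bit DGESVD time: {ratio:.1f}x")
```

For the two logs above this gives a ratio of roughly 24x, while the workspace queries take about the same time in both builds, so the slowdown is in the decomposition itself.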
Regards,
Guillaume A.
Yes, I see roughly the same performance problem when linking with the mkl_sequential library. The gap is about 15x for these specific problem sizes.
32-bit: [ PERF --> ] 0.004 clock for 1 iteration
64-bit: [ PERF --> ] 0.062 clock for 1 iteration
The ratio is about 15x.
However, there is no problem when linking with the threaded version of MKL (2019.4).
If optimization for these specific problem sizes and the ia32 version of MKL is important to you, could you please submit a request to the Intel Online Service Center so that it can be followed up internally?
Intel Online Service Center: https://supporttickets.intel.com/?lang=en-US
OK, thanks. Here is the ticket: 04232883
Please note that this behaviour can be observed for many other matrix sizes: 160x160, 200x200, 302x302, ...
I have attached an Excel file comparing the 32-bit and 64-bit SVD durations for matrices extracted from real production use cases.
My point of view: there is a performance issue with the SVD in 64-bit with AVX2. We do not need optimizations for specific problem sizes. We just need performance in 64-bit to be as good as in 32-bit, regardless of the matrix size ;)
Regards,
Guillaume A.
Hi,
The issue has been fixed in the latest MKL release (2020) :)
Thanks,
Guillaume A.
Thanks, Guillaume!