Intel® oneAPI Math Kernel Library

lapack function cpbtrs slower in mkl 18.0 vs 14.0

gn164
Beginner

Hi,

I am seeing a slowdown of the cpbtrs function in MKL 18.0 compared with MKL 14.0. My system is a Xeon E3-1240 v3.

The following (single-threaded) code seems to run more than 2x slower with 18.0:

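         program cpbtrs_bench
         implicit none
         integer :: n, nbd, ldb, niter, iter, status
         complex, allocatable :: a(:,:), b(:,:)

         niter = 100000
         n   = 60          ! matrix order
         nbd = 11          ! leading dimension of the band storage (kd + 1)
         ldb = 181         ! leading dimension of b, must be >= n

         allocate(a(nbd*n,niter))
         allocate(b(ldb,niter))

         a = cmplx(0.1,0.1)
         b = cmplx(0.5,0.5)

         ! one band solve per iteration, each on its own column of a and b
         do iter = 1 ,niter
            call CPBTRS('U', n, nbd-1, 1, a(:,iter), nbd, b(:,iter), ldb, status)
         enddo
         end program cpbtrs_bench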

The linking command that I used with ifort 18.0:

$INTEL_HOME/ifort -I$MKL_HOME/include/ cpbtrs.f90 \
    -Wl,--start-group -Wl,-Bstatic -L$MKL_HOME_LIB/lib \
    -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack95_lp64 -liomp5 -Wl,--end-group

Gennady_F_Intel
Moderator

You are linking with the threaded version of MKL 2018. Did you set MKL_NUM_THREADS=1 to run this test in single-threaded mode?

Is that Linux or Windows? Regarding MKL 14: there is no such version. Could you look in mkl.h and give us the exact version of MKL?

Something like this:

#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 2
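
For anyone who wants to script this check, here is a minimal shell sketch. It assumes the three version macros appear as plain `#define` lines in the header you point it at. The helper name `mkl_version_from_header` is hypothetical, and `$MKL_HOME` is just the variable used in the link command above. In some releases the macros may sit in a header that mkl.h includes rather than in mkl.h itself, so the path may need adjusting.

```shell
# Hypothetical helper (not part of MKL): print major.minor.update
# from the version macros found in the given header file.
mkl_version_from_header() {
    awk '
        $1 == "#define" && $2 == "__INTEL_MKL__"        { major = $3 }
        $1 == "#define" && $2 == "__INTEL_MKL_MINOR__"  { minor = $3 }
        $1 == "#define" && $2 == "__INTEL_MKL_UPDATE__" { update = $3 }
        END { print major "." minor "." update }
    ' "$1"
}

# Example call (path is an assumption, adjust to your install):
#   mkl_version_from_header "$MKL_HOME/include/mkl.h"
```

Running it against the headers of both installations gives two version strings that can be pasted straight into a report.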
gn164
Beginner

Greetings Gennady,

This is Linux. I have set MKL_NUM_THREADS to 1, but I see no difference in the timings.
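
One way to make sure the thread setting really reaches the process is to pin it for a single run instead of relying on the surrounding shell. The sketch below assumes a POSIX shell. `run_single_threaded` is a hypothetical helper name, and `./a.out` stands in for the test binary. Linking `-lmkl_sequential` in place of `-lmkl_intel_thread -liomp5` should be another way to take the threading layer out of the comparison.

```shell
# Hypothetical wrapper: run a command with MKL and OpenMP limited to one
# thread, without changing the environment of the calling shell.
run_single_threaded() {
    MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 "$@"
}

# Example call (binary name is an assumption):
#   time run_single_threaded ./a.out
```

Because the variables are set only for the wrapped command, the same shell session can run the threaded and single-threaded cases back to back.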

The versions I am comparing are:

#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 1
#define __INTEL_MKL_UPDATE__ 1

#define __INTEL_MKL__ 2018
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 2

In case it helps, here are the profiles of the test program linked against each version:

 mkl 11.1.1

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
40.28      0.29     0.29        1   290.00   290.00  MAIN__
26.39      0.48     0.19                             mkl_blas_avx_ctbsv_vial1
16.67      0.60     0.12                             mkl_blas_avx_xcdotc
12.50      0.69     0.09                             mkl_blas_avx_xcaxpy_a
 1.39      0.70     0.01                             mkl_blas_ctbsv
 1.39      0.71     0.01                             mkl_lapack_cpbtrs
 1.39      0.72     0.01                             mkl_lapack_ilaenv

 mkl 2018

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
14.29      0.28     0.28        1   280.00   280.00  MAIN__
13.27      0.54     0.26                             mkl_blas_avx_cgemm_pst
  8.67      0.71     0.17                             mkl_lapack_xcpbtrs
  8.16      0.87     0.16                             mkl_blas_avx_ctrmv_in
  6.63      1.00     0.13                             mkl_blas_avx_ctrsv_ucn
  5.61      1.11     0.11                             mkl_blas_avx_ctrsv_unn
  5.10      1.21     0.10                             mkl_blas_avx_xcaxpy
  5.10      1.31     0.10                             mkl_lapack_ilaenv
  4.85      1.41     0.10                             mkl_blas_avx_ctrsv
  4.08      1.49     0.08                             mkl_blas_avx_xscopy
  3.32      1.55     0.07                             mkl_blas_avx_xctrmv
  2.55      1.60     0.05                             mkl_blas_ctrsv
  2.04      1.64     0.04                             mkl_blas_cgemm
  2.04      1.68     0.04                             mkl_blas_cgemm_omp_driver_v1
  1.53      1.71     0.03                             mkl_blas_xctrmv
  1.28      1.74     0.03                             mkl_blas_avx_xccopy
  1.28      1.76     0.03                             mkl_blas_xcgemm
  1.02      1.78     0.02                             LY16_A16_j2_i8gas_1
  1.02      1.80     0.02                             mkl_blas_avx_xcgemm
  1.02      1.82     0.02                             mkl_blas_cgemm_host
  1.02      1.84     0.02                             mkl_serv_cbwr_get
  0.51      1.85     0.01                             LY16_A16_j2gas_1
  0.51      1.86     0.01                             Lend_Y16_A16_j2gas_1
  0.51      1.87     0.01                             mkl_blas_avx_cgemm_get_optimal_kernel
  0.51      1.88     0.01                             mkl_blas_avx_cgemm_zero_desc
  0.51      1.89     0.01                             mkl_blas_avx_cgemv_n_even
  0.51      1.90     0.01                             mkl_blas_avx_xcgemv
  0.51      1.91     0.01                             mkl_blas_cgemv
  0.51      1.92     0.01                             mkl_blas_ctrmv
  0.51      1.93     0.01                             mkl_blas_xcgemv
  0.51      1.94     0.01                             mkl_lapack_cpbtrs
  0.26      1.95     0.01                             mkl_blas_avx_xctrmv_in_thread
  0.26      1.95     0.01                             mkl_blas_get_kernel_api_version
  0.26      1.96     0.01                             mkl_serv_get_num_stripes
  0.26      1.96     0.01                             mkl_serv_omp_in_parallel

 

gn164
Beginner

Hi,

As a follow-up, a similar slowdown can be observed in the cpbtrf function in MKL 18.0.

Gennady_F_Intel
Moderator

We have confirmed this issue and escalated it. The thread will be updated as soon as possible.

Gennady_F_Intel
Moderator

The fix for this problem is available in the latest MKL 2019 Update 1, which was released recently. Could you please try this update and let us know how it works on your side?

gn164
Beginner

Greetings Gennady,

Thank you for the fix.

Do you know whether any other MKL functions (LAPACK or non-LAPACK) are slower in MKL 18.0 and could be affected by the fix made in MKL 2019 Update 1?

Gennady_F_Intel
Moderator

In addition to this routine, some performance degradation in MKL PARDISO has been fixed in MKL 2019.
