Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

lapack function cpbtrs slower in mkl 18.0 vs 14.0

gn164
Beginner
335 Views

 

Hi,

I am seeing a slowdown of the cpbtrs function in MKL 18.0 compared with MKL 14.0. My system is a Xeon E3-1240 v3.

The following (single-threaded) code seems to run more than 2x slower with 18.0:

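         program cpbtrs_test
         implicit none
         integer :: niter, n, nbd, ldb, iter, status
         complex, allocatable :: a(:,:), b(:,:)

         niter = 100000
         n   = 60      ! order of each banded matrix
         nbd = 11      ! rows of band storage (kd + 1)
         ldb = 181     ! leading dimension of b (>= n)

         allocate(a(nbd*n,niter))
         allocate(b(ldb,niter))

         a = cmplx(0.1,0.1)
         b = cmplx(0.5,0.5)

         ! One banded solve per iteration: kd = nbd-1, nrhs = 1
         do iter = 1 ,niter
            call CPBTRS('U', n,nbd -1, 1, a(:,iter), nbd, b(:,iter),ldb, status)
         enddo
         end program cpbtrs_test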

The linking command that I used with ifort 18.0:

$INTEL_HOME/ifort -I$MKL_HOME/include/ cpbtrs.f90 \
-Wl,--start-group -Wl,-Bstatic -L$MKL_HOME_LIB/lib \
-lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack95_lp64 -liomp5 -Wl,--end-group

    

         

 

7 Replies
Gennady_F_Intel
Moderator

You are linking with the threaded version of MKL 2018. Did you set MKL_NUM_THREADS=1 to run this test in single-threaded mode?

Is that Linux or Windows? Regarding MKL 14: there is no such version. Could you look in mkl.h and give us the exact version of MKL?

Something like this:

#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 2
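
For anyone who wants to read these three macros out of a header without opening it by hand, here is a minimal Python sketch. The function name `parse_mkl_version` and the `sample` text are hypothetical, introduced only for illustration; the macro names are the ones shown above. It assumes the macros appear as plain `#define NAME <integer>` lines, and depending on the MKL release they may sit in a header that mkl.h includes rather than in mkl.h itself.

```python
import re

# The three version macros quoted in the post above.
MACROS = ("__INTEL_MKL__", "__INTEL_MKL_MINOR__", "__INTEL_MKL_UPDATE__")


def parse_mkl_version(header_text):
    """Return (major, minor, update) parsed from '#define NAME <int>' lines.

    Hypothetical helper: it only scans text and does not call into MKL.
    """
    found = []
    for name in MACROS:
        pattern = r"^\s*#\s*define\s+" + re.escape(name) + r"\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError("macro not found: " + name)
        found.append(int(match.group(1)))
    return tuple(found)


# Sample text copied from the post above; in practice, read the header file.
sample = """
#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 2
"""

print(parse_mkl_version(sample))
```

To use it on a real installation, pass the contents of the header, for example `parse_mkl_version(open(path).read())`, where `path` is wherever the header lives on your system.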

gn164
Beginner

 

Greetings Gennady,

This is Linux. I have set MKL_NUM_THREADS to 1, but I see no difference in the timings.

The versions I am comparing are:

#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 1
#define __INTEL_MKL_UPDATE__ 1

#define __INTEL_MKL__ 2018
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 2

In case it helps, here are the profiles of the test program linked against each version:

 mkl 11.1.1

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name

40.28      0.29     0.29        1   290.00   290.00  MAIN__
26.39      0.48     0.19                             mkl_blas_avx_ctbsv_vial1
16.67      0.60     0.12                             mkl_blas_avx_xcdotc
12.50      0.69     0.09                             mkl_blas_avx_xcaxpy_a
 1.39      0.70     0.01                             mkl_blas_ctbsv
 1.39      0.71     0.01                             mkl_lapack_cpbtrs
 1.39      0.72     0.01                             mkl_lapack_ilaenv

 mkl 2018

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name

14.29      0.28     0.28        1   280.00   280.00  MAIN__
13.27      0.54     0.26                             mkl_blas_avx_cgemm_pst
  8.67      0.71     0.17                             mkl_lapack_xcpbtrs
  8.16      0.87     0.16                             mkl_blas_avx_ctrmv_in
  6.63      1.00     0.13                             mkl_blas_avx_ctrsv_ucn
  5.61      1.11     0.11                             mkl_blas_avx_ctrsv_unn
  5.10      1.21     0.10                             mkl_blas_avx_xcaxpy
  5.10      1.31     0.10                             mkl_lapack_ilaenv
  4.85      1.41     0.10                             mkl_blas_avx_ctrsv
  4.08      1.49     0.08                             mkl_blas_avx_xscopy
  3.32      1.55     0.07                             mkl_blas_avx_xctrmv
  2.55      1.60     0.05                             mkl_blas_ctrsv
  2.04      1.64     0.04                             mkl_blas_cgemm
  2.04      1.68     0.04                             mkl_blas_cgemm_omp_driver_v1
  1.53      1.71     0.03                             mkl_blas_xctrmv
  1.28      1.74     0.03                             mkl_blas_avx_xccopy
  1.28      1.76     0.03                             mkl_blas_xcgemm
  1.02      1.78     0.02                             LY16_A16_j2_i8gas_1
  1.02      1.80     0.02                             mkl_blas_avx_xcgemm
  1.02      1.82     0.02                             mkl_blas_cgemm_host
  1.02      1.84     0.02                             mkl_serv_cbwr_get
  0.51      1.85     0.01                             LY16_A16_j2gas_1
  0.51      1.86     0.01                             Lend_Y16_A16_j2gas_1
  0.51      1.87     0.01                             mkl_blas_avx_cgemm_get_optimal_kernel
  0.51      1.88     0.01                             mkl_blas_avx_cgemm_zero_desc
  0.51      1.89     0.01                             mkl_blas_avx_cgemv_n_even
  0.51      1.90     0.01                             mkl_blas_avx_xcgemv
  0.51      1.91     0.01                             mkl_blas_cgemv
  0.51      1.92     0.01                             mkl_blas_ctrmv
  0.51      1.93     0.01                             mkl_blas_xcgemv
  0.51      1.94     0.01                             mkl_lapack_cpbtrs
  0.26      1.95     0.01                             mkl_blas_avx_xctrmv_in_thread
  0.26      1.95     0.01                             mkl_blas_get_kernel_api_version
  0.26      1.96     0.01                             mkl_serv_get_num_stripes
  0.26      1.96     0.01                             mkl_serv_omp_in_parallel
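
To put a number on the difference between the two profiles, here is a rough Python sketch of the arithmetic. The totals are the last cumulative-seconds entries of each listing (0.72 s and 1.96 s) and the MAIN__ self times (0.29 s and 0.28 s). Subtracting MAIN__ assumes that its self time is mostly setup in the test program rather than library work, which the profile alone does not prove; the variable and function names are illustrative only.

```python
# Seconds read off the two flat profiles above:
# "total" is the final cumulative value, "main" is the MAIN__ self time.
profiles = {
    "mkl 11.1.1": {"total": 0.72, "main": 0.29},
    "mkl 2018": {"total": 1.96, "main": 0.28},
}


def library_time(profile):
    """Time attributed to everything except MAIN__ (assumed to be MKL work)."""
    return round(profile["total"] - profile["main"], 2)


old_lib = library_time(profiles["mkl 11.1.1"])
new_lib = library_time(profiles["mkl 2018"])

# Ratio of whole-program times, and ratio with MAIN__ excluded.
overall_ratio = round(profiles["mkl 2018"]["total"] / profiles["mkl 11.1.1"]["total"], 1)
library_ratio = round(new_lib / old_lib, 1)

print("library seconds:", old_lib, "vs", new_lib)
print("overall slowdown:", overall_ratio)
print("slowdown excluding MAIN__:", library_ratio)
```

By this estimate the whole program is about 2.7x slower, and the time spent outside MAIN__ is about 3.9x higher. The profiler samples in 0.01 s steps, so these ratios are only approximate.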

 

gn164
Beginner

 

Hi,

A follow-up to this: a similar slowdown can be observed in the cpbtrf function in MKL 18.0.

Gennady_F_Intel
Moderator

We have confirmed this issue and escalated it. This thread will be updated as soon as possible.

Gennady_F_Intel
Moderator

The fix for this problem is available in MKL 2019 Update 1, which was released recently. Could you please try this update and let us know how it works on your side?

gn164
Beginner

Greetings Gennady,

Thank you for the fix.

Do you know if there are any other MKL functions (LAPACK or non-LAPACK) that are slower in MKL 18.0 and could be affected by the fix made in MKL 2019 Update 1?

 

Gennady_F_Intel
Moderator

In addition to this routine, a performance degradation in MKL PARDISO has been fixed in MKL 2019.
