MKL’s vdsin is slower than the intrinsic sin?

CRquantum · ‎04-04-2022

I use Intel oneAPI 2022.0.3 and link against the Intel MKL cluster library.

I have a simple program, shown below:

!include "_rms.fi"
program avx
implicit none
!include "mkl_vml.f90"
integer, parameter :: dp = kind(0.d0)
real(dp) :: t1, t2, r

call cpu_time(t1)
r = f(100000000)
call cpu_time(t2)

print *, "Time", t2-t1
print *, r

contains

    real(dp) function f(N) result(r)
    integer, intent(in) :: N
    integer :: i
    real(dp) :: j(N)   
    call vdsin(N,(dble([(i,i=1,N)])),j)  
    r = sum(j)   
    !r = sum(sin(dble([(i,i=1,N)])))
    return
    end function

end program

I found that using MKL's vdsin is 2 times slower than simply using sin.

I mean, if I do

r = sum(sin(dble([(i,i=1,N)])))

it is two times faster than using MKL’s vdsin as below

    call vdsin(N,(dble([(i,i=1,N)])),j)
    r = sum(j)
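Both variants above time the construction of the input array together with the sine evaluation, and cpu_time may report CPU time summed over all threads rather than elapsed time, which matters if the library call is threaded. A minimal sketch that separates the two phases and uses wall-clock time could look like the following. The program name, the smaller n, and the use of system_clock are assumptions for illustration and are not taken from the original post; the vdsin call is left as a comment so the sketch compiles without MKL.

```fortran
program time_sin_phases
    ! Sketch only: times input construction and sin evaluation separately.
    use, intrinsic :: iso_fortran_env, only: dp => real64, int64
    implicit none
    integer, parameter :: n = 10000000   ! assumption: smaller than the post's 100000000
    real(dp), allocatable :: x(:), y(:)
    integer(int64) :: c0, c1, c2, rate
    integer :: i

    allocate(x(n), y(n))                 ! heap arrays instead of large stack temporaries

    call system_clock(c0, rate)          ! wall-clock ticks and ticks per second
    do i = 1, n
        x(i) = real(i, dp)               ! phase 1: materialize the input array
    end do
    call system_clock(c1)

    y = sin(x)                           ! phase 2: intrinsic sin on the stored array
    ! With MKL linked, the call under test would replace the line above:
    ! call vdsin(n, x, y)
    call system_clock(c2)

    print *, "build input (s):", real(c1 - c0, dp) / real(rate, dp)
    print *, "sin         (s):", real(c2 - c1, dp) / real(rate, dp)
    print *, "sum:", sum(y)              ! use the result so the work is not optimized away
end program time_sin_phases
```

Timing the phases separately shows how much of the measured difference comes from building and storing the arrays rather than from the sine evaluation itself.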

Does anyone know why MKL's vdsin is slower than the intrinsic sin? Or how to correctly use MKL's vdsin or MKL in general?

Thanks very much in advance!

PS.

A similar post is here:

https://fortran-lang.discourse.group/t/why-mkls-vdsin-is-slower-than-the-intrinsic-sin/3108/3

Barbara_P_Intel · ‎04-04-2022

I'm moving this over to the MKL Forum. They will be able to help you better.

CRquantum · ‎04-04-2022

OK, thank you Barbara.

ShanmukhS_Intel · ‎04-06-2022

Hi,

Thank you for posting on Intel Communities.

>>Does anyone know why MKL's vdsin is slower than the intrinsic sin?

We recommend compiling and running the code from the oneAPI command prompt and checking whether the issue persists.

In addition, we are hitting an unhandled exception when running the shared code (screenshot attached for your reference). Could you please share the Visual Studio project file of the sample project you are using? That would help us analyze the issue.

Best Regards,

Shanmukh.SS

CRquantum · ‎04-06-2022

Thanks Shanmukh.SS.

The file is attached. Please unzip it, open the .sln file, then build and run.

There are two functions: f uses the intrinsic sin function, and f2 uses MKL's vdsin.

It seems that function f is about 2x faster than f2 (the vdsin version), at least on my laptop, a ThinkPad P72 with a Xeon-2186m.

If there is any news, please let me know.

Thank you very much indeed.

Best regards,

Rong

ShanmukhS_Intel · ‎04-14-2022

Hi,

Thanks for sharing the project file and necessary steps. We are working on your issue internally. We will get back to you soon with an update.

Best Regards,

Shanmukh.SS

ShanmukhS_Intel · ‎04-18-2022

Hi,

Could you please try running the code again with all optimizations turned off and check whether vdsin performance improves? We saw a performance increase in vdsin after disabling optimizations.

Best Regards,

Shanmukh.SS

ShanmukhS_Intel · ‎04-25-2022

Hi,

Reminder:

Has the solution provided helped? Is your issue resolved? Please let us know whether we can close this thread on our end.

Best Regards,

Shanmukh.SS

ShanmukhS_Intel · ‎05-02-2022

Hi,

We assume that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

Best Regards,

Shanmukh.SS