Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

slasrt2 performance

seyedalireza_y_
Beginner

Hi, I have a question about the performance of the slasrt2 function. I'm profiling with VTune and seeing the results below, which show that my code is nowhere near peak FPU utilization. I was wondering about the percentages attributed to the slasrt2 routines: is there a switch or environment variable I need to set to override MKL's default settings and get higher FPU utilization?

 

[Attached image: fpu.png]

Zhen_Z_Intel
Employee

Hi,

The function ?lasrt2 is a sorting routine for numbers. It uses quicksort, falling back to insertion sort for small inputs (n < 20). Most of the work in a sort is moving data between memory and registers; only the comparisons of floating-point values use the FPU. Insertion sort is O(n^2) and the number of move operations cannot be reduced, while quicksort ranges from O(n log n) to O(n^2). Unlike a multiplication-heavy routine, ?lasrt2 therefore cannot be expected to show high FPU utilization.
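
To make the compare-versus-move balance concrete, here is a minimal Python sketch of a plain insertion sort that counts both kinds of operation. It is only an illustration, not MKL's implementation; the function name `insertion_sort_counted` is hypothetical.

```python
def insertion_sort_counted(values):
    """Sort `values` ascending in place.

    Returns (comparisons, moves): the number of floating-point comparisons
    performed and the number of element stores into the list.
    """
    comparisons = 0
    moves = 0
    for i in range(1, len(values)):
        current = values[i]
        j = i - 1
        while j >= 0:
            comparisons += 1            # the only floating-point operation
            if values[j] <= current:
                break
            values[j + 1] = values[j]   # shift the larger element one slot right
            moves += 1
            j -= 1
        values[j + 1] = current         # drop the element into its final slot
        moves += 1
    return comparisons, moves


# Worst case for insertion sort: 16 values in descending order.
data = [float(x) for x in range(16, 0, -1)]
print(insertion_sort_counted(data))
```

Every comparison here is paired with at least one store, and none of the work is floating-point arithmetic such as a multiply or an add, so a profile of a routine like this would be expected to show mostly data movement.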

My advice is to look at the assembly code in VTune. If it is indeed dominated by mov instructions, it is entirely normal that you do not reach peak FPU utilization.

Best regards,
Fiona
