
simatcopy vs. somatcopy performance

JoaoAlves95

Good Afternoon,

I've noticed that mkl_simatcopy outperforms mkl_somatcopy for n x n square matrices. I measured only the execution time of the simatcopy/somatcopy call itself. The functions were called with the following parameters:

  mkl_simatcopy('R'  /* row-major ordering */,
                'T'  /* A will be transposed */,
                n    /* rows */,
                n    /* cols */,
                1.0f /* alpha: scales the input matrix */,
                src  /* source matrix, overwritten in place */,
                n    /* src_stride */,
                n    /* dst_stride */);

  mkl_somatcopy('R'  /* row-major ordering */,
                'T'  /* A will be transposed */,
                n    /* rows */,
                n    /* cols */,
                1.0f /* alpha: scales the input matrix */,
                src  /* source matrix */,
                n    /* src_stride */,
                dst  /* destination matrix */,
                n    /* dst_stride */);

 

From what I understand, in-place matrix transposition should be less efficient than its out-of-place counterpart. Does this not hold for square matrices?
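To make the comparison concrete, here is a minimal sketch of the two access patterns I have in mind, written as naive loops. It is only an illustration under my own assumptions, not how MKL implements either routine, and the function names are hypothetical. For a square matrix the in-place version reduces to swapping elements across the diagonal: it needs no scratch space and touches n*n floats, whereas the out-of-place version reads n*n floats and writes another n*n.

```c
#include <stddef.h>

/* Hypothetical helper (not an MKL function): naive in-place transpose
 * of an n x n row-major matrix. Each element above the diagonal is
 * swapped with its mirror below the diagonal. */
static void transpose_inplace_square(float *a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            float tmp = a[i * n + j];
            a[i * n + j] = a[j * n + i];
            a[j * n + i] = tmp;
        }
    }
}

/* Hypothetical helper (not an MKL function): naive out-of-place
 * transpose of an n x n row-major matrix, dst[j][i] = src[i][j].
 * src and dst must not overlap. */
static void transpose_outofplace(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            dst[j * n + i] = src[i * n + j];
        }
    }
}
```

With these loops the in-place variant has half the memory footprint for square inputs, which may be part of why simatcopy comes out ahead in my measurements, but I would like confirmation of what MKL actually does.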

I would also appreciate any insight into the memory complexity of simatcopy and the optimization techniques used in this function.

 

Best Regards,

João Alves

 
