For our research we want to compare the Intel MKL sparse matrix-vector multiplication "mkl_dcsrmv" on a Pentium 4 2.4 GHz and on one core of the Woodcrest processor.
I read in the documentation that MKL uses OpenMP for threading. However, even when I set OMP_NUM_THREADS to 1 and MKL_SERIAL=yes, I still see in VTune that this function spreads across multiple logical processors on the Woodcrest processor.
The function "mkl_dcsrmv" translates to "mkl_spblas_p4_dcsrmmsysm" on the Pentium 4 and to "L__mlk_spblas_p4m_dcsrmmsym_304__par_loop0" on the Woodcrest.
Even though this last function is called in a single thread, why does it spread out to multiple processors? Can I deactivate this?
1 Reply
I am guessing that you are running a single thread, but it is moving around among the cores. You could boot your OS with 1 processor, or you could use the mechanism your OS provides for keeping the job on a specified core (the affinity checkboxes in Windows Task Manager, or taskset or numactl on Linux, depending on your OS version). There is a good chance it will perform better when restricted to a single core, if it has that core to itself. There is also a chance of getting more repeatable VTune results this way.
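As a rough illustration of the affinity idea on Linux, here is a minimal sketch that pins the current process to one logical CPU from inside the program, instead of launching it through taskset. It assumes Linux and Python 3, since `os.sched_setaffinity` is not available on Windows or macOS. The helper name `pin_to_core` is made up for this example. The same effect for an existing benchmark binary would normally come from taskset or numactl on the command line.

```python
import os


def pin_to_core(core):
    """Restrict the calling process to one logical CPU (Linux only).

    `pin_to_core` is a hypothetical helper for illustration; it is not
    part of MKL or VTune.
    """
    allowed = os.sched_getaffinity(0)  # CPUs this process may currently use
    if core not in allowed:
        raise ValueError(f"CPU {core} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core})    # pid 0 means the calling process
    return os.sched_getaffinity(0)     # read the mask back to confirm


if __name__ == "__main__":
    # Pick the lowest-numbered CPU we are allowed to run on.
    target = min(os.sched_getaffinity(0))
    print("now restricted to:", sorted(pin_to_core(target)))
```

Once the mask is set, the scheduler should no longer migrate the thread between cores, so a profile of a single-threaded run should show the work on one logical processor only.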