I cannot get multi-threading to work for the following functions: cblas_daxpy, cblas_dscal, and cblas_dgemv. I have vectors and matrices with n = 8000, but it looks like multi-threading is not being used; I arrived at this conclusion by watching the system monitor. I am linking with "-L/opt/intel/mkl/10.0.2.018/lib/32 -lmkl -liomp5 -lpthread -lm".
I am using MKL 10.0.2.018
Fedora 8 (32 bit)
g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
Intel Core2 CPU T5600 @ 1.83GHz (Mac Mini)
mkl_domain_get_max_threads(MKL_BLAS) returns 2
mkl_get_dynamic() returns 1
mkl_get_max_threads() returns 2
mkl_set_dynamic(0) does not seem to force using 2 threads.
Is there anything else I should be doing?
Thanks.
Level 1 BLAS functions, including ?axpy and ?scal, generally aren't threaded, because vectorization is usually the only effective parallelization at that level; a run-time check on problem size would hurt performance on shorter loops. Even ?gemv usually gets more benefit from threading at the level of your own code, if that is possible, so that each thread runs an independent instance of the function.
Since both cores share cache, your inner loops may be long enough to benefit from threading, and you can thread them yourself with a more recent compiler that supports OpenMP as well as auto-vectorization, such as g++ 4.3 or icpc.