only getting 50% throughput CPU-wise when using FEAST's eigensolver. i am aware that hyperthreading is merely emulation, but isn't it possible to get the solver to still exploit all available cores instead of just the "true" number?
i tried OMP_threads and KMP_threads to no avail.
hoping you guys can fix this in a future revision?
What do you mean by "50% throughput CPU-wise"? How is it measured?
The number of cores reported by the OS when hyperthreading is enabled is not the true number of physical cores. It's typically 2x the true number of physical cores. For CPU-bound computation such as FEAST, hyperthreading usually doesn't help and you'd better disable it.
By default, MKL functions spawn no more threads than MKL judges necessary, so the actual thread count can be lower than the value specified by OMP_NUM_THREADS. To force MKL to always respect the OMP_NUM_THREADS setting, disable dynamic thread adjustment by setting the environment variable MKL_DYNAMIC=FALSE.
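For anyone who wants to try the suggestion above without touching their shell profile, here is a minimal sketch in Python that builds the environment and hands it to the program. The helper name `mkl_thread_env` and the executable path `./feast_app` are made up for illustration. `MKL_NUM_THREADS` is not mentioned above; it is MKL's own thread-count variable and is set here alongside `OMP_NUM_THREADS` as an assumption that setting both does no harm. The value `FALSE` for `MKL_DYNAMIC` follows the spelling used in Intel's documentation.

```python
import os
import subprocess


def mkl_thread_env(num_threads, base_env=None):
    """Return a copy of the environment with MKL threading pinned.

    num_threads: the thread count MKL should be asked to use.
    base_env:    environment to start from (defaults to os.environ).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["MKL_NUM_THREADS"] = str(num_threads)  # MKL-specific thread count
    env["MKL_DYNAMIC"] = "FALSE"               # disable dynamic adjustment
    return env


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs, i.e. it includes hyperthreads.
    env = mkl_thread_env(os.cpu_count() or 1)
    for name in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "MKL_DYNAMIC"):
        print(name, "=", env[name])
    # "./feast_app" is a hypothetical executable name; uncomment to launch it.
    # subprocess.run(["./feast_app"], env=env, check=True)
```

Whether this actually changes the number of threads FEAST uses would still have to be checked with a monitor while the solver is running.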
guess who's bizzack.
i need my black belt status still.
i noticed that with mkl_sparse_d_ev, i get full thread utilisation, whereas for dfeast_scsrev, i do not.
as i said below (hopefully the techs can merge that account with my old one that they finally restored), dfeast_scsrev seems to only use half the number of threads on a dual socket mac pro (5,1).
someone told me this had to do with the NUMA v non-NUMA configurations, but that wouldn't explain why mkl_sparse_d_ev can use all of the cores across the two processors.
i'm using mac so i use istat pro when i run the program.
for every part of my program, i get 100% CPU usage (24 threads for 12 cores) without problem, except for when i run the feast eigensolver.
the only issue is that for the FEAST eigensolver, it seems to only use one of my processors, because i only see 50% CPU usage. this is independent of what monitoring software i use. i know it's limited to the eigensolver because every other part of my program is threaded and MKL handles it properly.
to me, it sounds like the eigensolver is leaving one of the two processors unused. are dual processor configurations handled differently for the eigensolver?
am i being more clear with what i'm asking now? sorry for any confusion.
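A note on reading the 50% figure, as a sketch rather than a diagnosis: on a machine with 12 physical cores and 24 logical CPUs, a monitor that counts logical CPUs shows the same percentage in two different situations. The numbers are the ones given in this thread; the two scenarios are assumptions, and nothing here confirms which one FEAST is actually in.

```python
def utilisation(active_threads, logical_cpus):
    """Fraction of logical CPUs that are busy, as a CPU monitor would report it."""
    return active_threads / logical_cpus


LOGICAL_CPUS = 24    # 2 sockets x 6 cores x 2 hyperthreads
PHYSICAL_CORES = 12  # 2 sockets x 6 cores

# Scenario A (assumed): one thread per physical core, spread over both sockets.
one_per_core = utilisation(PHYSICAL_CORES, LOGICAL_CPUS)

# Scenario B (assumed): one socket fully loaded (6 cores x 2 hyperthreads),
# the other socket idle.
one_socket = utilisation(6 * 2, LOGICAL_CPUS)

print(one_per_core)  # 0.5
print(one_socket)    # 0.5
```

Since both cases read as 50%, the overall percentage alone cannot tell them apart. A per-CPU view in the monitor would show whether the busy threads sit on one socket or are spread across both.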