I want to use mkl_jit_create_cgemm on my setup, where each thread is pinned to a single core.
In each thread I'll run the cgemm with the created jitter.
Do I need to create a separate JIT kernel for each thread, or can I create just one and use it from all calling threads?
You can call mkl_jit_create_cgemm once and use it in all calling threads.
Generating the JIT kernel with this function incurs runtime overhead, so you will only see a net performance improvement if the kernel is called many times, on the order of several hundred calls or more.