Hello, I find that Intel-optimized TensorFlow gives a large speedup in the training phase.
However, I want to run 3 Docker containers on a CPU with 8 physical cores (16 logical cores),
and I give every container 4 logical cores. How should I set intra_/inter_op_parallelism_threads and OMP_NUM_THREADS?
When one container runs alone, training takes 17 s per epoch, but when I run 3 containers at the same time, training takes 50 s per epoch in every container.
By the way, I set intra_/inter_op_parallelism_threads = 2, OMP_NUM_THREADS = 2, and KMP_BLOCKTIME = 1 in each container.
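As a rough illustration, the values above could be applied inside each container along these lines. This is only a sketch under stated assumptions: the helper name `thread_settings` and the rule "one thread per physical core in the container's allotment" are hypothetical and simply reproduce the numbers from this post. It is not a tuning recommendation.

```python
import os

# Values taken from the post; the helper below is hypothetical.
LOGICAL_CPUS_PER_CONTAINER = 4
THREADS_PER_PHYSICAL_CORE = 2  # host has 16 logical cores on 8 physical cores


def thread_settings(logical_cpus, threads_per_core):
    """Return thread counts for a container's CPU allotment.

    Assumption: one thread per physical core in the allotment,
    which yields the value 2 used in the post for 4 logical cores.
    """
    threads = max(1, logical_cpus // threads_per_core)
    return {
        "intra_op_parallelism_threads": threads,
        "inter_op_parallelism_threads": threads,
        "OMP_NUM_THREADS": threads,
        "KMP_BLOCKTIME": 1,
    }


settings = thread_settings(LOGICAL_CPUS_PER_CONTAINER, THREADS_PER_PHYSICAL_CORE)

# The OpenMP runtime reads these variables at start-up, so they
# should be set before TensorFlow is imported in the same process.
os.environ["OMP_NUM_THREADS"] = str(settings["OMP_NUM_THREADS"])
os.environ["KMP_BLOCKTIME"] = str(settings["KMP_BLOCKTIME"])

print(settings)
```

The two op-parallelism values would then be handed to TensorFlow itself, for example through `tf.ConfigProto(intra_op_parallelism_threads=..., inter_op_parallelism_threads=...)` in TensorFlow 1.x or the `tf.config.threading` setters in TensorFlow 2.x.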
Could you please tell me why?
Thanks for reaching out to us.
However, this issue appears to be a duplicate of the one at the provided link.
We are working on your issue. Please refer to the above link for further updates. We are closing this thread.