I'm running a Windows 10 Enterprise 64-bit machine with two Xeon Gold 6230 CPUs (20 physical cores each) and Anaconda Python 3.8.8 (64-bit). I installed the packages with
conda install tensorflow-mkl keras -c anaconda
I'm using mnist_convnet.py to experiment with configurations with the goal of maximizing usage of both CPUs.
By default, the code uses all cores on a single CPU. I then added `config` to the imports and these lines to the code:
```python
config.threading.set_inter_op_parallelism_threads(0)
config.threading.set_intra_op_parallelism_threads(0)
config.set_soft_device_placement(True)
```
This had no impact. Changing `set_inter_op_parallelism_threads` to 2 (the value I expected would trigger usage of both CPUs) had no effect either. All other settings I tried greatly reduced performance. I have several interrelated questions:
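For context, the variant I expected to engage both sockets looked like this (a sketch, not a recommendation; pairing 2 inter-op threads with the two sockets and 20 intra-op threads with the physical cores per socket is my own assumption):

```python
from tensorflow import config

# Sketch: one inter-op thread-pool slot per socket, and one
# intra-op thread per physical core of a socket (20 on a
# Xeon Gold 6230). Must run before any TensorFlow ops execute.
config.threading.set_inter_op_parallelism_threads(2)
config.threading.set_intra_op_parallelism_threads(20)
config.set_soft_device_placement(True)
```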
1. How can I get tensorflow/keras to use both CPUs?
2. Did I choose a poor example for multi-CPU execution? If so, what is a better example?
3. Despite specifying `tensorflow-mkl` on install, the sanity check fails (result is False). Does that explain this problem? If so, how can I fix it?
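(For reference, the "sanity check" I mean is the commonly posted MKL/oneDNN check; the exact module path is an assumption on my part, since it has moved between TF versions:)

```shell
python -c "from tensorflow.python import _pywrap_util_port; print(_pywrap_util_port.IsMklEnabled())"
```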
Thank you for joining the Intel community
Please allow us some time to research on your question. We will get back to you as soon as we have updates.
Intel Customer Support Technician
To maximize TensorFlow performance on CPU, you can tune settings such as intra_op_/inter_op_parallelism_threads, data layout, KMP_AFFINITY, KMP_BLOCKTIME, OMP_NUM_THREADS, etc. The recommended settings are available in the link below:
Please follow this article for the OpenMP settings.
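As a hedged illustration of those settings (the variable names follow the guidance above, but the values below are only examples and are workload-dependent, not tuned recommendations), they can be set from Python before TensorFlow is imported:

```python
import os

# MKL/OpenMP tuning variables must be set before TensorFlow is
# imported, or they will not take effect. Values are illustrative.
os.environ["OMP_NUM_THREADS"] = "20"   # e.g. physical cores per socket
os.environ["KMP_BLOCKTIME"] = "0"      # ms a worker spin-waits after finishing a task
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"
os.environ["KMP_SETTINGS"] = "1"       # print the effective OpenMP settings at startup

# import tensorflow as tf  # import only after the variables are set
```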
Regarding the installation, you could use installation option from the below link:
For Windows, you can use either of the commands below, or you can build TensorFlow from source:
conda install tensorflow-mkl
conda install tensorflow-mkl -c anaconda
The steps to build TensorFlow from source are available in the documentation above.
We are checking on your other queries internally and will get back to you soon with an update.
Regarding your second query, "Did I choose a poor example for multi-CPU execution? If so, what is a better example?":
You could use the same sample (mnist_convnet.py); it will work fine with multi-threading.
Regarding the sanity check, we are checking from our end and will let you know of any updates soon.
Could you please let us know the version of tensorflow you are using?
`conda list` shows:

```
tensorflow             2.3.0    mkl_py38h37f7ee5_0
tensorflow-base        2.3.0    eigen_py38h75a453f_0
tensorflow-estimator   2.3.0    pyheb71bc4_0    anaconda
tensorflow-mkl         2.3.0    h93d2e19_0
```
Thanks, Athira. Good to know that it wasn't me failing the sanity check.
Would the problem cause config settings like this (below) to misbehave?
One of my initial questions was whether I need distributed workers to get both Xeon CPUs on a single machine to share the load, or whether the Xeon platform should do this without distributed workers. I'm hoping distributed workers aren't necessary on a single machine, since I believe that approach is designed for multiple machines on a network and will be slower than two CPUs that already share memory.
Sorry for the delay. Could you please help check your Python version and TensorFlow version?
We investigated the problem and found that:
For Windows with Python 3.8, we get TF v2.3, but oneDNN is not enabled in this binary.
For Windows with Python 3.7, we get TF v2.1, and oneDNN is enabled in this binary.
Therefore, to get the Intel-optimized TF, you may install Python 3.7 and TF 2.1.
Is the new version working? Please feel free to let us know of any further problems.
Further, just for your reference: with Python 3.8, Intel TF 2.4 is ready now, which can be installed by
pip install intel-tensorflow==2.4.0
Hope everything goes well.
It is my pleasure to notify you that the release of Intel® Optimizations for TensorFlow v2.6.0 for Linux and Windows platforms is now available. You are welcome to try the latest version and let us know if there are any issues.
As the issue has been open for a few months, I will go ahead and close it. Please feel free to update us if there is any news.