
How to optimize TensorFlow 2/Keras on a machine with two Xeon Gold 6230 CPUs?

davideps
Novice

I'm running a Windows 10 Enterprise 64-bit machine with two Xeon Gold 6230 CPUs (20 physical cores each) and Anaconda Python 3.8.8 (64-bit). I installed the packages with:

conda install tensorflow-mkl keras -c anaconda

I'm using mnist_convnet.py to experiment with configurations, with the goal of maximizing usage of both CPUs.

By default, the code uses all cores on a single CPU. I then added "config" to the imports and these lines to the code:

config.threading.set_inter_op_parallelism_threads(0)
config.threading.set_intra_op_parallelism_threads(0)
config.set_soft_device_placement(True)

This had no impact. Changing the argument of set_inter_op_parallelism_threads to 2 (the value I expected to trigger usage of both CPUs) had no effect either. All other settings I tried greatly reduced performance. I have several interrelated questions:

1. How can I get tensorflow/keras to use both CPUs?
2. Did I choose a poor example for multi-CPU execution? If so, what is a better example?
3. Despite specifying `tensorflow-mkl` on install, the sanity check fails (result is False). Does that explain this problem? If so, how can I fix it?
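
For reference, the three configuration calls quoted above can be collected into one helper. This is a minimal sketch, not a confirmed fix for the dual-CPU question: `apply_threading` is a hypothetical name, and the sketch assumes TensorFlow reads these values only when its runtime starts, so the calls need to run before any op or tensor is created. A value of 0 lets TensorFlow pick the thread count itself.

```python
def apply_threading(tf_config, inter_op=0, intra_op=0):
    """Apply thread-pool settings and return the values reported back.

    tf_config is the tensorflow.config module, passed in by the caller
    so that this helper does not import TensorFlow itself.
    """
    threading = tf_config.threading
    # Number of independent ops that may run concurrently (0 = let TF decide).
    threading.set_inter_op_parallelism_threads(inter_op)
    # Number of threads used inside a single op (0 = let TF decide).
    threading.set_intra_op_parallelism_threads(intra_op)
    # Let TF fall back to another device if the requested one is unavailable.
    tf_config.set_soft_device_placement(True)
    # Read the values back so the caller can confirm what was stored.
    return {
        "inter_op": threading.get_inter_op_parallelism_threads(),
        "intra_op": threading.get_intra_op_parallelism_threads(),
    }
```

In a script this would be called once near the top, for example `from tensorflow import config` followed by `print(apply_threading(config, inter_op=2, intra_op=0))`, before the model is built.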

JoseH_Intel
Moderator

Hello davideps,


Thank you for joining the Intel community


Please allow us some time to research your question. We will get back to you as soon as we have an update.


Regards


Jose A.

Intel Customer Support Technician

For firmware updates and troubleshooting tips, visit:

https://intel.com/support/serverbios


davideps
Novice

Hi Jose, thank you for your response. Can you tell me whether anyone else has reported this issue on machines with two chips (any model), and whether you can reproduce the problem with the code I supplied?

AthiraM_Intel
Moderator

Hi,


To maximize TensorFlow performance on CPU, you can tune settings such as intra_op_parallelism_threads, inter_op_parallelism_threads, data layout, KMP_AFFINITY, KMP_BLOCKTIME, and OMP_NUM_THREADS. The recommended settings are described at the link below:

https://software.intel.com/content/www/us/en/develop/articles/maximize-tensorflow-performance-on-cpu-considerations-and-recommendations-for-inference.html

Please follow this article for the OpenMP settings.
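
As a rough illustration of the OpenMP settings named above, the variables can be set from Python before TensorFlow is imported, since the OpenMP runtime reads them when it loads. This is a sketch under assumptions: `openmp_env` is a hypothetical helper, and the values shown (one OpenMP thread per physical core, a short block time, a compact fine-grained affinity) are starting points to tune from, not settings confirmed for this machine.

```python
import os

def openmp_env(physical_cores, blocktime_ms=1):
    """Build the OpenMP-related environment variables as a dict of strings."""
    return {
        # Assumption: one OpenMP thread per physical core.
        "OMP_NUM_THREADS": str(physical_cores),
        # Milliseconds a thread waits after finishing work before sleeping.
        "KMP_BLOCKTIME": str(blocktime_ms),
        # Assumption: pin threads close together, one per core.
        "KMP_AFFINITY": "granularity=fine,compact,1,0",
    }

# Topology taken from the question: 2 sockets x 20 physical cores.
settings = openmp_env(physical_cores=2 * 20)
os.environ.update(settings)  # must happen before `import tensorflow`
print(settings)
```

The variables can equally be set in the shell that launches Python; setting them in the script simply keeps the experiment in one place.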


Regarding installation, you can use one of the installation options from the link below:

https://software.intel.com/content/www/us/en/develop/articles/intel-optimization-for-tensorflow-installation-guide.html


For Windows, you can use either of the commands below, or you can build TensorFlow from source:


conda install tensorflow-mkl

conda install tensorflow-mkl -c anaconda


The steps to build TensorFlow from source are available in the documentation linked above.


We are checking on your other queries internally and will get back to you soon with an update.


Thanks.


AthiraM_Intel
Moderator

Hi,


Regarding your second query, "Did I choose a poor example for multi-CPU execution? If so, what is a better example?":

You can use the same sample (mnist_convnet.py); it works fine with multi-threading.

Regarding the sanity check, we are investigating on our end and will update you soon.

Could you please let us know which version of TensorFlow you are using?


Thanks.


davideps
Novice

Hi Athira,

"conda list" shows:

 

tensorflow            2.3.0  mkl_py38h37f7ee5_0
tensorflow-base       2.3.0  eigen_py38h75a453f_0
tensorflow-estimator  2.3.0  pyheb71bc4_0          anaconda
tensorflow-mkl        2.3.0  h93d2e19_0

AthiraM_Intel
Moderator

Hi,


Regarding the sanity check, we were able to reproduce the issue. We are investigating it internally and will update you soon.
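
For context, the sanity check discussed here asks the installed TensorFlow binary whether it was built with MKL/oneDNN support. A minimal sketch follows. It assumes the private `_pywrap_util_port` module, whose import path has moved between TensorFlow releases, so both locations are tried. `mkl_enabled` is a hypothetical name, and it returns None when TensorFlow or the module cannot be imported.

```python
def mkl_enabled():
    """Return True/False from TensorFlow's MKL flag, or None if it cannot be queried."""
    try:
        # Assumed location in older TensorFlow 2.x releases.
        from tensorflow.python import _pywrap_util_port
    except ImportError:
        try:
            # Assumed location in newer releases.
            from tensorflow.python.util import _pywrap_util_port
        except ImportError:
            return None
    # The module is private, so guard against the function being absent.
    check = getattr(_pywrap_util_port, "IsMklEnabled", None)
    return bool(check()) if check is not None else None

print("MKL enabled:", mkl_enabled())
```

A result of False matches the failure reported in the original question; None only means the check could not be run in the current environment.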


Thanks.


davideps
Novice

Thanks, Athira. Good to know that it wasn't me failing the sanity check.

Would the problem cause config settings like the ones below to misbehave?

config.threading.set_inter_op_parallelism_threads(2)
config.threading.set_intra_op_parallelism_threads(0)

One of my initial questions was whether I need distributed workers to get both Xeon CPUs on a single machine to share the load, or whether the Xeon platform should do this without them. I'm hoping distributed workers aren't necessary on a single machine, since I believe that approach is designed for multiple machines on a network and would be slower than two CPUs that already share memory.

davideps
Novice

Hi Athira. Is there any update on this issue or an estimate of when it might be resolved?

Ying_H_Intel
Employee

Hi David,


Sorry for the delay. Could you please check your Python version and TensorFlow version?

>python

exit()
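
One way to collect both versions without an interactive session, sketched here with only the standard library, is to print the Python version and look up the installed distribution versions. The distribution names `tensorflow` and `intel-tensorflow` are the ones that appear in this thread, and `installed_versions` is a hypothetical helper name.

```python
import sys
from importlib import metadata

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found

# Python version as major.minor.patch.
print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
print(installed_versions(["tensorflow", "intel-tensorflow"]))
```

This reads package metadata only, so it works even when importing TensorFlow itself fails.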


We investigated the problem and found the following:

For Windows with Python 3.8, we got TF v2.3, but oneDNN is not enabled in this binary.

For Windows with Python 3.7, we got TF v2.1, and oneDNN is enabled in this binary.

Therefore, to get the Intel-optimized TF, you can install Python 3.7 and TF 2.1.


Thanks

Ying


davideps
Novice

I'm using Python 3.8 and TF 2.3. I'll downgrade both. Thank you!

Ying_H_Intel
Employee

Hi David,

Does the new version work? Please feel free to let us know if you run into any further problems.

Also, for your reference: with Python 3.8, Intel TF 2.4 is now available and can be installed with:

pip install intel-tensorflow==2.4.0

Reference links:

Install guide: https://software.intel.com/content/www/us/en/develop/articles/intel-optimization-for-tensorflow-installation-guide.html

Performance considerations: https://software.intel.com/content/www/us/en/develop/articles/maximize-tensorflow-performance-on-cpu-considerations-and-recommendations-for-inference.html

Best Regards,

Ying

 

Ying_H_Intel
Employee

Hi David,

I hope everything is going well.

I am pleased to let you know that the release of Intel® Optimizations for TensorFlow v2.6.0 for Linux and Windows platforms is now available. You are welcome to try the latest version and let us know if you encounter any issues.

As this issue has been open for a few months, I am going ahead and closing it. Please feel free to update us if there is any news.

 

Thanks

Ying

