Intel® oneAPI AI Analytics Toolkit
Find answers to your toolkit installation, configuration, and get-started questions.

TensorFlow numerical risk with OneDNN setting

shuoli
Beginner
305 Views

Hi,

We are excited about the potential performance gain of TensorFlow with the oneDNN setting. However, we are trying to evaluate the risk of numerical differences when using oneDNN.

Does oneDNN guarantee deterministic results for matrix multiplication, convolution, pooling, and batch normalization on floating-point numbers, in both single-threaded and multi-threaded environments? We observed that some other libraries, such as libeigen, produce different matrix multiplication results on Intel Ice Lake CPUs than on earlier CPU generations because of the difference in L1 cache size. Does TensorFlow with the oneDNN setting have the same problem?

Based on https://github.com/oneapi-src/oneDNN/issues/789, it seems oneDNN doesn't support features such as CNR in MKL. Is that still true?

1 Solution
Ying_H_Intel
Employee
231 Views

Hi!

Thank you for sharing your feedback on the potential performance gain of TensorFlow with the oneDNN setting.


Regarding numerical consistency, you are right: it depends on the ISA, the instruction execution order, and the nature of floating-point computation. MKL has environment variables (its CNR settings) that give consistent results on the same machine by fixing the instruction execution order. oneDNN generates its kernels with JIT, so it does not have such a feature yet.
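To illustrate why execution order matters, here is a minimal sketch in plain Python. It is not oneDNN or MKL code; the helper names `sequential_sum` and `chunked_sum` are made up for this example. It adds the same four numbers once left to right and once in blocks, which loosely mimics how a blocked or multi-threaded reduction regroups the additions.

```python
def sequential_sum(values):
    """Add the values strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk_size):
    """Add the values block by block, then add the partial sums.

    This loosely mimics a reduction that is split across threads
    or vector lanes (hypothetical helper, for illustration only).
    """
    partials = []
    for start in range(0, len(values), chunk_size):
        partials.append(sequential_sum(values[start:start + chunk_size]))
    return sequential_sum(partials)


# Same inputs, different grouping of the additions.
values = [1e17, 1.0, -1e17, 1.0]

left_to_right = sequential_sum(values)
blocked = chunked_sum(values, 2)

print("left to right:", left_to_right)
print("blocks of two:", blocked)
print("identical:", left_to_right == blocked)
```

The two results differ because floating-point addition is not associative: a small term added to a very large one is rounded away, so the grouping of the additions changes the final value. A different ISA, thread count, or cache blocking can change that grouping in a real kernel.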


The only related option, CPU Dispatcher Control (oneDNN v2.7.0 documentation, oneapi-src.github.io), may help somewhat with aligning the instructions used across machines of different generations.
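As a rough sketch of how this could be tried from Python: the oneDNN CPU dispatcher control page describes the `ONEDNN_MAX_CPU_ISA` environment variable, and TensorFlow uses `TF_ENABLE_ONEDNN_OPTS` to turn its oneDNN optimizations on. The helper name `configure_onednn_env` and the choice of `AVX2` are assumptions made for this example. Whether a given TensorFlow build honors the ISA cap should be verified, and capping the ISA is not a guarantee of bitwise-identical results.

```python
import os


def configure_onednn_env(max_cpu_isa="AVX2"):
    """Set oneDNN-related environment variables for this process.

    Hypothetical helper for illustration. Both variables are read by
    the libraries at startup, so call this before importing TensorFlow.
    """
    # Ask TensorFlow to use its oneDNN-optimized kernels.
    os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"
    # Cap the instruction set that oneDNN's JIT may target, so machines
    # of different generations dispatch to the same ISA level.
    os.environ["ONEDNN_MAX_CPU_ISA"] = max_cpu_isa
    return {
        "TF_ENABLE_ONEDNN_OPTS": os.environ["TF_ENABLE_ONEDNN_OPTS"],
        "ONEDNN_MAX_CPU_ISA": os.environ["ONEDNN_MAX_CPU_ISA"],
    }


settings = configure_onednn_env("AVX2")
print(settings)

# Import TensorFlow only after the variables are set, for example:
# import tensorflow as tf
```

Using the same cap on every machine trades some performance on newer CPUs for a more uniform code path. Results can still differ with the number of threads.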


Thanks

Ying H.

Intel AI Support



2 Replies
Rahila_T_Intel
Moderator
241 Views

Hi,


Thank you for posting in Intel Communities.


We are working on this internally and will share updates with you.


Thanks


