I converted a single-layer CNN model with ~3k parameters and a single-layer fully connected model with ~35k parameters from TensorFlow SavedModel to OpenVINO IR format using the commands:
mo --saved_model_dir dummy_cnn_model/
mo --saved_model_dir dummy_dnn_model/
When tested with the OpenVINO benchmarking tool, the CNN model has an average latency of about 65 ms:
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 81.50 ms
[Step 11/11] Dumping statistics report
Count: 3700 iterations
Duration: 60152.08 ms
Latency:
Median: 58.95 ms
AVG: 64.50 ms
MIN: 30.26 ms
MAX: 275.16 ms
The fully connected model, tested with the same OpenVINO benchmarking tool, has a median latency of 0.06 ms:
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 1.13 ms
[Step 11/11] Dumping statistics report
Count: 2178004 iterations
Duration: 60000.10 ms
Latency:
Median: 0.06 ms
AVG: 0.07 ms
MIN: 0.03 ms
MAX: 19.31 ms
I understand that the convolution operation is quite different from dense matrix multiplication, but this comparison shows a difference in speed of three orders of magnitude: the dense single-layer model runs ~1000x faster than the single-layer CNN model.
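Part of the gap can be explained by compute cost rather than parameter count alone: a conv layer reuses its weights at every spatial position, while a dense layer uses each weight exactly once per sample. A back-of-envelope sketch (the input size and layer shapes below are illustrative assumptions, not the actual dummy models from this thread):

```python
# Rough MAC (multiply-accumulate) counts showing why a small conv layer can
# cost far more compute than a much larger fully connected layer.
# Shapes here are assumptions for illustration only.

def conv2d_macs(h, w, c_in, c_out, k):
    """MACs for a stride-1, same-padded KxK conv over an HxWxC_in input."""
    params = k * k * c_in * c_out          # weight count (bias ignored)
    return params * h * w, params          # weights are reused at every pixel

def dense_macs(n_in, n_out):
    """MACs for a fully connected layer: each weight is used exactly once."""
    params = n_in * n_out
    return params, params

conv_cost, conv_params = conv2d_macs(224, 224, 3, 32, 3)   # ~0.9k params
fc_cost, fc_params = dense_macs(784, 44)                   # ~34.5k params

print(f"conv: {conv_params} params -> {conv_cost:,} MACs")
print(f"fc:   {fc_params} params -> {fc_cost:,} MACs")
print(f"conv/fc compute ratio: {conv_cost / fc_cost:.0f}x")
```

So a conv layer with fewer parameters than the dense layer can still perform roughly three orders of magnitude more arithmetic per inference, which is in the same ballpark as the latency gap above (though memory layout and kernel efficiency also play a role).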
The main reason I started this benchmark was to find out why my TensorFlow DeepLab CNN OpenVINO IR model was running at 5 FPS when a fully connected model ran significantly faster.
Could anyone please explain whether this is due to TensorFlow? A yolov5-medium model in PyTorch that is similar in size to my TensorFlow DeepLab model gives a throughput of 25 FPS on the same machine. The specs of my local server are below; I am not running this test in Colab.
System specs:
OS: Ubuntu 18.04.5 LTS x86_64
CPU: Intel Xeon Silver 4208 (32) @ 3.200GHz
GPU: NVIDIA Corporation Device 2206
Memory: 22813MiB / 31850MiB
TensorFlow version: 2.3.1
openvino-dev version: 2022.2.0
Installed the OpenVINO toolkit using the command: pip install openvino-dev[tensorflow2]
Hi Cosmichunt,
Thanks for reaching out to us.
From your description, I understand that you are comparing a TensorFlow DeepLab model (a segmentation model) with a PyTorch YOLOv5 model (an object detection model), and that the TensorFlow DeepLab model runs at only 5 FPS whereas the PyTorch YOLOv5 model runs at 25 FPS.
If this is the case, the obvious difference in throughput is not due to TensorFlow versus PyTorch but to the different model architectures and algorithms. Object detection localizes and classifies objects in the image based on bounding-box coordinates, whereas image segmentation has to detect the exact boundaries of each object, which makes the job heavier and hence slower.
Regards,
Peh
@Peh_Intel But what about the case where a single-layer CNN model built with TensorFlow and converted to OpenVINO runs 10^3 times slower than a fully connected layer?
Regards
Anirudh
Hi Anirudh,
Here is a Stack Overflow discussion on why a convolution layer is slower than a fully connected layer:
Torch: why convolution layer is even slower than full connect linear layer when in same data size
On another note, if possible, please share both of your models (the single-layer CNN model and the single-layer fully connected model) so that we can validate on our side as well.
Regards,
Peh
I finally found a change that improved the performance. TensorFlow has tf.saved_model.save() and Keras has model.save(). When the model is saved using tf.saved_model.save(), inference with the single-CNN-layer model goes up to 5000 FPS. But when I saved the model using the model.save() method, the FPS drops to 7 for the same single-layer CNN model.
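For anyone following along, the two save paths can be sketched as below. The model definition is a hypothetical stand-in for the single-conv-layer model in this thread (the layer sizes are assumptions), and the behavior described matches TF 2.x with Keras 2, as used here:

```python
import os
import tempfile
import tensorflow as tf

# Hypothetical single-conv-layer model, standing in for the thread's dummy CNN.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(16, 3),
])

out = tempfile.mkdtemp()
tf_path = os.path.join(out, "dummy_cnn_tf")

# Path 1: the low-level SavedModel exporter (the fast path after IR conversion).
tf.saved_model.save(model, tf_path)

# Path 2: the Keras high-level save (the slow path in this thread). Under
# TF 2.x with Keras 2, calling model.save() on a directory path also writes
# SavedModel format:
#   model.save(os.path.join(out, "dummy_cnn_keras"))
# (In newer Keras 3, model.export() is the SavedModel export instead.)

print(os.path.exists(os.path.join(tf_path, "saved_model.pb")))
```

Either directory can then be passed to mo via --saved_model_dir; the two exporters can embed different graph structures (e.g. training-related ops) around the same layers, which is a plausible source of the FPS gap.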
Thanks for all your help and pointers!
Cheers,
Anirudh
Hi Anirudh,
Thanks for letting us know and sharing the information here.
This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.
Regards,
Peh