I ran some tests on the stick2. I ran demo_squeezenet_download_convert_run.sh on the stick2 and got this result:
total inference time: 9.0164393.
This is similar to the results in this post: https://ncsforum.movidius.com/discussion/1329/lattepanda-alpha-openvino-cpu-core-m3-vs-ncs1-vs-ncs2-performance-comparison
According to the test, I can achieve about 110 FPS. The total workload of squeezenet1.1 is about 360 million MACs per inference, so counting each MAC as two operations, the throughput is about 79.2 GOPS.
The stick2 is rated at 1 TOPS for neural networks and 4 TOPS of total compute, so this test used less than 10% of that capability.
Could anyone tell me whether this is normal, or whether I missed something?
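The arithmetic behind these numbers can be sketched in Python as follows. This is only a rough check: it assumes the reported inference time is in milliseconds for a single image, that one MAC counts as two operations, and that the 1 TOPS rating is the right peak to compare against. The function and variable names are illustrative only.

```python
# Assumptions (not confirmed by the demo output): the reported time is in
# milliseconds for one image, and one MAC counts as two operations.

def throughput(latency_ms, macs_per_inference, ops_per_mac=2):
    """Return (frames per second, achieved GOPS) for batch size 1."""
    fps = 1000.0 / latency_ms
    gops = fps * macs_per_inference * ops_per_mac / 1e9
    return fps, gops

# Figures quoted in the post: 9.0164393 latency, about 360 million MACs.
fps, gops = throughput(latency_ms=9.0164393, macs_per_inference=360e6)

# Fraction of the quoted 1 TOPS (1000 GOPS) rating that was actually used.
utilization = gops / 1000.0

print(f"{fps:.1f} FPS, {gops:.1f} GOPS, {utilization:.1%} of 1 TOPS")
```

With the unrounded latency this comes out at just under 80 GOPS, slightly above the 79.2 GOPS obtained from the rounded 110 FPS, and still below 10% of the rated peak.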
Are you using OpenVINO 2019 R1? Updating your SDK to the latest version should improve performance a little. Were you able to get results from testing other examples?
Hi @Sahira_at_Intel,
Thanks for your reply.
We are using OpenVINO 2019 R1.
Could you explain the test result, or recommend another test example that would reach higher throughput?
I just wanted to know how the performance of your NCS2 compares to others, to make sure there are no other issues. You can try the security_barrier_camera_demo, interactive_face_detection_demo and segmentation_demo, and I can compare your performance to mine. There can be some limitations with the system you're using as well. What are you using?
Below is the command I tried and the corresponding result.
./interactive_face_detection_demo -i cam -m open_model_zoo/model_downloader/downloaded_models/Transportation/object_detection/face/pruned_mobilenet_reduced_ssd_shared_weights/dldt/face-detection-adas-0001-fp16.xml -d MYRIAD
For this test, the model requires 2.83 GFLOPs per inference, and the total throughput is about 31 GOPS (please point out if I am wrong).
CPU: 8 logical cores, Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
OS: Ubuntu 16.04 64bit
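As a sanity check on the face detection figure, the frame rate implied by 31 GOPS and 2.83 GFLOPs per inference can be recovered with a short Python sketch. Whether the 2.83 G figure counts individual operations or MACs is an assumption either way, so both readings are shown; the helper name is made up for this example.

```python
def implied_fps(throughput_gops, gops_per_inference):
    """Frame rate implied by an achieved throughput and a per-frame workload."""
    return throughput_gops / gops_per_inference

# Reading 1: 2.83 G already counts individual operations.
fps_if_ops = implied_fps(31.0, 2.83)

# Reading 2: 2.83 G counts MACs, so each one is worth two operations.
fps_if_macs = implied_fps(31.0, 2.83 * 2)

print(f"{fps_if_ops:.1f} FPS if ops, {fps_if_macs:.1f} FPS if MACs")
```

If the frame rate shown by the demo matches neither value, the 31 GOPS estimate is worth rechecking.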
I also ran the GoogLeNet v1 test:
./classification_sample -d MYRIAD -i ~/Downloads/car_1.jpg -m openvino/open_model_zoo/model_downloader/downloaded_models/classification/googlenet/v1/caffe/googlenet-v1.xml
I found a test result on Intel's website of about 80 FPS (https://software.intel.com/en-us/neural-compute-stick). Even at 80 FPS, the throughput is only about 250 GOPS, which is 25% of the rated capability. Does the 1 TOPS rating of the stick2 refer to INT8 performance, or to something else?
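To put the 25% figure in context, here is a rough Python sketch of the frame rate GoogLeNet v1 would need in order to reach the rated peak. It takes the 80 FPS and 250 GOPS figures above at face value and assumes the 1 TOPS rating is directly comparable to them, which is exactly the open question; the names are illustrative.

```python
PEAK_GOPS = 1000.0  # the quoted 1 TOPS rating, taken at face value

def fps_to_saturate(measured_fps, measured_gops, peak_gops=PEAK_GOPS):
    """Frame rate needed to reach peak_gops at the same per-frame workload."""
    gops_per_frame = measured_gops / measured_fps
    return peak_gops / gops_per_frame

required_fps = fps_to_saturate(measured_fps=80.0, measured_gops=250.0)
utilization = 250.0 / PEAK_GOPS

print(f"{utilization:.0%} utilization; {required_fps:.0f} FPS needed for peak")
```

Under these assumptions the stick would have to run the model four times faster than the published figure to reach 1 TOPS.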