Intel® Distribution of OpenVINO™ Toolkit

Performance issue with the Neural Compute Stick 2 (NCS2)

idata
Employee

Hi,

I ran some tests on the NCS2. I ran demo_squeezenet_download_convert_run.sh on the stick and got this result:

total inference time: 9.0164393.

This is similar to the result in this post: https://ncsforum.movidius.com/discussion/1329/lattepanda-alpha-openvino-cpu-core-m3-vs-ncs1-vs-ncs2-performance-comparison

According to the test, I can achieve about 110 FPS. The total workload of SqueezeNet 1.1 is about 360 million MACs per inference, so counting each MAC as two operations, the throughput is about 79.2 GOPS.

The NCS2 is rated at 1 TOPS for neural networks and 4 TOPS of total compute, so the test used less than 10% of that capability.
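The arithmetic behind these figures can be sketched as follows. This is only an illustration: it assumes the reported inference time is in milliseconds per inference, that each MAC counts as two operations, and that 1 TOPS equals 1000 GOPS. The function names are made up for this example and are not part of OpenVINO.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second, assuming latency_ms is the time for one inference."""
    return 1000.0 / latency_ms


def throughput_gops(fps: float, macs_per_inference: float, ops_per_mac: int = 2) -> float:
    """Sustained throughput in GOPS for a given frame rate and per-inference workload."""
    return fps * macs_per_inference * ops_per_mac / 1e9


def utilization(gops: float, peak_tops: float = 1.0) -> float:
    """Fraction of the rated peak that is actually used (peak given in TOPS)."""
    return gops / (peak_tops * 1000.0)


# Assumption: the demo's "total inference time" is milliseconds per inference.
measured_fps = fps_from_latency_ms(9.0164393)

# The post rounds the frame rate to 110 FPS and uses 360 million MACs.
gops = throughput_gops(110, 360e6)
used = utilization(gops)

print(f"{measured_fps:.1f} FPS, {gops:.1f} GOPS, {used:.1%} of the rated 1 TOPS")
```

Changing `ops_per_mac` to 1 halves the throughput figure, so the utilization estimate depends on which counting convention the 1 TOPS rating uses.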

 

Could anyone tell me whether this is normal, or whether I missed something?

Thanks,

Honglei
idata
Employee

Hi @Honglei

Are you using OpenVINO 2019 R1? Updating your SDK to the latest version should improve performance a little. Were you able to get results from testing other examples?

Best Regards,

Sahira
idata
Employee

Hi @Sahira_at_Intel,

Thanks for your reply.

We are using OpenVINO 2019 R1.

Could you explain the test result, or recommend another test example that would give higher throughput?

Thanks,

Honglei
idata
Employee

Hi @Honglei

I just wanted to know how the performance of your NCS2 compares to others, to make sure there are no other issues. You can try the security_barrier_camera_demo, interactive_face_detection_demo, and segmentation_demo, and I can compare your performance to mine. There may also be some limitations in the system you're using. What are you using?

Best Regards,

Sahira
idata
Employee

Hi Sahira,

Below is the command I tried and the corresponding result.

./interactive_face_detection_demo -i cam -m open_model_zoo/model_downloader/downloaded_models/Transportation/object_detection/face/pruned_mobilenet_reduced_ssd_shared_weights/dldt/face-detection-adas-0001-fp16.xml -d MYRIAD

Result: 11 fps

For this test, the model requires 2.83 GFLOPs per inference, so the total throughput is about 31 GOPS. (Please point out if I am wrong.)

 

My host:

HP ZBook

CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz, 8 logical cores

Memory: 15.5 GB

OS: Ubuntu 16.04 64-bit

 

I also ran the GoogLeNet v1 test.

./classification_sample -d MYRIAD -i ~/Downloads/car_1.jpg -m openvino/open_model_zoo/model_downloader/downloaded_models/classification/googlenet/v1/caffe/googlenet-v1.xml

Result: 43 fps

I found a test result on Intel's website of about 80 fps (https://software.intel.com/en-us/neural-compute-stick). Even at 80 fps, the throughput is about 250 GOPS, which is 25% of the rated capability. Does the 1 TOPS rating of the NCS2 refer to INT8 operations, or to something else?
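To put the numbers from this post side by side, here is a rough sketch. It assumes 1 TOPS equals 1000 GOPS and back-solves the GoogLeNet v1 workload from the 250 GOPS at 80 fps figure quoted above (about 3.1 GOPs per inference), which is an inference from this post rather than a published model specification. The helper name is made up for illustration.

```python
PEAK_GOPS = 1000.0  # assumption: the rated 1 TOPS expressed in GOPS


def sustained_gops(fps: float, gops_per_inference: float) -> float:
    """Sustained throughput for a frame rate and a per-inference workload in GOPs."""
    return fps * gops_per_inference


# Back-solved from "80 fps is about 250 GOPS"; not an official model figure.
googlenet_gops_per_inference = 250.0 / 80.0

workloads = {
    "face-detection-adas-0001, measured 11 fps": sustained_gops(11, 2.83),
    "googlenet-v1, measured 43 fps": sustained_gops(43, googlenet_gops_per_inference),
    "googlenet-v1, Intel figure 80 fps": sustained_gops(80, googlenet_gops_per_inference),
}

for name, value in workloads.items():
    print(f"{name}: {value:.1f} GOPS, {value / PEAK_GOPS:.1%} of the assumed peak")
```

Under these assumptions the measured GoogLeNet v1 run works out to roughly 134 GOPS, so every workload in this thread stays well below the rated peak regardless of which frame rate is used.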

 

Thanks,

Honglei