Performance issue with the stick 2.

idata · ‎04-23-2019

Hi,

I ran some tests on the stick 2. I ran demo_squeezenet_download_convert_run.sh on it and got the following result:

total inference time: 9.0164393.

This is similar to the result in this post: https://ncsforum.movidius.com/discussion/1329/lattepanda-alpha-openvino-cpu-core-m3-vs-ncs1-vs-ncs2-performance-comparison

According to the test, I can achieve 110 FPS. The total workload of SqueezeNet 1.1 is about 360M MACs (720 MOPs per inference, counting each MAC as two operations), so the throughput is about 79.2 GOPS.

The stick 2 is rated at 1 TOPS for neural network inference and 4 TOPS of total compute, yet the test used less than 10% of that capability.
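The arithmetic behind these figures can be sketched as follows. This is only an illustration under stated assumptions: the reported inference time is taken to be in milliseconds (which matches the roughly 110 FPS quoted above), each MAC is counted as two operations, and the function name `throughput_gops` is made up for this example.

```python
# Sketch of the throughput estimate; all inputs are the figures quoted in the post.
# Assumptions: inference time is in milliseconds, and 1 MAC = 2 operations.

def throughput_gops(fps, macs_per_inference, ops_per_mac=2):
    """Sustained throughput in GOPS for a given frame rate and model size."""
    return fps * macs_per_inference * ops_per_mac / 1e9

latency_ms = 9.0164393               # "total inference time" from the demo
measured_fps = 1000.0 / latency_ms   # about 110.9 FPS

squeezenet_macs = 360e6              # SqueezeNet 1.1 workload as quoted
peak_gops = 1000.0                   # 1 TOPS rating quoted for the stick 2

gops = throughput_gops(110, squeezenet_macs)   # uses the rounded 110 FPS
utilization = gops / peak_gops

print(f"FPS from latency: {measured_fps:.1f}")
print(f"Throughput: {gops:.1f} GOPS")
print(f"Share of 1 TOPS: {utilization:.1%}")
```

If each MAC were counted as a single operation instead, the same numbers would give half the throughput, so the convention matters when comparing against a vendor rating.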

Could anyone tell me whether this is normal, or whether I missed something?

Thanks

Honglei

idata · ‎04-23-2019

Hi @Honglei

Are you using OpenVINO 2019 R1? Updating your SDK to the latest version should improve performance a little. Were you able to get results from testing other examples?

Best Regards,

Sahira

idata · ‎04-24-2019

Hi @Sahira_at_Intel,

Thanks for your reply.

We are using OpenVINO 2019 R1.

Could you explain the test result, or recommend another test example that would reach higher throughput?

Thanks

Honglei

idata · ‎04-25-2019

Hi @Honglei

I just wanted to know how the performance of your NCS2 compares to others to make sure there are no other issues. You can try the security_barrier_camera_demo, interactive_face_detection_demo, and segmentation_demo, and I can compare your performance to mine. There can be some limitations with the system you're using as well. What system are you using?

Best Regards,

Sahira

idata · ‎04-26-2019

Hi Sahira,

Below is the command I tried and the corresponding result.

./interactive_face_detection_demo -i cam -m open_model_zoo/model_downloader/downloaded_models/Transportation/object_detection/face/pruned_mobilenet_reduced_ssd_shared_weights/dldt/face-detection-adas-0001-fp16.xml -d MYRIAD

Result: 11 fps

For this test, the model requires 2.83 GFLOPs per inference, so the total throughput is about 31 GOPS. (Please point out if I am wrong.)

My host:

HP ZBook

CPU: 8-cores, Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz

Memory: 15.5G

OS: Ubuntu 16.04 64bit

I also ran the GoogLeNet v1 test:

./classification_sample -d MYRIAD -i ~/Downloads/car_1.jpg -m openvino/open_model_zoo/model_downloader/downloaded_models/classification/googlenet/v1/caffe/googlenet-v1.xml

Result: 43 fps

I found a test result on Intel's website of about 80 fps (https://software.intel.com/en-us/neural-compute-stick). Even at 80 fps, the throughput is about 250 GOPS, or 25% of the rated capability. Does the 1 TOPS figure for the stick 2 refer to INT8 operation performance, or to something else?
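For comparison, the three measurements in this thread can be put side by side. This is a rough sketch, not an official benchmark: the per-inference cost of GoogLeNet v1 is not stated in the thread, so it is back-computed here from the quoted "80 fps is about 250 GOPS" figure, and the 2.83 GFLOPs of the face detection model is used as-is. The names `measurements` and `utilization` are illustrative only.

```python
# Rough comparison of the measured results against the quoted 1 TOPS rating.
# Assumption: GoogLeNet v1 cost is inferred from 250 GOPS at 80 fps (about 3.1 GOPs).

PEAK_GOPS = 1000.0  # 1 TOPS, as quoted for the stick 2

# (model, measured fps, giga-operations per inference)
measurements = [
    ("squeezenet1.1", 110, 0.72),              # 360M MACs * 2 ops per MAC
    ("face-detection-adas-0001", 11, 2.83),    # figure quoted in the post
    ("googlenet-v1", 43, 250.0 / 80),          # back-computed, see assumption above
]

def utilization(fps, gops_per_inference, peak_gops=PEAK_GOPS):
    """Fraction of the rated peak that the measured frame rate corresponds to."""
    return fps * gops_per_inference / peak_gops

results = {name: utilization(fps, cost) for name, fps, cost in measurements}

for name, share in results.items():
    print(f"{name:28s} {share * PEAK_GOPS:7.1f} GOPS  {share:6.1%} of 1 TOPS")
```

Under these assumptions, all three runs land well below the rated figure, with the measured 43 fps GoogLeNet run at roughly 134 GOPS.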

Thanks

Honglei