The "UNet" model did not work in NCSDK, but it worked in OpenVINO.
"UNet" is a semantic segmentation model.
Interestingly, the CPU performed better than both the "Neural Compute Stick" and the "Neural Compute Stick 2".
For the moment, I do not see much practical value in NCS2.
Introducing Ubuntu 16.04 + OpenVINO to Latte Panda Alpha 864 (without OS included) and enjoying Sema...
I bought four NCS2 sticks, and I will verify how useful they are.
The "Multiple NCS Devices" test below will follow at a later date.
Multiple NCS Devices
Did you compare the performance of NCS on NCSDK and OpenVINO? I just ran a customized DenseNet on NCS@NCSDK, NCS@OpenVINO, and NCS2@OpenVINO. Only Conv, Concat, ReLU, and BatchNorm layers exist in this network, and the results are so incredibly different that I'm wondering if I did something wrong… NCS@NCSDK takes 0.45s for one inference, NCS@OpenVINO takes 0.65s, and NCS2@OpenVINO takes 0.001s !?
Did you compare the performance of NCS on NCSDK and OpenVINO?
No. My "UNet" model did not work in NCSDK.
Therefore, unfortunately, it cannot be verified with NCSDK.
NCS, NCS2 = FP16
CPU = FP32
I used the conversion scripts below.
For FP16 (For NCS/NCS2)
$ sudo python3 mo_tf.py \
  --input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
  --output_dir 10_lrmodels/UNet/FP16 \
  --input input \
  --output output/BiasAdd \
  --data_type FP16 \
  --batch 1
For FP32 (For CPU)
$ sudo python3 mo_tf.py \
  --input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
  --output_dir 10_lrmodels/UNet/FP32 \
  --input input \
  --output output/BiasAdd \
  --data_type FP32 \
  --batch 1
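For reference, a minimal sketch of how the converted IR might be run from Python. This assumes the later IECore API, the FP16 output paths produced by the command above, and that Model Optimizer's default NHWC-to-NCHW conversion applies; the placeholder frame size and the output blob name "output/BiasAdd" are assumptions, not verified details.

```python
import numpy as np

def to_ir_blob(frame):
    # Model Optimizer converts TensorFlow NHWC inputs to NCHW by default
    # (an assumption here), so transpose HWC -> CHW and add a batch dim.
    chw = frame.transpose(2, 0, 1)                   # (H, W, C) -> (C, H, W)
    return chw[np.newaxis, ...].astype(np.float32)   # -> (1, C, H, W)

if __name__ == "__main__":
    try:
        from openvino.inference_engine import IECore  # OpenVINO 2019+ API
        ie = IECore()
        net = ie.read_network(
            model="10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.xml",
            weights="10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.bin")
        exec_net = ie.load_network(network=net, device_name="MYRIAD")  # NCS/NCS2
        frame = np.zeros((256, 256, 3), dtype=np.uint8)  # placeholder frame size
        result = exec_net.infer({"input": to_ir_blob(frame)})
        print(result["output/BiasAdd"].shape)  # output name assumed from --output
    except Exception as exc:  # no OpenVINO install or no MYRIAD device attached
        print("Skipping device inference:", exc)
```

The layer names "input" and "output/BiasAdd" come from the conversion flags above; everything else is illustrative.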
Because your model and mine are different types of model, I cannot simply compare performance.
If you can provide your model, I may be able to verify it.
I put my model file in Dropbox. Do you mind running a test with it on your hardware? The input node is named "input" and has a shape of (1, 32, 840, 3). The output node is named "output" and has a shape of (1, 1, 794745).
I found some free time, so I measured it.
Unfortunately, NCS2 produced the following error, and measuring it was impossible.
E: [xLink] [ 0] dispatcherEventReceive:308 dispatcherEventReceive() Read failed -4 | event 0x7fa9fb7fdef0 USB_READ_REL_RESP
E: [xLink] [ 0] eventReader:254 eventReader stopped
E: [xLink] [ 0] dispatcherWaitEventComplete:694 waiting is timeout, sending reset remote event
E: [ncAPI] [ 0] ncFifoReadElem:2853 Packet reading is failed.
E: [ncAPI] [ 0] ncFifoDestroy:2672 Failed to write to fifo before deleting it!
Again, the CPU is overwhelmingly faster.
All measurement units are milliseconds.
My latest test results are pretty much consistent with yours. I think OpenVINO is doing some very tricky optimization for CPU Arch inside, so it benefits their CPU the most and the performance boost kinda depends on network structure too.
In fact, the demo programs bundled with OpenVINO should be an easy and fair test for NCS2. I ran demo_squeezenet_download_convert_run.sh on one very old CPU, one modern CPU, NCS, and NCS2, and the results are as follows:
| Hardware | Time Consumption | Command |
| --- | --- | --- |
| Intel® Celeron® Processor J1900 | 42.52ms | demo_squeezenet_download_convert_run.sh -d CPU |
| Intel(R) Xeon(R) CPU E5-1603 v4 | 3.61ms | demo_squeezenet_download_convert_run.sh -d CPU |
| NCS | 28.67ms | demo_squeezenet_download_convert_run.sh -d MYRIAD |
| NCS2 | 9.34ms | demo_squeezenet_download_convert_run.sh -d MYRIAD |
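Since inference speed is inversely proportional to time per inference, the table's timings can be turned into relative speed with a quick calculation (the numbers below are taken directly from the table; nothing else is assumed):

```python
# Timings from the squeezenet demo table above, in milliseconds.
timings_ms = {
    "Celeron J1900": 42.52,
    "Xeon E5-1603 v4": 3.61,
    "NCS": 28.67,
    "NCS2": 9.34,
}

# Speed relative to NCS2: a device taking half the time is 2x the speed.
for name, ms in timings_ms.items():
    print(f"{name}: {timings_ms['NCS2'] / ms:.2f}x NCS2")
```

On these numbers the Xeon comes out at roughly 2.6x the speed of NCS2, which matches the conclusion below.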
It seems that a modern CPU with OpenVINO is indeed much faster than NCS2.
310mA - 370mA with a USB 2.0 port.
Unfortunately, my measuring device does not support USB 3.0.
May I ask whether NCS2 can run MTCNN with OpenVINO?
Since OpenVINO accepts only inputs of fixed scale and fixed batch size, I think we will not know whether it works without trying.
The standard repository program probably will not work unless you devise something.
If possible, I would like you to try it.
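For what it is worth, Model Optimizer pins the input scale at conversion time with its --input_shape flag, so a variable-scale network like MTCNN would need one fixed-shape IR per stage. The .pb filename and output directory below are hypothetical; only the 12x12x3 PNet input size comes from the standard MTCNN design.

```shell
$ sudo python3 mo_tf.py \
  --input_model mtcnn_pnet.pb \
  --input_shape [1,12,12,3] \
  --data_type FP16 \
  --output_dir 10_lrmodels/MTCNN/FP16
```

Because MTCNN's first stage normally runs on an image pyramid of many scales, fixing one shape like this is exactly the kind of workaround that would be needed.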
Here's a demo sample of face landmark detection.
By the way, my CPU is a Core m3, so I think it is rather low-end.
Based on these results, if you want to boost speed to the maximum without a GPU, I think you should use an i7 or higher.
However, I do not recommend that at all, because buying both an NCS2 and an i7 is expensive.
Also, power consumption increases with CPU performance.
I think it is better to wait until OpenVINO supports ARM before purchasing an NCS2.
Did you get an UP Board when it was on sale for $40 from Intel?
$40 !? Isn't that a mistake for $170?
It seems I missed the opportunity…
Going to see how that works with the NCS.
If possible, please tell us the result.
Is the CPU an "Intel Atom"?
Thank you for providing the information, Bob.
I am very interested in how much performance it gets.
OpenVINO uses "MKL-DNN" to run inference in parallel with multiple threads inside the CPU, which seems to be how it achieves its high speed.
This mechanism scales performance with the number of CPU cores, the performance of each core, and the total number of threads.
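The per-core partitioning idea can be illustrated with a toy Python sketch. This is only an analogy: MKL-DNN's real gains come from native kernels running outside the GIL, and nothing below uses MKL-DNN itself.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Split one big reduction across as many workers as there are CPU cores,
# mimicking how MKL-DNN partitions a layer's arithmetic between threads.
data = list(range(1_000_000))
workers = os.cpu_count() or 1
chunk = -(-len(data) // workers)  # ceiling division
parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]

with ThreadPoolExecutor(max_workers=workers) as pool:
    partial = list(pool.map(sum, parts))  # each worker handles one slice

total = sum(partial)
print(total)  # -> 499999500000, same as the serial sum
```

With more cores, the slices get smaller, which is why performance depends on core count and per-core speed as described above.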
The boards shipped, but now they won't get here until Tuesday, so I should have one by the end of the week.
I posted here and suddenly got my download link, and I grabbed as much as I could while at work. I will try to install the Linux version on my old laptop.