Hello everyone.
The "UNet" model did not work in NCSDK, but it worked in OpenVINO.
"UNet" is a semantic segmentation model.
https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb
Interestingly, the CPU outperformed both the Neural Compute Stick and the Neural Compute Stick 2.
For the moment, I don't see much practical value in the NCS2.
◆Japanese Article
Introducing Ubuntu 16.04 + OpenVINO to Latte Panda Alpha 864 (without OS included) and enjoying Semantic Segmentation with Neural Compute Stick and Neural Compute Stick 2
I bought four NCS2 units. At a later date, I will verify how useful the "Multiple NCS Devices" feature below is.
Multiple NCS Devices
https://software.intel.com/en-us/articles/transitioning-from-intel-movidius-neural-compute-sdk-to-openvino-toolkit
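While waiting for real numbers, here is a minimal, purely illustrative sketch of the multi-stick idea: round-robining frames across several devices. This is plain Python with no OpenVINO calls; the `MYRIAD.*` names and the dispatcher are hypothetical stand-ins for however the toolkit actually enumerates sticks.

```python
from itertools import cycle

def make_dispatcher(device_ids):
    """Round-robin dispatcher: each call returns the next device to use."""
    ring = cycle(device_ids)
    return lambda: next(ring)

# Hypothetical device names; in practice one compiled network
# would be loaded per stick before dispatching begins.
devices = ["MYRIAD.0", "MYRIAD.1", "MYRIAD.2", "MYRIAD.3"]
pick = make_dispatcher(devices)

# Dispatch 8 frames across the 4 sticks.
assignments = [pick() for _ in range(8)]
```

With four sticks, throughput should scale close to linearly as long as frames can be kept in flight on every device.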
Did you compare the performance of the NCS on NCSDK versus OpenVINO? I just ran a customized DenseNet on NCS@NCSDK, NCS@OpenVINO, and NCS2@OpenVINO. Only Conv, Concat, ReLU, and BatchNorm layers exist in this network, and the results are so incredibly different that I'm wondering if I did something wrong… NCS@NCSDK takes 0.45 s for one inference while NCS@OpenVINO takes 0.65 s, and NCS2@OpenVINO takes 0.001 s!?
@Gemini91
Did you compare the performance of NCS on NCSDK and OpenVINO?
No. My "UNet" model did not work in NCSDK, so unfortunately I cannot verify it there.
NCS, NCS2 = FP16
CPU = FP32
I used the conversion script below.
For FP16 (For NCS/NCS2)
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP16 \
--input input \
--output output/BiasAdd \
--data_type FP16 \
--batch 1
For FP32 (For CPU)
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP32 \
--input input \
--output output/BiasAdd \
--data_type FP32 \
--batch 1
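For context on the `--data_type` choice above: FP16 halves the storage per weight but keeps only ~3 significant decimal digits. The effect can be seen with Python's standard `struct` module, which supports the IEEE 754 half-precision format (`'e'`, Python 3.6+):

```python
import struct

def to_fp16(x):
    """Round-trip a Python float through IEEE 754 half precision (FP16)."""
    return struct.unpack('e', struct.pack('e', x))[0]

# 0.5 is exactly representable in FP16; 0.1 is not and gets rounded.
exact = to_fp16(0.5)
rounded = to_fp16(0.1)
```

For typical networks this rounding costs little accuracy, which is why FP16 is the standard target for the MYRIAD devices.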
Because your model and mine are different types, I cannot make a simple performance comparison.
If you can provide your model, I may be able to verify it.
Same issue
https://ncsforum.movidius.com/discussion/1320/slow-fps-on-neural-compute-stick-2
I put my model file in Dropbox. Do you mind running a test with it on your hardware? The input node is named "input" and has a shape of (1, 32, 840, 3). The output node is named "output" and has a shape of (1, 1, 794745).
https://www.dropbox.com/s/snbgwzj9p2xkwpm/densenet_frozen.pb?dl=0
@Gemini91
OK.
However, it is already late at night in Japan, so I have no working hours left today. Please wait a few days.
@Gemini91
I found some free time, so I ran the measurements.
Unfortunately, the NCS2 produced the following error, so it was impossible to measure:
E: [xLink] [ 0] dispatcherEventReceive:308 dispatcherEventReceive() Read failed -4 | event 0x7fa9fb7fdef0 USB_READ_REL_RESP
E: [xLink] [ 0] eventReader:254 eventReader stopped
E: [xLink] [ 0] dispatcherWaitEventComplete:694 waiting is timeout, sending reset remote event
E: [ncAPI] [ 0] ncFifoReadElem:2853 Packet reading is failed.
E: [ncAPI] [ 0] ncFifoDestroy:2672 Failed to write to fifo before deleting it!
Again, the CPU is overwhelmingly faster.
All measurement units are milliseconds.
My latest test results are pretty much consistent with yours. I think OpenVINO does some very tricky CPU-architecture-specific optimization internally, so it benefits Intel CPUs the most, and the performance boost also depends somewhat on the network structure.
In fact, the demo programs inside OpenVINO should be an easy and fair test for the NCS2. I ran demo_squeezenet_download_convert_run.sh on one very old CPU, one modern CPU, the NCS, and the NCS2, and the results are as follows:
| Hardware | Time Consumption | Command |
|:---------------------------------|:----------------:|:----------------------------------------------------|
| Intel® Celeron® Processor J1900 | 42.52 ms | demo_squeezenet_download_convert_run.sh -d CPU |
| Intel(R) Xeon(R) CPU E5-1603 v4 | 3.61 ms | demo_squeezenet_download_convert_run.sh -d CPU |
| NCS | 28.67 ms | demo_squeezenet_download_convert_run.sh -d MYRIAD |
| NCS2 | 9.34 ms | demo_squeezenet_download_convert_run.sh -d MYRIAD |
It seems that a modern CPU with OpenVINO is indeed much faster than the NCS2.
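To make the comparison concrete, the numbers above can be reduced to ratios relative to the NCS2 (a small illustrative calculation, values copied from the table; ratios above 1 mean faster than the NCS2):

```python
# Per-inference times (ms) from the table above.
times_ms = {
    "Celeron J1900": 42.52,
    "Xeon E5-1603 v4": 3.61,
    "NCS": 28.67,
    "NCS2": 9.34,
}

# Speed of each device relative to the NCS2 (time_NCS2 / time_device).
speedup_vs_ncs2 = {k: round(times_ms["NCS2"] / v, 2) for k, v in times_ms.items()}
```

So the Xeon is roughly 2.6x faster than the NCS2, while the old Celeron is about 4.6x slower.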
@Gemini91
Thank you for providing detailed information.
It was very helpful.
It seems the NCS2 is mainly worthwhile in combination with a low-performance CPU.
@PINTO Thanks for providing the info about NCS and NCS2 performance, but power consumption is also important. By the way, may I ask whether the NCS2 can run MTCNN with OpenVINO?
@curry_best
310 mA - 370 mA on a USB 2.0 port.
Unfortunately, my measuring device does not support USB 3.0.
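Translating that current draw into power (USB 2.0 VBUS is nominally 5 V), the measured range works out to roughly 1.55-1.85 W. A trivial check:

```python
USB_VBUS_VOLTS = 5.0  # nominal USB 2.0 bus voltage

def watts(milliamps):
    """Convert a current draw on a 5 V USB port to watts (P = V * I)."""
    return USB_VBUS_VOLTS * milliamps / 1000.0

low, high = watts(310), watts(370)  # measured range for the stick
```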
may I ask NCS2 can achieve MTCNN with OpenVINO?
Since OpenVINO accepts only fixed input scales and fixed batch sizes, I don't think it will run without some modification.
The standard repository implementations below probably will not work unless you devise a workaround.
https://github.com/ipazc/mtcnn.git
https://github.com/AITTSMD/MTCNN-Tensorflow.git
https://github.com/CongWeilin/mtcnn-caffe.git
If possible, I would like you to try it.
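To illustrate why fixed input shapes are a problem for MTCNN specifically: its first stage (PNet) runs over an image pyramid whose number of scales depends on the input resolution, so a fixed-shape IR would need one compiled network per scale. A rough sketch of the conventional scale computation (the 12 px PNet window and 0.709 factor are the values commonly used; details vary between the repositories above):

```python
def pyramid_scales(min_side, min_face=20, factor=0.709):
    """MTCNN-style image pyramid: scale factors applied until a face of
    `min_face` px would shrink below PNet's 12 px receptive window."""
    scales = []
    m = 12.0 / min_face        # map min_face px down to the 12 px window
    side = min_side * m
    while side >= 12:
        scales.append(m)
        m *= factor
        side *= factor
    return scales

# A 480 px image with default settings needs a 10-level pyramid,
# i.e. 10 different PNet input resolutions.
scales = pyramid_scales(480)
```

Pinning the pyramid to a fixed set of input sizes (and precompiling one network per size) is one plausible workaround for the fixed-shape constraint.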
@PINTO Thanks, I would like to try it if I get an NCS2.
@curry_best
Here's a demo sample covering face detection and related analysis:
1. Face detection
2. Gender
3. Head pose
4. Emotions
5. Facial landmarks
https://software.intel.com/en-us/articles/OpenVINO-InferEngine#inpage-nav-7-12
Hello.
I implemented real-time semantic segmentation with OpenVINO using the CPU only (LattePanda Alpha).
0.9 FPS - 1.0 FPS
OpenVINO + ADAS (Semantic Segmentation) + Python 3.5
https://github.com/PINTO0309/OpenVINO-ADAS.git
https://youtu.be/R0dtm30qazM
So skip the stick and spend the money on an i7?
@chicagobob123
By the way, my CPU is a Core m3, so it is on the low end.
Based on these results, if you want maximum speed without a GPU, you should use an i7 or better.
However, I do not really recommend buying both an NCS2 and an i7, because the combined cost is high.
Power consumption also increases with CPU performance.
I think it is better to wait to buy an NCS2 until OpenVINO supports ARM.
Did you get an UP Board when it was on sale for $40 from Intel? I got two, since they are way more powerful than a Pi. I'm going to see how that works with the NCS.
@chicagobob123
Did you get an up board when it was sale for $40 from intel?
$40!? Isn't that a typo for $170?
How affordable!!
It seems I missed the opportunity…
Going to see how that works with the ncs.
If possible, please tell us the result.
Is the CPU an "Intel Atom"?
Yes, it's an Atom processor with connections similar to the Pi's.
4GB DDR3L-1600
Intel® Atom™ x5-Z8350
Sadly, they are $89 again.
https://click.intel.com/aaeon-up-board.html
Bob
@chicagobob123
Thank you for the information, Bob.
I am very interested in how much performance it gets.
OpenVINO uses "MKL-DNN" to parallelize inference across threads inside the CPU, which seems to be how it achieves its speed.
Performance therefore appears to scale with the number of CPU cores, the performance of each core, and the total number of threads.
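That scaling behavior can be illustrated with Amdahl's law (a general model of parallel speedup, not anything MKL-DNN-specific): the serial fraction of the work caps how much extra cores can help.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Amdahl's law: upper bound on speedup when only `parallel_fraction`
    of the work can be spread across `n_threads` threads."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_threads)

# A fully parallel workload scales linearly with threads,
# but even 10% serial work sharply limits the benefit of many cores.
perfect = amdahl_speedup(1.0, 4)      # 4 threads, no serial work
capped = amdahl_speedup(0.9, 8)       # 8 threads, 10% serial work
```

This is consistent with the observation that the boost depends on the network structure: layers that parallelize poorly act as the serial fraction.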
The boards shipped, but now they won't get here until Tuesday, so I should have them by the end of the week.
I posted here and suddenly got my download link, and grabbed as much as I could while at work. I will try to install the Linux version on my old laptop.