I'm running the security_barrier_camera_sample from inference_engine/samples with this command:

$ ./security_barrier_camera_sample -i ~/Untitled.mp4 -m /home/shay/intel/computer_vision_sdk_2018.2.319/deployment_tools/intel_models/vehicle-license-plate-detection-barrier-0007/FP32/vehicle-license-plate-detection-barrier-0007.xml -l /home/shay/intel/computer_vision_sdk_2018.2.319/deployment_tools/inference_engine/lib/ubuntu_16.04/intel64/libMKLDNNPlugin.so

InferenceEngine:
    API version ............ 1.1
    Build .................. 12419
[ INFO ] Parsing input parameters
[ INFO ] Reading input
[ INFO ] Loading plugin

    CPU
    API version ............ 1.1
    Build .................. lnx_20180510
    Description ....... MKLDNNPlugin
[ ERROR ] dlSym cannot locate method 'CreateMKLDNNExtension': /home/shay/intel/computer_vision_sdk_2018.2.319/deployment_tools/inference_engine/lib/ubuntu_16.04/intel64/libMKLDNNPlugin.so: undefined symbol: CreateMKLDNNExtension
It fails with the error: cannot locate method 'CreateMKLDNNExtension'.
I'm trying to use MKLDNN to improve performance.
Can anyone help me with this?
Normally, if you don't specify a device with "-d", the sample runs on the CPU and loads the MKLDNN plugin by default. The -l option exists to pass the path of a shared library containing custom extensions ("custom layers"). So in your case, you can simply drop the -l parameter if you don't have any custom-layer shared libraries to point to.
Also, the way you should run this sample is -m <vehicle_detection_model.xml> -m_lpr <lpr_model.xml> -m_va <vehicle_attributes_model.xml>. You can run the sample with -h to see the help menu describing all of the command-line parameters, and the ReadMe.md file is also well worth reading.
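Putting the two suggestions together, a corrected invocation might look like the sketch below. The <path>/... segments and the LPR model filename are placeholders, not actual paths from your install; only the detection model name is taken from your original command.

```shell
# Sketch: drop -l entirely, pass all three models explicitly.
# Replace <path> with wherever your IR (.xml/.bin) files live.
./security_barrier_camera_sample \
    -i ~/Untitled.mp4 \
    -m    <path>/vehicle-license-plate-detection-barrier-0007.xml \
    -m_va <path>/vehicle-attributes-recognition-barrier-0039.xml \
    -m_lpr <path>/<lpr_model>.xml \
    -d CPU   # optional: CPU (MKLDNN) is already the default device
```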
Thanks for the quick reply.
But if the sample loads MKLDNN by default, why am I seeing such odd performance when running vehicle-attributes-recognition-barrier-0039?
Your docs say this model can reach 1469.95 fps with IE MKLDNN, but when I ran it I got anywhere from 0.7 ms to 15 ms per inference. I run it after a vehicle detection network, and my CPU is an i3-8350K @ 4 GHz x 4. Why am I seeing numbers like 7 ms or 15 ms per inference? Is there some resolution limitation? Do I need to resize the object crop before enqueueing it into the network?
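As a sanity check on these numbers, a documented throughput and a measured per-inference latency are related by fps = 1000 / latency_ms, so the two figures can be compared directly. A minimal sketch (the 15 ms value is just your slowest measurement plugged in as an example):

```shell
# fps = 1000 / latency_ms; e.g. 15 ms per inference:
latency_ms=15
awk -v ms="$latency_ms" 'BEGIN { printf "%.1f fps\n", 1000 / ms }'
# and going the other way, 1469.95 fps corresponds to:
awk 'BEGIN { printf "%.2f ms per inference\n", 1000 / 1469.95 }'
```

So a 0.7 ms inference is in line with the documented figure, which is why the 7-15 ms outliers are worth investigating separately (batching, input size, or what else is running on the CPU at the time).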