NCS2 freezes when loading BERT model

I used the instructions from here to convert a BERT model to IR: https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_...

However, when I load it on the NCS2, the call to IECore's load_network(network=net, device_name="MYRIAD") never returns. Strangely, the same model loads fine on the NCS1 using the same method.

Do you know what might be the problem?

6 Replies
SIRIGIRI_V_Intel
Employee

Hi Alexandru,

It seems that the model has a few layers that are unsupported on MYRIAD (ReduceMean, Erf). Please check the list of supported layers for the MYRIAD plugin. You can try implementing the unsupported layers as custom layers.
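A rough sketch of how the layer support claim could be checked in code rather than by reading the list. It assumes the 2020.1 Inference Engine Python API (IECore.read_network, IECore.query_network, and the IENetwork.layers property); the helper names find_unsupported and check_model and the file paths are made up for illustration.

```python
def find_unsupported(layer_names, supported_map):
    """Return the layer names that the device plugin did not claim.

    supported_map is expected to map layer name -> device name, which is
    the shape of the result assumed for IECore.query_network().
    """
    return sorted(name for name in layer_names if name not in supported_map)


def check_model(model_xml, model_bin, device="MYRIAD"):
    """Hypothetical helper: list the layers of an IR that `device` rejects."""
    # Imported here so find_unsupported() can be used without OpenVINO installed.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    # Ask the plugin which layers it is able to run.
    supported = ie.query_network(network=net, device_name=device)
    return find_unsupported(net.layers.keys(), supported)


# Example call (paths are placeholders):
# print(check_model("bert_model.ckpt.xml", "bert_model.ckpt.bin"))
```

An empty result would suggest that the plugin accepts every layer, in which case unsupported layers are unlikely to be the cause of the freeze.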

Regards,

Ram prasad

Hi Ram, I am not sure that is the case, because the model is loaded and executed successfully on NCS1. You can try it yourself.

Hello Alexandru,

I converted the TensorFlow* BERT Model to the Intermediate Representation successfully. 

I used the following command for both the NCS1 and the NCS2:

~/Downloads$ python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_meta_graph multilingual_L-12_H-768_A-12/bert_model.ckpt.meta --output bert/pooler/dense/Tanh --disable_nhwc_to_nchw --input Placeholder{i32},Placeholder_1{i32},Placeholder_2{i32} 

And I get the following results:

Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:     None
    - Path for generated IR:     /home/ncs1/Downloads/.
    - IR output name:     bert_model.ckpt
    - Log level:     ERROR
    - Batch:     Not specified, inherited from the model
    - Input layers:     Placeholder{i32},Placeholder_1{i32},Placeholder_2{i32}
    - Output layers:     bert/pooler/dense/Tanh
    - Input shapes:     Not specified, inherited from the model
    - Mean values:     Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:     FP32
    - Enable fusing:     True
    - Enable grouped convolutions fusing:     True
    - Move mean values to preprocess section:     False
    - Reverse input channels:     False
TensorFlow specific parameters:
    - Input model in text protobuf format:     False
    - Path to model dump for TensorBoard:     None
    - List of shared libraries with TensorFlow custom layers implementation:     None
    - Update the configuration file with input/output node names:     None
    - Use configuration file used to generate the model with Object Detection API:     None
    - Operations to offload:     None
    - Patterns to offload:     None
    - Use the config file:     None
Model Optimizer version:     2020.1.0-61-gd349c3ba4a
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/ncs1/Downloads/./bert_model.ckpt.xml
[ SUCCESS ] BIN file: /home/ncs1/Downloads/./bert_model.ckpt.bin
[ SUCCESS ] Total execution time: 45.84 seconds.
[ SUCCESS ] Memory consumed: 3681 MB.

 

Regards,

Randall B.

Hello Alexandru,

Thanks for your patience. Please try the following steps to load the model successfully on the Intel® Neural Compute Stick 2:

  1. Turn off VPU_HW_STAGES_OPTIMIZATION and re-test.
  2. To do this, add the following after creating the IECore() object in the demo file:

           if args.device == "MYRIAD":
               ie.set_config({'VPU_HW_STAGES_OPTIMIZATION': 'NO'}, "MYRIAD")
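For context, a minimal sketch of where this setting could sit in a complete load sequence. It assumes the 2020.1 Python API (IECore.read_network, set_config, load_network); the helper names plugin_config_for, load_model, and main are hypothetical and not part of the demo file.

```python
HW_STAGES_KEY = "VPU_HW_STAGES_OPTIMIZATION"


def plugin_config_for(device, disable_hw_stages=True):
    """Return the plugin config to apply before loading on `device`."""
    if device == "MYRIAD" and disable_hw_stages:
        return {HW_STAGES_KEY: "NO"}
    return {}


def load_model(ie, net, device):
    """Apply the device config (if any), then load the network."""
    config = plugin_config_for(device)
    if config:
        # Must be set before load_network() so the plugin compiles without it.
        ie.set_config(config, device)
    return ie.load_network(network=net, device_name=device)


def main(model_xml, model_bin, device="MYRIAD"):
    # Imported here so the helpers above can be exercised without OpenVINO.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    return load_model(ie, net, device)


# Example call (paths are placeholders):
# exec_net = main("bert_model.ckpt.xml", "bert_model.ckpt.bin")
```

Keeping the config in one small function makes it easy to turn the option back on for other models, where the default may well be the better choice.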

Best regards,

Randall B.

Hello Randall,

Indeed, disabling VPU_HW_STAGES_OPTIMIZATION works: the model no longer freezes while loading (it now takes a few seconds).

One question: does the problem reproduce in your case? I mean, when the optimization option is not disabled, does the model take forever to load on your NCS2? I am thinking about buying another NCS2, but first I want to find out whether my device is defective, so that I don't buy a second one only to hit the same issue.

Thank you!

 
Hello Alexandru,

Your Intel Neural Compute Stick 2 is working as expected; we see the same behavior on our side. The default value of VPU_HW_STAGES_OPTIMIZATION works for the majority of models, but some models, like yours, work best with VPU_HW_STAGES_OPTIMIZATION disabled.
I hope this answers your question.

Regards,

Randall B.
