Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Huggingface Transformer Conversion Instructions

sbsky
Beginner

In the model zoo I see that there are BERT transformer models successfully converted from the Huggingface transformer library to OpenVINO:

https://docs.openvinotoolkit.org/2021.1/omz_models_intel_bert_small_uncased_whole_word_masking_squad_0001_description_bert_small_uncased_whole_word_masking_squad_0001.html

Unfortunately I can't find any instructions or documentation on how this conversion from the Huggingface transformer library to OpenVINO was performed. Where can I find the instructions?

IntelSupport
Moderator

Hi Pieter,

 

Thank you for reaching out to us. The 'bert-small-uncased-whole-word-masking-squad-0001' model is an Intel pre-trained model that is available for download. To download the model in IR format, please run the following command:

 

python downloader.py --name bert-small-uncased-whole-word-masking-squad-0001

 

 

You can also download the model IR files at the following link:

https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/2/bert-small-uncased-whole-word-masking-squad-0001/

 

For more information regarding the model downloader, please refer to the following link:

https://docs.openvinotoolkit.org/latest/omz_tools_downloader_README.html#model_downloader_usage

 

Regards,

Adli

 

sbsky
Beginner

Hi Adli, thanks for your response. My question isn't about using the example model, but rather how it was created, as the example model isn't suitable for my task. I need to train it on my own data.

Was TensorFlow or PyTorch used to create the model, and which version? What commands were used to perform the conversion? Was ONNX used?

IntelSupport
Moderator

Hi Pieter,

 

Since this is an Intel pre-trained model, all the information we have made publicly available for it is presented on the page you have already pointed out: https://docs.openvinotoolkit.org/latest/omz_models_intel_bert_small_uncased_whole_word_masking_squad_0001_description_bert_small_uncased_whole_word_masking_squad_0001.html

 

The original 'bert-large-uncased-whole-word-masking-finetuned-squad' model is taken from the Transformers library: https://github.com/huggingface/transformers

 

The source framework is PyTorch. The model is trained on the 'SQuAD v1.1' dataset, which you can replace with your own dataset. Since there is no direct PyTorch conversion in the OpenVINO toolkit, an intermediate conversion to ONNX is used.


For an example IR conversion command, please refer to the following:

python3 mo.py -m bert_squad_fp32.onnx --input_shape "[1,384],[1,384],[1,384]" --input "0,1,2"

 

Regards,

Adli


sbsky
Beginner

Hi Adli, thanks, I was able to convert the model using your instructions. I had a look at the converted OpenVINO XML graph and saw that Gelu and LayerNorm fusion wasn't performed. It is my understanding that the Model Optimizer should perform these graph fusions automatically.

How do I make use of these fused operators? Are there any special commands I need to give to the Model Optimizer?

IntelSupport
Moderator

Hi Pieter,

 

Thank you for reaching out to us. Optimization offers methods to accelerate inference with convolutional neural networks (CNNs) that do not require model retraining. In the Model Optimizer, this optimization is turned on by default. This optimization method consists of three stages:

  1. BatchNormalization and ScaleShift decomposition.
  2. Linear operations merge.
  3. Linear operations fusion.
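To make the "linear operations" stages concrete, the following pure-Python sketch shows the underlying arithmetic: a BatchNorm-style per-channel scale and shift is folded into the preceding linear layer's weights and bias, so a single fused layer reproduces the two-layer result. This is my own numeric illustration of the idea, not OpenVINO code; all names and values are made up.

```python
import math

def linear(x, w, b):
    # y_i = sum_j w[i][j] * x[j] + b[i]
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    # per-channel scale/shift: gamma * (y - mean) / sqrt(var + eps) + beta
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]

def fold(w, b, gamma, beta, mean, var, eps=1e-5):
    # fold the scale/shift into the weights: w' = s*w, b' = (b - mean)*s + beta
    s = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    w_f = [[wi * si for wi in row] for row, si in zip(w, s)]
    b_f = [(bi - m) * si + bt for bi, m, si, bt in zip(b, mean, s, beta)]
    return w_f, b_f

x = [1.0, -2.0, 0.5]
w = [[0.2, -0.1, 0.4], [0.3, 0.0, -0.5]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.2]

ref = batchnorm(linear(x, w, b), gamma, beta, mean, var)   # two ops
w_f, b_f = fold(w, b, gamma, beta, mean, var)
fused = linear(x, w_f, b_f)                                # one fused op
print(all(abs(a - c) < 1e-9 for a, c in zip(ref, fused)))  # True
```

The fused network computes identical outputs with one fewer runtime operation, which is why no retraining is needed.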

 

For more information regarding optimization description, please refer to the following link:

https://docs.openvinotoolkit.org/2021.1/openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques.html#optimization_description

 

There are cases where these optimizations might not be applied.

In addition, you can instruct the Model Optimizer to disable fusing for specified nodes via the '--finegrain_fusing' option. For more information, please refer to the following link:

https://docs.openvinotoolkit.org/2021.1/openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques.html#disable_fusing

 

Regards,

Adli


sbsky
Beginner

Hi Adli, how was Gelu and LayerNorm fusion performed in the reference BERT model? The reference model features these fusions.

IntelSupport
Moderator

Hi Pieter,

 

If possible, could you run the following command on your model:

benchmark_app -m <your_model>.xml -report_type detailed_counters

The 'benchmark_app.exe' binary is located in the 'inference_engine_samples_build\intel64\Release' directory. Please share and post the outcome here. For more information regarding the Benchmark C++ Tool, please refer to the following link: https://docs.openvinotoolkit.org/latest/openvino_inference_engine_samples_benchmark_app_README.html#run_the_tool

 

Regards,

Adli

 

sbsky
Beginner

Hi Adli, I can do you one better. Please find attached an ONNX file that contains Dense, Gelu, Dense, LayerNorm layers.

sbsky
Beginner

Hi Adli, just wondering if there's any update on this?

IntelSupport
Moderator

Hi Pieter,

 

I was able to convert the ONNX model to IR, and there are no Gelu or LayerNorm operations left in the IR. This has been checked through Netron as well as with 'benchmark_app -m <your_model>.xml -report_type detailed_counters'.

 

Please check the attached model and verify whether it resolves the issue.

 

Regards,

Adli

 

sbsky
Beginner

Hi Adli, my whole point, and the reason I am asking for assistance, is that there should be Gelu and LayerNorm in the converted IR. ONNX doesn't have Gelu or LayerNorm operations, so it expresses them as a number of other operations, such as Erf. This is what you are seeing in the converted IR.

OpenVINO should recognise these patterns and substitute in the Gelu and LayerNorm operations, but it doesn't. My question is: how do I make this happen?
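For reference, the decompositions in question can be checked numerically in pure Python: gelu(x) = 0.5·x·(1 + erf(x/√2)) is exactly what the Div→Erf→Add→Mul subgraph in the exported ONNX computes, and LayerNorm is likewise spelled out with ReduceMean/Sub/Pow-style primitives. This sketch only illustrates the patterns involved; it is not OpenVINO's pattern-matching code, and the function names are mine.

```python
import math

def gelu_via_erf(x):
    # the Div -> Erf -> Add -> Mul -> Mul chain seen in the exported graph
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def layernorm_via_primitives(xs, gamma, beta, eps=1e-5):
    mean = sum(xs) / len(xs)                          # ReduceMean
    var = sum((x - mean) ** 2 for x in xs) / len(xs)  # Sub + Pow + ReduceMean
    inv = 1.0 / math.sqrt(var + eps)                  # Add + Sqrt + Div
    return [g * (x - mean) * inv + b for x, g, b in zip(xs, gamma, beta)]

print(round(gelu_via_erf(1.0), 6))  # 0.841345
out = layernorm_via_primitives([1.0, 2.0, 3.0], [1.0] * 3, [0.0] * 3)
print([round(v, 4) for v in out])   # [-1.2247, 0.0, 1.2247]
```

A fusion pass has to spot this multi-node subgraph and replace it with a single Gelu or LayerNorm (here, MVN-style) operation; if the exported pattern differs even slightly, the match fails and the primitives are left in the IR.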

I know that this is possible because the BERT examples in the OpenVINO zoo have successfully replaced these operators (you can check in Netron).

Can you please escalate this with the team that made the BERT example in the OpenVINO model zoo, or advise how I get in contact with them?

Without the fused operators, the model's performance is severely impacted and is no faster than PyTorch.

Adli
Moderator

Hi Pieter,

 

This case has been escalated to the developer team, and their recommendation is to take a look at the set of scripts from training_extensions used to fine-tune this BERT model. Please refer to the following link:

https://github.com/openvinotoolkit/training_extensions/tree/develop/pytorch_toolkit/question_answering

 

These scripts produce the ONNX model automatically. After that, you only need to run the Model Optimizer to get the IR model.

 

Regards,

Adli

 

Adli
Moderator

Hi Pieter,


This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.


Regards,

Adli


sbsky
Beginner

Hi Adli, after a bit of tinkering based on what's in those scripts, I got it to work. Thanks for your assistance!
