Re: Does OpenVINO support inference of BERT?

Hap_Zhang · ‎08-15-2021

Hello,

I have a BERT model, and inference is slow on the CPU. Does OpenVINO support BERT inference, and can it accelerate it? My CPU is 40 Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz. Thanks a lot!

Munesh_Intel · ‎08-17-2021

Hi Hap Zhang,

Thanks for reaching out to us.

Intel® Xeon® Silver 4114 Processor (13.75M Cache, 2.20 GHz) is supported by OpenVINO, since it’s a 1st generation Intel® Xeon® Scalable processor. The system requirements for Intel® Distribution of OpenVINO™ toolkit are available at System Requirements.

Performance-wise, the bert-large-uncased-whole-word-masking-squad-int8-0001 model (quantized to INT8 precision using quantization-aware fine-tuning with NNCF) achieved 16.71 FPS on an Intel® Xeon® Silver 4216R CPU @ 2.20GHz, as can be seen on the benchmark page. Note that this is a different processor from your Silver 4114, so your own numbers may differ.
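As a rough way to read that figure, a throughput number can be turned into an average time per input. This is only a minimal sketch: the function names are illustrative, and benchmark throughput is typically measured with several requests in flight, so the average below should not be read as the latency of a single request.

```python
def avg_ms_per_input(fps: float) -> float:
    """Average wall-clock time per input, in milliseconds, at a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def seconds_for_inputs(n_inputs: int, fps: float) -> float:
    """Estimated total time, in seconds, to process n_inputs at a steady throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return n_inputs / fps


# 16.71 FPS is the figure quoted above for the INT8 BERT-large model.
print(f"{avg_ms_per_input(16.71):.1f} ms per input on average")
print(f"{seconds_for_inputs(10_000, 16.71):.0f} s for 10,000 inputs")
```

At 16.71 FPS this works out to roughly 60 ms per input on average, which you can compare against the per-input time of your current setup.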

For your additional information, OpenVINO supports the following Bidirectional Encoder Representations from Transformers (BERT) models, as mentioned here.

· BERT-Base, Cased

· BERT-Base, Uncased

· BERT-Base, Multilingual Cased

· BERT-Base, Multilingual Uncased

· BERT-Base, Chinese

· BERT-Large, Cased

· BERT-Large, Uncased

OpenVINO also provides the following seven Intel pre-trained models and one public pre-trained model:

· bert-large-uncased-whole-word-masking-squad-0001

· bert-large-uncased-whole-word-masking-squad-emb-0001

· bert-large-uncased-whole-word-masking-squad-int8-0001

· bert-small-uncased-whole-word-masking-squad-0001

· bert-small-uncased-whole-word-masking-squad-0002

· bert-small-uncased-whole-word-masking-squad-emb-int8-0001

· bert-small-uncased-whole-word-masking-squad-int8-0002

· bert-base-ner
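To try one of these models on your own CPU, one option is to fetch it with the Open Model Zoo downloader and then measure it with benchmark_app. The sketch below only assembles the two command lines and prints them. It assumes the `omz_downloader` and `benchmark_app` entry points from the `openvino-dev` pip package are on your PATH (older toolkit installs ship them as scripts instead), and the `intel/<name>/<precision>/<name>.xml` output layout is an assumption that should be checked against what the downloader actually writes. The helper names are illustrative.

```python
import shlex
from pathlib import Path

# One of the Intel pre-trained models listed above.
MODEL = "bert-small-uncased-whole-word-masking-squad-0001"


def download_cmd(name: str, out_dir: str = "models") -> list[str]:
    """Command line that downloads a model by name into out_dir."""
    return ["omz_downloader", "--name", name, "--output_dir", out_dir]


def benchmark_cmd(name: str, out_dir: str = "models", precision: str = "FP32",
                  device: str = "CPU", seconds: int = 30) -> list[str]:
    """Command line that benchmarks the downloaded IR on the given device."""
    # Assumed layout for Intel models: <out_dir>/intel/<name>/<precision>/<name>.xml
    xml = Path(out_dir) / "intel" / name / precision / f"{name}.xml"
    return ["benchmark_app", "-m", xml.as_posix(), "-d", device, "-t", str(seconds)]


# Print the commands instead of running them, so they can be reviewed first.
for cmd in (download_cmd(MODEL), benchmark_cmd(MODEL)):
    print(shlex.join(cmd))
```

The path layout above applies to the Intel models only. The public bert-base-ner model is downloaded in its original framework format and would need a conversion step before benchmarking.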

Regards,

Munesh

Hap_Zhang · ‎08-22-2021

Hi Munesh,

Thank you very much for your quick reply. I will give it a try later.

Munesh_Intel · ‎09-01-2021

Hi Hap Zhang,

This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.

Regards,

Munesh