Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

Using OpenVINO toolkit made inference slower than ONNX Runtime

Egor94
Beginner


Hello, I used the DeepPavlov rubert model https://huggingface.co/DeepPavlov/rubert-base-cased-conversational in its PyTorch implementation and tried to optimize it with the OpenVINO toolkit. I converted the model to ONNX format using the following script:

dummy_input = torch.tensor([[0] * 64])

symbolic_names = {0: 'batch_size', 1: 'max_seq_len'}

torch.onnx.export(model, dummy_input, path_to_save, opset_version=11, do_constant_folding=True, input_names=['input_ids'], output_names=['class'], dynamic_axes={'input_ids': symbolic_names, 'class': [0, 1]})
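
As a quick sanity check of the exported file, the onnx package can be used (a minimal sketch, reusing path_to_save from the export call above):

import onnx

onnx_model = onnx.load(path_to_save)
onnx.checker.check_model(onnx_model)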

After getting the ONNX file, I used Model Optimizer:

python mo.py -m model_path --input_shape [1,64]

I got the IR files and checked inference with network.infer(), and it was two times slower than ONNX Runtime inference.
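
Roughly, the comparison I ran looks like the sketch below (the file names rubert.onnx / rubert.xml / rubert.bin are placeholders, and it assumes the Inference Engine Python API):

import time
import numpy as np
import onnxruntime as ort
from openvino.inference_engine import IECore

# Dummy batch matching the exported shape [1, 64]
input_ids = np.zeros((1, 64), dtype=np.int64)

# ONNX Runtime timing
sess = ort.InferenceSession('rubert.onnx')
start = time.perf_counter()
for _ in range(100):
    sess.run(None, {'input_ids': input_ids})
print('ONNX Runtime, s/iter:', (time.perf_counter() - start) / 100)

# OpenVINO Inference Engine timing
ie = IECore()
net = ie.read_network(model='rubert.xml', weights='rubert.bin')
exec_net = ie.load_network(network=net, device_name='CPU')
start = time.perf_counter()
for _ in range(100):
    exec_net.infer(inputs={'input_ids': input_ids})
print('OpenVINO, s/iter:', (time.perf_counter() - start) / 100)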

What did I do wrong?

Thank you.

IntelSupport
Community Manager

Hi Egor94,

Thanks for reaching out. Model Optimizer can produce an IR with different precisions. Which precision did you test? Meanwhile, could you test the model with the OpenVINO benchmark_app and check the performance of the model?
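
For example, the IR precision is chosen at conversion time with the --data_type option of Model Optimizer (FP32 or FP16), and the resulting IR can be profiled from the command line (the file name below is a placeholder):

benchmark_app -m model.xml -d CPU -niter 100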

 

Regards,

Aznie


IntelSupport
Community Manager

Hi Egor94,

Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.


Regards,

Aznie

