Error Message for a Converted Kaldi Model in OpenVINO (Version 2021.4)

hsin_ntn · ‎08-26-2021

Hello,

I'm trying to run a converted Kaldi model in OpenVino.
I use the Librispeech model provided by the documentation webpage.

Documentation webpage.
The link of the model mentioned in the documentation.

The conversion succeeded, but I get the following error when the OpenVINO app (offline_speech_recognition_app.exe) reaches inference:

[ERROR] Sample supports only topologies with 1 input
Failed to initialize speech library. Status: -5

I have tried several fixes but still cannot resolve this error.
Is there a solution to this problem?

I will be grateful for any help you can provide.
Hsin

Wan_Intel · ‎08-29-2021

Hi Hsin_ntn,

We are investigating this issue and will update you at the earliest.

Regards,

Wan

Wan_Intel · ‎08-31-2021

Hi Hsin_ntn,

I encountered the same error as you did when I ran Offline Speech Recognition Demo with Librispeech nnet3 as an input model.

This is because Offline Speech Recognition Demo supports only topologies with one input. I have checked the inputs of Librispeech nnet3 using Netron. It has the following inputs:

· input

· ivector
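The same check can be done without Netron by reading the converted IR .xml file directly. The following is a minimal sketch, not part of the demo or the OpenVINO API: it assumes the IR stores each network input as a `layer` element whose `type` is `Parameter` (or `Input` in older IR versions), and the helper name `list_ir_inputs` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: input layers in the IR are tagged with one of these types.
INPUT_LAYER_TYPES = {"Parameter", "Input"}


def list_ir_inputs(xml_source):
    """Return the names of the input layers found in an IR .xml file.

    xml_source may be a file path or an open file-like object.
    """
    root = ET.parse(xml_source).getroot()
    return [
        layer.get("name")
        for layer in root.iter("layer")
        if layer.get("type") in INPUT_LAYER_TYPES
    ]
```

Calling `list_ir_inputs("final.xml")` (the file name here is only a placeholder for your converted model) returns a list of input names. If the list has more than one entry, as with `input` and `ivector` above, the model will trigger the "only topologies with 1 input" error in the Offline Speech Recognition Demo.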

For your information, the Offline Speech Recognition Demo supports the lspeech_s5_ext model, an example of a pre-trained LibriSpeech DNN. You can download the lspeech_s5_ext model by running the following script:

On Windows OS: <INSTALL_DIR>\deployment_tools\demo\demo_speech_recognition.bat
On Linux OS: <INSTALL_DIR>/deployment_tools/demo/demo_speech_recognition.sh

On another note, the steps to run the Speech Recognition Demos with pre-trained models are available on the following page:

https://docs.openvinotoolkit.org/2021.4/openvino_inference_engine_samples_speech_libs_and_demos_Speech_libs_and_demos.html#run-demos

Regards,

Wan

hsin_ntn · ‎09-07-2021

Hi Wan,

In fact, I face the same problem when I use my own model.
I would really like to run inference on my model in OpenVINO.
Could you provide a Kaldi recipe for the lspeech_s5_ext model so that I can see how to retrain my model (in particular, how to concatenate the input features for the OpenVINO feature extractor), or suggest another way to solve this problem?

Thank you for considering my request.
Hsin

Wan_Intel · ‎09-08-2021

Hi Hsin_ntn,

Thanks for your information.

The Model Optimizer supports the nnet1 and nnet2 formats of Kaldi models. Support of the nnet3 format is limited. Please refer to here for more information.

Separately, we will forward your request to the development team.

Regards,

Wan

Wan_Intel · ‎09-13-2021

Hi Hsin_ntn,

Thanks for your patience.

From the OpenVINO perspective, we don’t have any specific model retraining module for the Kaldi model.

However, we suggest looking at the Kaldi repo for guidance on feature extraction and for discussions of Kaldi model retraining.

From our observation, the models you used (Librispeech nnet3 and your custom model) have 2 inputs, and models with 2 inputs can be run using the Automatic Speech Recognition C++ sample.

However, you need to perform the conversion using the steps provided here. In step 3, you need to download the Librispeech ASR model to meet the requirement.

Note that the Kaldi Aspire Chain Time Delay Neural Network (TDNN) model also has 2 inputs, so the steps above should work for both models.

Regards,

Wan

Wan_Intel · ‎09-23-2021

Hi Hsin_ntn,

This thread will no longer be monitored since we have provided a solution.

If you need any additional information from Intel, please submit a new question.

Regards,

Wan