Re: noise-suppression-poconetlike-0001

ndzhizhin · ‎08-19-2024

Hello!
I am trying to convert a model from PyTorch to OpenVINO. The model does audio signal processing, so it relies on STFT and ISTFT, which is a common approach in that field. However, the conversion fails with this summary:
-- No conversion rule found for operations: aten::istft, aten::stft
-- Conversion failed for: prim::Constant
Your demo noise-suppression-poconetlike-0001 from open_model_zoo uses STFT and ISTFT. The paper "PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss" also states: "The neural model N takes as input the STFT of the reverberant and noisy example s ∗ h + n and estimates the complex ratio mask that would give the target signal estimate as: ŷ = ISTFT(N(STFT(x)) · STFT(x))."
So, my question is: how do you use STFT and ISTFT inside the model in your demo? How do you add such operations to the OpenVINO graph?
I hope you can answer my question or suggest a workaround.
Hope to hear from you soon!

Aznie_Intel · ‎08-19-2024

Hi Ndzhizhin,

Thanks for reaching out. We are checking on this and will get back to you once the information is available.

Regards,

Aznie

ndzhizhin · ‎08-26-2024

Hi!
Are there any updates?
Regards,
Nikita

Witold_Intel · ‎08-27-2024

Hi @ndzhizhin

Here are some recommendations from our developers.

1. In which part of the model are STFT and ISTFT used, and how many of them are there in total? If they sit at the beginning or end of the model, they can be split off and executed as preprocessing or postprocessing until official support is provided on the OpenVINO side.

2. Try using ONNX as an intermediate step when converting the model to OpenVINO. The ONNX frontend supports STFT only in a limited case (static shapes), but that may be enough.

3. Try the STFT and ISTFT implementations from https://github.com/qiuqiangkong/torchlibrosa instead of the original PyTorch ones. They are represented as decompositions into convolution layers.
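
To illustrate the idea behind recommendation 3, here is a minimal NumPy sketch of an STFT written as a strided convolution with fixed cosine and sine kernels. It is not torchlibrosa's code: the function names, the Hann window, and the sizes (`n_fft=16`, `hop=4`) are assumptions chosen for illustration. The point is that once the STFT is a plain matrix product over hop-strided frames, it maps to an ordinary convolution layer, which a converter can usually handle.

```python
import numpy as np

def dft_kernels(n_fft):
    """Windowed DFT basis rows, each of shape (n_fft // 2 + 1, n_fft).

    These play the role of fixed Conv1d weights (illustrative names only).
    """
    k = np.arange(n_fft // 2 + 1)[:, None]   # frequency bin index
    n = np.arange(n_fft)[None, :]            # sample index inside a frame
    angle = 2.0 * np.pi * k * n / n_fft
    window = np.hanning(n_fft)               # assumed window, folded into the kernels
    return np.cos(angle) * window, -np.sin(angle) * window

def conv_stft(signal, n_fft=16, hop=4):
    """STFT computed as a strided correlation of the signal with fixed kernels."""
    real_k, imag_k = dft_kernels(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # One column per hop-strided frame; kernels @ frames is what a
    # Conv1d layer with kernel_size=n_fft and stride=hop would compute.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] for i in range(n_frames)], axis=1
    )
    return real_k @ frames, imag_k @ frames   # real part, imaginary part
```

The real and imaginary parts come out as two separate real-valued tensors, so no complex dtype is needed inside the graph.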

Would any of these resolve your issue?
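
For recommendation 1, a rough sketch of how the split could look, assuming the STFT and ISTFT sit only at the very start and end of the model. The transforms run in NumPy outside the graph, and only the mask-estimating network in between would be converted. `core_model` is a hypothetical stand-in for that converted network (here it returns an identity mask), and the Hamming window and the sizes are illustrative assumptions, not values taken from the demo.

```python
import numpy as np

N_FFT, HOP = 16, 4  # assumed frame size and hop, for illustration only

def stft(x, window, hop):
    """Preprocessing: windowed frames -> one-sided spectrum, shape (frames, bins)."""
    n_fft = len(window)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, window, hop):
    """Postprocessing: overlap-add with window-squared normalization."""
    n_fft = len(window)
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(frames) - 1)
    out, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n_fft] += frame
        norm[i * hop : i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def core_model(real, imag):
    """Hypothetical stand-in for the converted network.

    Returns a complex ratio mask as (real, imag); here an identity mask.
    """
    return np.ones_like(real), np.zeros_like(imag)

def enhance(x, model=core_model):
    """Computes ISTFT(N(STFT(x)) * STFT(x)) with the transforms outside the model."""
    window = np.hamming(N_FFT)  # nonzero everywhere, so normalization is safe
    spec = stft(x, window, HOP)
    m_re, m_im = model(spec.real, spec.imag)
    return istft((m_re + 1j * m_im) * spec, window, HOP)
```

In a real pipeline the call to `core_model` would be replaced by inference on the converted model, fed with the real and imaginary parts as two real-valued inputs. Input lengths that do not fit a whole number of hops are truncated by this sketch, so padding would have to be added around it.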

Aznie_Intel · ‎09-09-2024

Hi Ndzhizhin,

Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.

Regards,

Aznie

demo noise-suppression-poconetlike-0001 use STFT and ISTFT inside the OpenVINO graph