Solved: Re: LSTM CudnnRNNV3 Translation, Workaround Produces Biased Predictions

Xanph · ‎10-01-2025

With thanks to @Peh_Intel for his help in debugging so far, I'm continuing this issue from "Deprecation of Tensorflow CudnnRNNV3 on conversion to IR Format".

My Development System Info (training and converting model to IR):
OS: Ubuntu 24.04.3 LTS
Kernel: 6.14.0-32-generic

Dependencies
OpenVINO Version: 2025.3.0-19807-44526285f24-releases/2025/3
Keras: 3.8.0
Tensorflow: 2.16.1
NNCF: 2.18.0
Numpy: 1.26.4

NVIDIA driver: 550.163.01
CUDA: 12.3.107
cuDNN: 8.9.7

System
CPU: AMD Ryzen 9 5950X 16-Core Processor
GPUs: 1x NVIDIA RTX 3070, 1x NVIDIA RTX 3060 Ti

Production System Info (inference use):
OS: Ubuntu 24.04.3 LTS
Kernel: 6.8.0-84-generic

(Running inside of a docker container, base image Ubuntu 24.04 latest)

The Docker image contains GStreamer, OpenCV built from source, pygobject, cython, pycairo and the Intel OpenCL ICD (to note just a few).

Dependencies
OpenVINO Version: 2025.3.0-19807-44526285f24-releases/2025/3

Intel driver: i915

System
CPU: Intel(R) Xeon(R) E-2124 CPU @ 3.30GHz
GPUs: 1x Intel ARC A580 rev08

---

Summary

Root Problem
My .keras binary classification model uses a bidirectional LSTM layer:

Bidirectional(LSTM(32, return_sequences=True))

When the model is converted, this layer produces the internal error "no translator found for operation(s): CudnnRNNV3". When I created the issue "Deprecation of Tensorflow CudnnRNNV3 on conversion to IR Format", the solution was to downgrade to TF 2.16.1. As of September 2025, however, models exported with TF 2.16.1 also contain CudnnRNNV3 operations and fail conversion, whereas previously they converted successfully.

To test, I added recurrent_dropout=0.1 to the LSTM layer so that it would not use cuDNN:

Bidirectional(LSTM(32, return_sequences=True, recurrent_dropout=0.1))

Model conversion to IR then worked, but the converted model produces predictions on the production system that are heavily biased toward 0.0001.

Model conversion code used:

import keras
import openvino as ov

# `version` is the model version string, set elsewhere in the script
model_name = f"{version}_best_model"
model_path = f'models/best_keras/{model_name}.keras'
model = keras.models.load_model(model_path)

# Export the Keras model in the TensorFlow SavedModel format
saved_model_dir = f'models/tensorflow/{model_name}_tf'
model.export(saved_model_dir, format='tf_saved_model')

# Convert the SavedModel to OpenVINO IR format with multiple outputs
ir_model = ov.convert_model(saved_model_dir)

# Save the converted IR model
output_dir = "models/intermediate_representation"
ov.save_model(ir_model, f"{output_dir}/ir_{version}.xml")
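
Before running the conversion, it can help to check whether the exported SavedModel contains the unsupported operation at all. The following is a minimal sketch, not part of the original script: the function name find_ops_in_saved_model is hypothetical, and the check is only a heuristic that relies on op type names being stored as plain byte strings in saved_model.pb. A hit can also come from a node name that happens to contain the same text.

```python
from pathlib import Path


def find_ops_in_saved_model(saved_model_dir, op_names=("CudnnRNNV3",)):
    """Return the entries of op_names whose text occurs in saved_model.pb.

    Heuristic only: this does a raw byte search of the serialized graph
    and does not parse the protobuf.
    """
    pb_path = Path(saved_model_dir) / "saved_model.pb"
    data = pb_path.read_bytes()
    return [name for name in op_names if name.encode("utf-8") in data]


# Example use, after model.export(saved_model_dir, ...):
# hits = find_ops_in_saved_model(saved_model_dir)
# if hits:
#     print(f"Export contains ops that may fail conversion: {hits}")
```

An empty list suggests the export did not pick up the cuDNN kernel, so the "no translator found" error should not be triggered by that op.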

One observation: I wonder whether TensorFlow's inclusion of CUDA in the pip package is causing this problem, since that is the only thing I think has changed. I have re-installed all dependencies and tried different versions.

Secondary Problem
After I shared my code and model files with Peh, he converted the Keras model without recurrent dropout on a CPU-only system. When I tested this converted model on the production system's GPU, the predictions were also heavily one-sided toward a single class, with all values at 0.9+. However, when I ran the same model on CPU only, the predictions were accurate.

I did the same test on the model with recurrent dropout and got the same result.
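
For reference, a CPU-only export can usually be reproduced on a machine that has NVIDIA GPUs by hiding them from TensorFlow. This is a minimal sketch under assumptions: the helper name force_cpu_only is hypothetical, and the thread does not confirm exactly how Peh's CPU-only conversion was set up or that this alone avoids CudnnRNNV3 in every case.

```python
import os


def force_cpu_only():
    """Hide all CUDA devices from frameworks imported after this call."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


force_cpu_only()

# Import TensorFlow/Keras only after the variable is set, for example:
# import keras
# model = keras.models.load_model(model_path)
# model.export(saved_model_dir, format="tf_saved_model")
```

The variable has to be set before TensorFlow is first imported in the process; setting it afterwards has no effect on devices that are already initialised.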

Peh suggested adding the following to the production system's code:

import openvino as ov
core = ov.Core()
ov_config = {"GPU_DISABLE_WINOGRAD_CONVOLUTION": "YES"}
compile_model = core.compile_model("model.xml", "GPU", ov_config)

With this, the converted model Peh provided did produce better predictions, but not with the accuracy I'm expecting: predictions range from 0.1964 to 0.5981, with a mean of 0.4245 and a standard deviation of 0.1119.

Expected predictions would range from 0.0004 to 0.998, with a mean of around 0.2279 and a standard deviation of 0.2778.
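
Summary statistics like the ones quoted above can be computed with a small helper so that CPU and GPU runs are compared on the same terms. This is a sketch using only the standard library; the function name summarize_predictions is hypothetical, and it assumes the population standard deviation, which may differ from whatever was used to produce the figures in this post.

```python
import statistics


def summarize_predictions(preds):
    """Return min, max, mean and population std of a list of scores."""
    values = [float(p) for p in preds]
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }


# Example use with the flattened model outputs of each device:
# print("CPU:", summarize_predictions(cpu_scores))
# print("GPU:", summarize_predictions(gpu_scores))
```

A compressed range and a much smaller standard deviation on one device, as described above, is a quick signal that the two devices are not computing the same function.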

I tested this code with the recurrent dropout model and still experienced the same prediction bias (toward 0.0001).

---

So, to summarise those two problems:

1. I still get CudnnRNNV3 errors when not using recurrent dropout.
2. I currently have to rely on Peh to convert my model (or use recurrent_dropout), and the converted model has weak accuracy or no convergence.

I do have a working IR model, with successful optimisation, from a few months back (26th March 2025), when I created the first issue. That model was converted from a .h5 version and would have used OpenVINO v2024.3.0.

I'm happy to provide model and code directly.

With thanks.

Best regards,
Xanph

Peh_Intel · ‎10-01-2025

Hi Xanph,

Thanks for your detailed description of the issue. Yes, please do share the models and code as well so that we can investigate further. You can send those files to me privately if you don't want to expose them publicly.

Regards,

Peh

Xanph · ‎10-02-2025

Hello Peh,

Thanks, all files remain the same as the ones I sent privately in our existing conversation.

You mentioned that some further investigation is needed from the dev team?

Best regards,

Xanph

Peh_Intel · ‎10-02-2025

Hi Xanph,

I only have your models. It would be great if you could also provide the inference script you use to verify the predictions.

Regards,

Peh

Xanph · ‎10-03-2025

No problem Peh,

Inference files sent.

Best regards,

Flynn.

Peh_Intel · ‎10-14-2025

Hi Xanph,

I have received your shared files. We will investigate this matter further and get back to you at the earliest.

Regards,

Peh

Xanph · ‎10-15-2025

With thanks to you and the team,

Xanph

Xanph · ‎11-11-2025

Hi @Peh_Intel and team,

How is this looking?

Best regards,

Xanph

Peh_Intel · ‎11-11-2025

Hi Xanph,

After some modification, I obtained the results below. The GPU results show better accuracy than before but are still not identical to the CPU results. Please let me know whether these results are satisfactory.

Results (peh_converted_model) on CPU.

Results (peh_converted_model) on GPU.

Results (motionX8.4.0) on CPU.

Results (motionX8.4.0) on GPU.
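
A numeric comparison of the two devices can complement the screenshots. The following is a minimal sketch under assumptions: the function names, the single-output model and the tolerance of 1e-3 are illustrative choices and do not come from this thread. The openvino import is kept inside infer_flat so that the comparison helpers work on their own.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two score lists."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(cpu_scores, gpu_scores, atol=1e-3):
    """True if every GPU score is within atol of the CPU score."""
    return max_abs_diff(cpu_scores, gpu_scores) <= atol


def infer_flat(model_xml, device, input_tensor, config=None):
    """Run one inference on the given device and flatten the first output."""
    import openvino as ov  # imported lazily; requires OpenVINO to be installed

    core = ov.Core()
    compiled = core.compile_model(model_xml, device, config or {})
    result = compiled(input_tensor)
    return [float(v) for v in result[0].ravel()]


# Example use with the same input on both devices:
# cpu = infer_flat("model.xml", "CPU", batch)
# gpu = infer_flat("model.xml", "GPU", batch)
# print(max_abs_diff(cpu, gpu), outputs_match(cpu, gpu))
```

Small differences between devices are normal because of reduced-precision arithmetic on GPU, so the tolerance should be chosen to suit the model rather than set to zero.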

Regards,

Peh

Xanph · ‎11-13-2025

Hi Peh,

Thank you for those screenshots.

I'm afraid this doesn't look like what I'm expecting; it still seems to show a loss of convergence. Additionally, the inference duration looks odd.

Best regards,

Xanph

Peh_Intel · ‎11-17-2025

Hi Xanph,

The fixes have been included in OpenVINO 2026.0.0, which is expected to be available in 2026. For now, please use the OpenVINO nightly build (2026.0.0). With this OpenVINO version, running on GPU gives the same results as running on CPU.

pip install --pre -U openvino --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
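
After installing, it may be worth confirming that the production environment actually picked up a 2026.0 build. This is a sketch using only the standard library; the helper names are hypothetical, and the nightly version string format shown in the comment is an assumption.

```python
import re
from importlib import metadata


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2025.3.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_fix(version, minimum=(2026, 0)):
    """True if the version is at least the release that carries the fix."""
    return parse_major_minor(version) >= minimum


def installed_openvino_has_fix():
    """Check the installed openvino package; False if it is not installed."""
    try:
        return has_fix(metadata.version("openvino"))
    except metadata.PackageNotFoundError:
        return False


# Example: a nightly wheel might report something like "2026.0.0.dev20251117"
# print(installed_openvino_has_fix())
```

Inside a Docker image, this check is best run in the same interpreter that performs inference, since a stale wheel in the image would otherwise go unnoticed.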

Regards,

Peh

Xanph · ‎11-18-2025

Thank you very much Peh,

I'm looking forward to trying this out!

Best regards,

Xanph

Peh_Intel · ‎11-23-2025

Hi Xanph,

How has your testing been so far?

Regards,

Peh

Xanph · ‎11-24-2025

Hi Peh,

Hope you're doing well. I'm pretty busy this week, but I just ran a quick test and it looks like it's working well!

I'll have more time to test thoroughly next week, so I'll get back to you properly then and mark a solution.

Do you mind if I ask what the root problem and fix were (is there a git commit id)? Is the fix intended to resolve the CudnnRNNV3 error, or do I still need to avoid that?

Have a great week

Many thanks and best regards,

Xanph

Peh_Intel · ‎11-24-2025

Hi Xanph,

No worries, take your time.

Here is the pull request to fix the dynamic accuracy issue when inferencing on GPU.

Regards,

Peh

Xanph · ‎12-01-2025

Thank you for your help and time Peh and team,

That nightly release version works well after running for a few days, so I marked it as the solution!

Is this now in release 2025.4.0?

Many thanks,

Xanph

Peh_Intel · ‎12-01-2025

Hi Xanph,

No, the fixes are only included in OpenVINO 2026.0.0, which is expected to be available in 2026.

Glad to know the nightly build works well for you. As such, I am going to close this case.

This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.

Regards,

Peh