Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

NNCF post-training quantization - What is wrong?

LFan
Employee
2,159 Views

Hello, 


I am trying NNCF post-training quantization (PTQ) on a U-Net model, hoping to improve inference speed. The network was originally trained in PyTorch, converted to OpenVINO, and then quantized with NNCF PTQ.

Below are the steps and a code snippet for the PTQ part:

      ######################################################################
        # Post-Training Quantization (PTQ) with NNCF
        import nncf

        # Step 1: Define a transformation function for calibration
        def transform_fn(data_item):
            inputs, _, _ = data_item  # Extract inputs from the data loader
            if bSingleInputTensor:
                # Ensure the input tensor matches the model's expected format
                return inputs[0].unsqueeze(1).float().numpy()  # Add channel dimension, ensure float32
            else:
                # Ensure both input tensors match the model's expected format
                return {
                    0: inputs[0].unsqueeze(1).float().numpy(),  # First input tensor
                    1: inputs[1].unsqueeze(1).float().numpy()   # Second input tensor
                }

        # Step 2: Create an NNCF Dataset using joinedDataLoader
        calibration_dataset = nncf.Dataset(joinedDataLoader, transform_fn)

        # Step 3: Perform quantization
        quantized_model = nncf.quantize(ov_unet_model, calibration_dataset)
        print("Quantization completed successfully.")

        # Step 4: Save the quantized model using the new function
        self.WriteQuantizedOpenVINOModelsForDeployment(quantized_model, bWriteInfoFile=True)
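For reference, the `unsqueeze(1).float().numpy()` chain in `transform_fn` adds a channel axis and casts to float32. The same preprocessing can be sketched in plain NumPy (the batch shape here is illustrative, not from the original model):

```python
import numpy as np

# A stand-in calibration batch: 4 samples of 64x64 single-channel
# images, as a DataLoader might yield them (no channel axis yet).
batch = np.random.rand(4, 64, 64)  # float64 by default

# Equivalent of tensor.unsqueeze(1).float().numpy():
# insert a channel axis at position 1 and cast to float32.
calib_input = np.expand_dims(batch, axis=1).astype(np.float32)

print(calib_input.shape)  # (4, 1, 64, 64)
print(calib_input.dtype)  # float32
```

If the shapes or dtypes returned by `transform_fn` do not match what the converted model expects, calibration can silently see fewer layers than intended, so this is worth double-checking.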

The PTQ log shows that only ~7% of the model was quantized.

When comparing inference runtime across the Torch, OpenVINO, and quantized OpenVINO models, the quantized model turns out to be ~10% slower than the original OpenVINO model.
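For what it's worth, these numbers are roughly consistent with Amdahl's law: if only ~7% of the compute runs in INT8, even an optimistic speedup on that fraction caps the end-to-end gain at a few percent, and the quantize/dequantize conversions at each INT8/FP32 boundary can easily cost more than that. A quick sanity check (the 4x INT8 speedup is an assumption for illustration, not a measured figure):

```python
# Best-case end-to-end speedup when only a fraction f of the model
# is quantized and that fraction runs s times faster (Amdahl's law).
def amdahl_speedup(f, s):
    return 1.0 / ((1.0 - f) + f / s)

# ~7% of the model quantized, assuming an optimistic 4x INT8 speedup:
best_case = amdahl_speedup(0.07, 4.0)
print(f"{best_case:.3f}")  # ~1.055, i.e. at most ~5.5% faster overall
```

So with only 7% coverage, a net slowdown is plausible; the more useful question is why coverage is so low (e.g. unsupported ops or layers NNCF chose to keep in FP32).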

Please let me know if you need anything else.

If possible, I would greatly appreciate it if we could schedule a meeting to discuss this issue.

Thanks,
Li



0 Kudos
5 Replies
Zulkifli_Intel
Moderator
2,095 Views

Hi LFan,

Thank you for reaching out.

 

Please share the required files/scripts to reproduce the issue. Can you also share the necessary files/scripts for case 06612870? I'm checking whether scheduling a meeting is possible.

 

 

Regards,

Zul


LFan
Employee
2,063 Views

Thank you for the reply! It would be great if we could talk directly.

Can we share Intel-confidential files on this forum? It looks like anyone can download the attachments... Is there a more secure way to share files?

Thanks for the help. 

n_scott_pearson
Super User
2,053 Views
OMG, NO! This is an open forum. No Confidential information should be shared here. You should only share Confidential information with Intel folks and external folks that you know have a CNDA in place.
The Community site offers a Private Messaging service that you can use for secured communications. Just click on your avatar picture and select 'Messages'.
Just saying,
...S
LFan
Employee
2,026 Views

Hi Scott, Thanks for the tip! This is good to know. I have not shared anything confidential here yet but was wondering what's the right way to share. 

Hi Zul:

Just to confirm, are you from Intel? The developers you will pass the files to are also Intel employees, right? I could not seem to find you by searching your screen name in Messages. What name should I search for?

Maybe we can exchange Intel emails to share the files? 

Thanks for the help.

Zulkifli_Intel
Moderator
1,768 Views

Hi LFan,


Yes, the developers are from Intel. Here is my email address:

zulkiflix.bin.abdul.halim@intel.com



Regards,

Zul

