My test
# Generate the IR model
import openvino as ov
from pathlib import Path
from nncf import compress_weights
from torchvision import models  # needed for models.efficientnet_b0 below

MODEL_NAME = "efficientnet_b0"
MODEL_DIR = Path("Models")
MODEL_DIR.mkdir(parents=True, exist_ok=True)  # the keyword is exist_ok, not existence_ok

# Load the pre-trained EfficientNet-B0 with its default ImageNet weights
weights = models.EfficientNet_B0_Weights.DEFAULT
model = models.efficientnet_b0(weights=weights)
model.eval()

# Convert to OpenVINO IR with a static input shape
batch_size = 32
ov_model = ov.convert_model(model, input=[[batch_size, 3, 224, 224]])
ov.save_model(ov_model, MODEL_DIR / f"{MODEL_NAME}_{batch_size}_static.xml")

# Compress the weights to INT8 and save the quantized model
quantized_model = compress_weights(ov_model)
ov.save_model(quantized_model, MODEL_DIR / f"{MODEL_NAME}_{batch_size}_quantized.xml")
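Before benchmarking, a quick sanity check (a sketch, continuing from the script above) can read the saved IR back and list the devices available on the machine:

core = ov.Core()
ir = core.read_model(MODEL_DIR / f"{MODEL_NAME}_{batch_size}_static.xml")
print(ir.inputs)               # expect one input of shape [32, 3, 224, 224]
print(core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU'] on this system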
# Implementation Testing
...
batch_size = 32
num_workers = 0
val_dataset = datasets.ImageNet(root=IMAGENET_VAL_DIR, split='val', transform=val_transforms)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers, drop_last=True)
...
compiled_model = core.compile_model(model=MODEL_PATH, device_name=device)
input_layer = compiled_model.input(0)
output_layer = compiled_model.output(0)
...
for images, labels in val_loader:
    inputs = images.numpy()
    results = compiled_model(inputs={input_layer: inputs})
    output = results[output_layer]
    top1, top5 = accuracy(output, labels)
    top1_total += top1
    top5_total += top5
    total += labels.size(0)
    # Show progress
    elapsed = time.strftime("%H:%M:%S", time.gmtime(time.time() - start_time))
    num_input = num_input + len(images)
    print(f"\rmodel: {type_model} - device : {device} - Valid [{num_input:>7,}/{size_test:>7,}] - Elapsed: {elapsed} - ".replace(",", "."), end="")
batch_size:32, num_workers:0
model: quantized - device : GPU - Valid [ 49.984/ 50.000] - Elapsed: 00:03:42 - accuracy - Top-1: 77.01%, Top-5: 93.24%
model: quantized - device : CPU - Valid [ 49.984/ 50.000] - Elapsed: 00:06:50 - accuracy - Top-1: 77.01%, Top-5: 93.25%
model: quantized - device : NPU - Valid [ 49.984/ 50.000] - Elapsed: 00:12:21 - accuracy - Top-1: 77.03%, Top-5: 93.23%
model: static - device : GPU - Valid [ 49.984/ 50.000] - Elapsed: 00:03:46 - accuracy - Top-1: 77.65%, Top-5: 93.58%
model: static - device : CPU - Valid [ 49.984/ 50.000] - Elapsed: 00:06:46 - accuracy - Top-1: 77.68%, Top-5: 93.58%
model: static - device : NPU - Valid [ 49.984/ 50.000] - Elapsed: 00:12:25 - accuracy - Top-1: 77.68%, Top-5: 93.58%
Hi Pablo_BR,
Thank you for reaching out to us.
I have run your code snippet, but I encountered TypeError: Path.mkdir() got an unexpected keyword argument 'existence_ok' while running MODEL_DIR.mkdir(parents=True, existence_ok=True). After removing the 'existence_ok' argument, I encountered NameError: name 'models' is not defined when running weights = models.EfficientNet_B0_Weights.DEFAULT.
Could you please provide the information below so that we can further investigate the issue?
- Python version
- Hardware specifications
- Deep learning models
- Code snippet or Python script to replicate the issue
If you have additional information that would help us, please share it here as well. We will continue to troubleshoot the issue once we receive the information.
Regards,
Wan
Hello, thank you for your interest in my question.
I'll try to answer your questions.
- Hardware specifications
GEEKOM GT1 Mega
BIOS version: 0.50
Date: 2024-08-12
Operating system
Microsoft Windows 11 Pro (64-bit)
Build version: 24H2 (10.0.26100)
Description: Intel64 Family 6 Model 170 Stepping 4
Architecture: x64
Number of cores: 16
Number of threads: 22
Processor base frequency: 2300 MHz
Current voltage: 1.6
Level 2 cache: 18432 KB
Level 3 cache: 24576 KB
Processor ID: 0xA06A4
Graphics
Driver details
Vendor: Intel Corporation
Version: 32.0.101.6874
Date: 2025-05-25
Video processor: Intel® Arc™ Graphics Family
Device ID: PCI\VEN_8086&DEV_7D55&SUBSYS_22128086&REV_08\3&11583659&0&10
Vendor: Intel® Corporation
Name: IntcUSB.sys
Driver details
Vendor: Realtek Semiconductor Corp.
Name: RTKVHD64.sys
Device details
Physical memory - Total: 32 GB
Physical memory - Available: 23.48 GB
Virtual memory - Total: 33.84 GB
Virtual memory - Available: 25.07 GB
- Deep learning models
Obtain the pre-trained model with its weights from torchvision.models (models.efficientnet_b0).
Create the OpenVINO IR model, files:
efficientnet_b0_32_static.xml
efficientnet_b0_32_static.bin
Create the OpenVINO quantized model, files:
efficientnet_b0_32_quantized.xml
efficientnet_b0_32_quantized.bin
With these two models, validate against the torchvision.datasets.ImageNet dataset and quantify the difference between implementations on the CPU, GPU, and NPU.
- Python version
(OpenVINO_env) C:\IA\OpenVINO_env\Code>pip show torch torchvision openvino
Name: torch
Version: 2.7.0+xpu
---
Name: torchvision
Version: 0.22.0+xpu
---
Name: openvino
Version: 2025.1.0
Summary: OpenVINO(TM) Runtime
- Code snippet or Python script to replicate the issue
import torch
If you have any further questions, please don't hesitate to contact me.
Thank you.
Hi Pablo_BR,
Thank you for sharing the information with us.
We will further investigate the issue and provide an update here as soon as possible.
Regards,
Wan
Hi Pablo_BR,
Thank you for your patience.
I have downloaded and converted the efficientnet_b0 model into Intermediate Representation with the following code:
import torchvision
import torch
import openvino as ov
model = torchvision.models.efficientnet_b0(weights='DEFAULT')
ov_model = ov.convert_model(model)
ov.save_model(ov_model, 'model.xml')
However, I encountered an error while running inference with the NPU plugin. The code snippet worked fine with the CPU and GPU plugins.
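For reference, a minimal sketch of the kind of per-device check described above (the model path, input shape, and random input are assumptions, not the exact script used):

import numpy as np
import openvino as ov

core = ov.Core()
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape

for device in ("CPU", "GPU", "NPU"):
    compiled = core.compile_model("model.xml", device_name=device)
    result = compiled(dummy)[compiled.output(0)]
    print(device, result.shape)  # CPU and GPU succeed; NPU raises the error described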
Could you please share the following models with us to further investigate the issue? For example, you may share them with us via Google Drive.
- efficientnet_b0_32_static.xml
- efficientnet_b0_32_static.bin
- efficientnet_b0_32_quantized.xml
- efficientnet_b0_32_quantized.bin
Regards,
Wan
Hello, happy to help.
I use this code to create the models.
Please run it to generate the model files.
# This format, which consists of an XML file for the network topology and a BIN file for the weights and biases,
# is highly optimized for efficient inference on Intel hardware.
# If your model allows it, quantizing the model to INT8 can result in significant performance gains with minimal loss of accuracy.
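For context, the generation script in the first post uses nncf.compress_weights, which performs weight-only INT8 compression. Full INT8 post-training quantization, as the comment above suggests, can be done with nncf.quantize and a calibration dataset; a minimal sketch, assuming the ov_model and val_loader from the scripts above (transform_fn is a hypothetical helper, not part of the original code):

import nncf

def transform_fn(data_item):
    images, _ = data_item   # the DataLoader yields (images, labels) pairs
    return images.numpy()   # nncf.quantize expects model-ready inputs

# Assumption: val_loader is the ImageNet loader from the test script
calibration_dataset = nncf.Dataset(val_loader, transform_fn)
int8_model = nncf.quantize(ov_model, calibration_dataset)
ov.save_model(int8_model, MODEL_DIR / f"{MODEL_NAME}_{batch_size}_int8.xml")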
If you have any further questions, please don't hesitate to contact me.
Thank you very much.
Hi Pablo_BR,
Thank you for sharing the code to generate the model files with us.
We will further investigate the issue, and we will provide an update here as soon as possible.
Regards,
Wan
Hi Pablo_BR,
Thank you for your patience.
I have downloaded and converted the efficientnet_b0 model into static and quantized Intermediate Representation. I have run both models on the CPU, GPU, and NPU plugins with the Python code, and I also encountered a similar issue:
(screenshots of the CPU, GPU, and NPU plugin runs were attached)
I will escalate the case to the relevant team, and we will provide an update here as soon as possible.
Regards,
Wan
Hi Pablo_BR,
Thank you for your patience. We have received feedback from the relevant team.
Batching is in an experimental state for the Intel® NPU plugin, and it is not recommended, especially in performance tests. Please decrease the batch size to 1 on the Intel® NPU plugin and try it out with different performance_hint modes (tput or latency) and different numbers of async infer requests in tput mode.
On another note, you may try setting the batch size to 2, but there is no point in running with a batch size greater than 2 because of hardware resource limitations and the way the batching algorithm utilizes them.
Every topology might have a different performance ratio in different configurations, mostly because of the Intel® NPU architecture and the optimizations applied.
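As a minimal sketch of that recommendation (the model path and the number of jobs are assumptions): regenerate the IR with batch size 1, compile with an explicit performance hint, and in throughput mode use several async infer requests instead of a larger batch.

import openvino as ov
import openvino.properties.hint as hints

core = ov.Core()
# Assumption: the IR was regenerated with input=[[1, 3, 224, 224]] (batch size 1)
model = core.read_model("Models/efficientnet_b0_1_static.xml")

# Compare hints on the NPU: LATENCY for single requests, THROUGHPUT for pipelining
compiled = core.compile_model(
    model, device_name="NPU",
    config={hints.performance_mode: hints.PerformanceMode.THROUGHPUT},
)

# In throughput mode, scale with async infer requests rather than batch size
infer_queue = ov.AsyncInferQueue(compiled, jobs=4)  # jobs=4 is an assumption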
Regards,
Wan
Hi Pablo_BR,
Thank you for your question.
If you need additional information from Intel, please submit a new question as this thread will no longer be monitored.
Regards,
Wan
