Intel® Distribution of OpenVINO™ Toolkit

Can't see NPU device on Intel Core Ultra CPU

Enlin
New Contributor I
2,005 Views

Hi,

I'm using the OpenVINO 2023.3 API on Windows 11 with an Intel Core Ultra CPU.

The NPU driver installed successfully and is visible in Device Manager.

The test code is:

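#include <stdio.h>
#include "openvino/c/openvino.h"

/* Print every device the OpenVINO runtime can enumerate. */
void query_device() {
    ov_core_t* core = NULL;
    ov_available_devices_t dev = { 0 };
    /* Stop early if the core cannot be created. */
    ov_status_e sts = ov_core_create(&core);
    if (sts != OK) {
        printf("ov_core_create failed: %d\n", (int)sts);
        return;
    }
    sts = ov_core_get_available_devices(core, &dev);
    if (sts == OK) {
        for (size_t i = 0; i < dev.size; i++) {
            printf("%s\n", dev.devices[i]);
        }
    }
    ov_available_devices_free(&dev);
    ov_core_free(core);
}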

and the result is

---- OpenVINO INFO----
Description : OpenVINO Runtime
Build number: 2023.3.0-13775-ceeafaf64f3-releases/2023/3
CPU
GNA.GNA_SW
GNA.GNA_HW
GPU

Please help: why is there no NPU in the results?
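A minimal sketch of how the check could be automated instead of reading the printed list by eye. The helper name `has_device` is hypothetical and not part of the OpenVINO API; it only inspects an array of device-name strings such as the one printed above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenVINO function).
   Returns 1 if any entry equals `name` or starts with `name` followed by '.',
   so "GPU" matches both "GPU" and "GPU.0". Returns 0 otherwise. */
int has_device(const char* const* devices, size_t count, const char* name) {
    size_t len = strlen(name);
    for (size_t i = 0; i < count; i++) {
        if (strncmp(devices[i], name, len) == 0 &&
            (devices[i][len] == '\0' || devices[i][len] == '.')) {
            return 1;
        }
    }
    return 0;
}
```

With the test code above, this could be called as `has_device((const char* const*)dev.devices, dev.size, "NPU")` right after `ov_core_get_available_devices` succeeds.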

 

I chose "AUTO" as the inference device, and the NPU utilization is 0% during inference.

 

Thanks

 

Enlin Jiang.

 

13 Replies
Wan_Intel
Moderator
1,958 Views

Hello Enlin,

Thanks for reaching out to us.

 

Could you please try to run the Hello Query Device C++ Sample and share the output result with us?

 

The steps to build the sample applications on Microsoft Windows are available here:
https://docs.openvino.ai/2023.3/openvino_docs_get_started_get_started_demos.html#build-the-sample-applications

 

 

Regards,

Wan

 

Enlin
New Contributor I
1,939 Views

Hi Wan, 

Thanks for the reply.

Yes, I did that before, and no NPU was found.

The output is:

[ INFO ] Build ................................. 2023.3.0-13775-ceeafaf64f3-releases/2023/3
[ INFO ]
[ INFO ] Available devices:
[ INFO ] CPU
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : ""
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 22
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) Core(TM) Ultra 7 155H
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 FP16 INT8 BIN EXPORT_IMPORT
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: AFFINITY : NONE
[ INFO ]                Mutable: INFERENCE_NUM_THREADS : 0
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f32
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: ENABLE_CPU_PINNING : YES
[ INFO ]                Mutable: SCHEDULING_CORE_TYPE : ANY_CORE
[ INFO ]                Mutable: ENABLE_HYPER_THREADING : YES
[ INFO ]                Mutable: DEVICE_ID : ""
[ INFO ]                Mutable: CPU_DENORMALS_OPTIMIZATION : NO
[ INFO ]                Mutable: CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE : 1
[ INFO ]
[ INFO ] GNA.GNA_SW
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : GNA_SW GNA_HW
[ INFO ]                Immutable: OPTIMAL_NUMBER_OF_INFER_REQUESTS : 1
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : INT16 INT8 EXPORT_IMPORT FP32
[ INFO ]                Immutable: FULL_DEVICE_NAME : GNA_SW
[ INFO ]                Immutable: GNA_LIBRARY_FULL_VERSION : 3.5.0.2116
[ INFO ]                Mutable: GNA_DEVICE_MODE : GNA_SW_EXACT
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: LOG_LEVEL : LOG_NONE
[ INFO ]                Immutable: EXECUTION_DEVICES : GNA
[ INFO ]                Mutable: GNA_SCALE_FACTOR_PER_INPUT : ""
[ INFO ]                Mutable: GNA_FIRMWARE_MODEL_IMAGE : ""
[ INFO ]                Mutable: GNA_HW_EXECUTION_TARGET : UNDEFINED
[ INFO ]                Mutable: GNA_HW_COMPILE_TARGET : UNDEFINED
[ INFO ]                Mutable: GNA_PWL_DESIGN_ALGORITHM : UNDEFINED
[ INFO ]                Mutable: GNA_PWL_MAX_ERROR_PERCENT : 1.000000
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : undefined
[ INFO ]                Mutable: EXECUTION_MODE_HINT : ACCURACY
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 1
[ INFO ]
[ INFO ] GNA.GNA_HW
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : GNA_SW GNA_HW
[ INFO ]                Immutable: OPTIMAL_NUMBER_OF_INFER_REQUESTS : 1
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : INT16 INT8 EXPORT_IMPORT
[ INFO ]                Immutable: FULL_DEVICE_NAME : GNA_HW
[ INFO ]                Immutable: GNA_LIBRARY_FULL_VERSION : 3.5.0.2116
[ INFO ]                Mutable: GNA_DEVICE_MODE : GNA_SW_EXACT
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: LOG_LEVEL : LOG_NONE
[ INFO ]                Immutable: EXECUTION_DEVICES : GNA
[ INFO ]                Mutable: GNA_SCALE_FACTOR_PER_INPUT : ""
[ INFO ]                Mutable: GNA_FIRMWARE_MODEL_IMAGE : ""
[ INFO ]                Mutable: GNA_HW_EXECUTION_TARGET : UNDEFINED
[ INFO ]                Mutable: GNA_HW_COMPILE_TARGET : UNDEFINED
[ INFO ]                Mutable: GNA_PWL_DESIGN_ALGORITHM : UNDEFINED
[ INFO ]                Mutable: GNA_PWL_MAX_ERROR_PERCENT : 1.000000
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : undefined
[ INFO ]                Mutable: EXECUTION_MODE_HINT : ACCURACY
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 1
[ INFO ]
[ INFO ] GPU
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 0
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ]                Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ]                Immutable: MAX_BATCH_SIZE : 1
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : GPU: vendor=0x8086 arch=v785.192.4
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) Arc(TM) Graphics (iGPU)
[ INFO ]                Immutable: DEVICE_UUID : 8680557d080000000002000000000000
[ INFO ]                Immutable: DEVICE_LUID : 3b07010000000000
[ INFO ]                Immutable: DEVICE_TYPE : integrated
[ INFO ]                Immutable: DEVICE_GOPS : {f16:9216,f32:4608,i8:18432,u8:18432}
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN FP16 INT8 EXPORT_IMPORT
[ INFO ]                Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 6733217792
[ INFO ]                Immutable: GPU_UARCH_VERSION : 785.192.4
[ INFO ]                Immutable: GPU_EXECUTION_UNITS_COUNT : 128
[ INFO ]                Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ]                Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ]                Mutable: GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
[ INFO ]                Mutable: CACHE_DIR : ""
[ INFO ]                Mutable: CACHE_MODE : optimize_speed
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: COMPILATION_NUM_THREADS : 22
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f16
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: DEVICE_ID : 0
[ INFO ]

Best Regards.

 

Enlin Jiang.

Wan_Intel
Moderator
1,932 Views

Hello Enlin,

Thanks for the information.

 

Could you please follow the steps below and see if the issue can be resolved?

 

  1. Uninstall Intel® NPU Driver from Device Manager and uninstall OpenVINO™ Runtime on Windows.
  2. Download and install the latest Intel® NPU Driver for Windows.
  3. Download and install the latest OpenVINO™ Runtime on Windows from an Archive File.
  4. Update environment variables by running the setupvars.bat batch file before compiling and running OpenVINO™ applications.
  5. Run Hello Query Device C++ Sample to check available devices.
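Step 4 is easy to miss when launching from an IDE or a fresh shell. As a rough sketch, an application could check at startup whether the environment looks initialized. This assumes that `setupvars.bat` defines the `INTEL_OPENVINO_DIR` variable; the helper names `env_value_is_set` and `openvino_env_ready` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if `value` is non-NULL and non-empty, 0 otherwise. */
int env_value_is_set(const char* value) {
    return value != NULL && value[0] != '\0';
}

/* Report whether setupvars.bat appears to have been run in this shell.
   Assumption: the script exports INTEL_OPENVINO_DIR. */
int openvino_env_ready(void) {
    const char* dir = getenv("INTEL_OPENVINO_DIR");
    if (!env_value_is_set(dir)) {
        fprintf(stderr, "INTEL_OPENVINO_DIR is not set; run setupvars.bat first\n");
        return 0;
    }
    printf("OpenVINO install dir: %s\n", dir);
    return 1;
}
```

A passing check only shows that the variable exists; it does not prove that the runtime or plugin libraries on the path match the installed driver.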

 

 

Regards,

Wan

 

Enlin
New Contributor I
1,887 Views

Hi Wan,

Thanks for the reply.

I followed the steps, but still no NPU device was found.

The driver information is:

Enlin_0-1709705933274.png

Enlin_1-1709705995257.png

The OpenVINO version is:

 2023.3.0-13775-ceeafaf64f3-releases/2023/3

No API errors are reported in hello_query_device\main.cpp.

Please help.

Thanks

 

Enlin Jiang

 

 

Enlin
New Contributor I
1,875 Views

Hi Wan,

After I changed the configuration from "Debug" to "Release", the NPU device shows up:

---- OpenVINO INFO----
Description : OpenVINO Runtime
Build number: 2023.3.0-13775-ceeafaf64f3-releases/2023/3
CPU
GNA.GNA_SW
GNA.GNA_HW
GPU
NPU
[INFO] model name: torch-jit-export
[INFO] model name: main_graph

The "Debug" configuration links to "openvino_cd.lib". Is there an issue in this library?

Furthermore, when I run my inference model in a frame-capture loop, the NPU utilization is still 0%.

Enlin_0-1709714642545.png

I use "AUTO" as the device name.

How do I utilize the NPU properly?
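A possible explanation, offered as an assumption rather than a confirmed diagnosis: OpenVINO's documentation for Automatic Device Selection in recent releases notes that NPU is excluded from the default AUTO priority list, so it has to be named explicitly, either as "NPU" or in a priority list such as "AUTO:NPU,GPU,CPU". Whether that applies to 2023.3 has not been verified here. The sketch below only builds such a device string; `build_auto_device` is a hypothetical helper, not an OpenVINO function.

```c
#include <stdio.h>

/* Hypothetical helper: write "AUTO:<d0>,<d1>,..." into `out`.
   With count == 0 the result is plain "AUTO".
   Returns 0 on success, -1 if the buffer is too small. */
int build_auto_device(const char* const* priority, size_t count,
                      char* out, size_t out_size) {
    int n = snprintf(out, out_size, "AUTO");
    if (n < 0 || (size_t)n >= out_size) return -1;
    size_t used = (size_t)n;
    for (size_t i = 0; i < count; i++) {
        /* The first device follows ':', later ones follow ','. */
        n = snprintf(out + used, out_size - used, "%s%s",
                     i == 0 ? ":" : ",", priority[i]);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The resulting string would be passed wherever the application currently passes "AUTO" as the device name when compiling the model.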

 

thanks.

Enlin Jiang.

 

Ray_Lo_Intel
Employee
1,856 Views

https://medium.com/openvino-toolkit/how-to-run-and-develop-your-ai-app-on-intel-npu-intel-ai-boost-76f3efade169

 

I made a few quick observations and a summary on how to make it work in Python. It may shed some light here.

Enlin
New Contributor I
1,827 Views

Hi Ray,

Thanks for the reply.

I changed my device name from "AUTO" to "NPU" and did see the NPU utilization rise above 0%, but overall performance dropped from 60 FPS to 10 FPS, which is even worse than with the "CPU" device.

My test model is "face-detection-0204" from the Intel Model Zoo.
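To put the numbers above in per-frame terms: 60 FPS is roughly 16.7 ms per frame, while 10 FPS is 100 ms per frame, a 6x slowdown. A small sketch of that arithmetic, with hypothetical helper names:

```c
/* Convert frames per second to milliseconds per frame.
   Returns -1.0 for non-positive input. */
double fps_to_ms_per_frame(double fps) {
    if (fps <= 0.0) return -1.0;
    return 1000.0 / fps;
}

/* How many times slower `slow_fps` is compared with `fast_fps`.
   Returns -1.0 if either input is non-positive. */
double slowdown_factor(double fast_fps, double slow_fps) {
    if (fast_fps <= 0.0 || slow_fps <= 0.0) return -1.0;
    return fast_fps / slow_fps;
}
```

These figures cover the whole capture loop, so they include frame capture and pre/post-processing time, not only inference on the device.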

 

Regards.

 

Enlin Jiang. 

Lukas_
Beginner
848 Views

Hi Enlin and Wan,

 

---Here is my system information---

Windows Version: Win11, 24H2, 26100.1

OpenVINO Version: 2024.0.0

Python Version: 3.11.9

Visual Studio Version: 2022

Processor: Intel(R) Core(TM) Ultra 7 155H (Driver Version: 10.0.26100.1)

NPU: Intel(R) AI Boost (Driver Version: 32.0.100.2267)

iGPU: Intel(R) Arc(TM) Graphics (Driver Version: 31.0.101.5234)

dGPU: NVIDIA GeForce RTX 4070 Laptop GPU (Driver Version: 31.0.15.4626)

 

In C++, I encountered a similar issue to yours when working with Intel's OpenVINO.

 

In the hello_query_device project, when the solution configuration is set to "Debug", the NPU device fails to be enumerated. However, switching the solution configuration to Release resolves this issue, and the NPU device is correctly listed (see below).

I found that the devices enumerated by hello_query_device include CPU, GPU.0, GPU.1, GPU.2, and GPU.3. However, both GPU.0 and GPU.1 have a FULL_DEVICE_NAME of "Intel(R) Arc(TM) Graphics (iGPU)", while GPU.2 and GPU.3 are identified as "NVIDIA GeForce RTX 4070 Laptop GPU (dGPU)".

 

I don't understand why the same device is listed twice, albeit with slightly different details. Could someone explain which device, GPU.0 or GPU.1, I should choose for inference? Additionally, could someone explain why the NPU can only be used for inference when the solution configuration is set to Release? Thank you.
(You can reach out to me either by replying here or by emailing me directly at Lukas.Cheng@acer.com.)

By the way, in the classification_sample_async project, the NPU can only be used for inference when the solution configuration is set to Release.

Screenshot 2024-04-15 093945.png

 

Best wishes,

Lukas

 

Result of hello_query_device project:

 

[ INFO ] Build ................................. 2024.0.0-14509-34caeefd078-releases/2024/0
[ INFO ]
[ INFO ] Available devices:
[ INFO ] CPU
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : ""
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 22
[ INFO ]                Immutable: EXECUTION_DEVICES : CPU
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) Core(TM) Ultra 7 155H
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 FP16 INT8 BIN EXPORT_IMPORT
[ INFO ]                Immutable: DEVICE_TYPE : integrated
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : intel64
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: AFFINITY : HYBRID_AWARE
[ INFO ]                Mutable: INFERENCE_NUM_THREADS : 0
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f32
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: ENABLE_CPU_PINNING : YES
[ INFO ]                Mutable: SCHEDULING_CORE_TYPE : ANY_CORE
[ INFO ]                Mutable: ENABLE_HYPER_THREADING : YES
[ INFO ]                Mutable: DEVICE_ID : ""
[ INFO ]                Mutable: CPU_DENORMALS_OPTIMIZATION : NO
[ INFO ]                Mutable: LOG_LEVEL : LOG_NONE
[ INFO ]                Mutable: CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE : 1
[ INFO ]                Mutable: DYNAMIC_QUANTIZATION_GROUP_SIZE : 0
[ INFO ]                Mutable: KV_CACHE_PRECISION : f16
[ INFO ]
[ INFO ] GPU.0
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 0 1 2 3
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ]                Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ]                Immutable: MAX_BATCH_SIZE : 1
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : GPU: vendor=0x8086 arch=v785.192.4
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) Arc(TM) Graphics (iGPU)
[ INFO ]                Immutable: DEVICE_UUID : 8680557d080000000002000000000000
[ INFO ]                Immutable: DEVICE_LUID : 3920010000000000
[ INFO ]                Immutable: DEVICE_TYPE : integrated
[ INFO ]                Immutable: DEVICE_GOPS : {f16:9216,f32:4608,i8:18432,u8:18432}
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN FP16 INT8 EXPORT_IMPORT
[ INFO ]                Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 15514271744
[ INFO ]                Immutable: GPU_UARCH_VERSION : 785.192.4
[ INFO ]                Immutable: GPU_EXECUTION_UNITS_COUNT : 128
[ INFO ]                Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ]                Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ]                Mutable: GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
[ INFO ]                Mutable: CACHE_DIR : ""
[ INFO ]                Mutable: CACHE_MODE : optimize_speed
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: COMPILATION_NUM_THREADS : 22
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f16
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: DEVICE_ID : 0
[ INFO ]
[ INFO ] GPU.1
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 0 1 2 3
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ]                Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ]                Immutable: MAX_BATCH_SIZE : 1
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : GPU: vendor=0x8086 arch=Intel(R) Arc(TM) Graphics
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) Arc(TM) Graphics (iGPU)
[ INFO ]                Immutable: DEVICE_UUID : 00000000000000000000000000000000
[ INFO ]                Immutable: DEVICE_LUID : 0000000000000000
[ INFO ]                Immutable: DEVICE_TYPE : integrated
[ INFO ]                Immutable: DEVICE_GOPS : {f16:0.384,f32:0.192,i8:0.192,u8:0.192}
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN EXPORT_IMPORT
[ INFO ]                Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 17006522368
[ INFO ]                Immutable: GPU_UARCH_VERSION : unknown
[ INFO ]                Immutable: GPU_EXECUTION_UNITS_COUNT : 1
[ INFO ]                Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ]                Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ]                Mutable: GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
[ INFO ]                Mutable: CACHE_DIR : ""
[ INFO ]                Mutable: CACHE_MODE : optimize_speed
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: COMPILATION_NUM_THREADS : 22
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f16
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: DEVICE_ID : 1
[ INFO ]
[ INFO ] GPU.2
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 0 1 2 3
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ]                Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ]                Immutable: MAX_BATCH_SIZE : 1
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : GPU: vendor=0x10de arch=v8.9.0
[ INFO ]                Immutable: FULL_DEVICE_NAME : NVIDIA GeForce RTX 4070 Laptop GPU (dGPU)
[ INFO ]                Immutable: DEVICE_UUID : f931a879cc01e8c85b84029813f95fdc
[ INFO ]                Immutable: DEVICE_LUID : ba27010000000000
[ INFO ]                Immutable: DEVICE_TYPE : discrete
[ INFO ]                Immutable: DEVICE_GOPS : {f16:0,f32:0,i8:0,u8:0}
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN INT8 EXPORT_IMPORT
[ INFO ]                Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 8585216000
[ INFO ]                Immutable: GPU_UARCH_VERSION : 8.9.0
[ INFO ]                Immutable: GPU_EXECUTION_UNITS_COUNT : 36
[ INFO ]                Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ]                Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ]                Mutable: GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
[ INFO ]                Mutable: CACHE_DIR : ""
[ INFO ]                Mutable: CACHE_MODE : optimize_speed
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: COMPILATION_NUM_THREADS : 22
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f16
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: DEVICE_ID : 2
[ INFO ]
[ INFO ] GPU.3
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 0 1 2 3
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ]                Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ]                Immutable: MAX_BATCH_SIZE : 1
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : GPU: vendor=0x10de arch=NVIDIA GeForce RTX 4070 Laptop GPU
[ INFO ]                Immutable: FULL_DEVICE_NAME : NVIDIA GeForce RTX 4070 Laptop GPU (dGPU)
[ INFO ]                Immutable: DEVICE_UUID : 00000000000000000000000000000000
[ INFO ]                Immutable: DEVICE_LUID : 0000000000000000
[ INFO ]                Immutable: DEVICE_TYPE : discrete
[ INFO ]                Immutable: DEVICE_GOPS : {f16:0,f32:0,i8:0,u8:0}
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN INT8 EXPORT_IMPORT
[ INFO ]                Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 8349810688
[ INFO ]                Immutable: GPU_UARCH_VERSION : unknown
[ INFO ]                Immutable: GPU_EXECUTION_UNITS_COUNT : 1
[ INFO ]                Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ]                Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ]                Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ]                Mutable: GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
[ INFO ]                Mutable: CACHE_DIR : ""
[ INFO ]                Mutable: CACHE_MODE : optimize_speed
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: EXECUTION_MODE_HINT : PERFORMANCE
[ INFO ]                Mutable: COMPILATION_NUM_THREADS : 22
[ INFO ]                Mutable: NUM_STREAMS : 1
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]                Mutable: INFERENCE_PRECISION_HINT : f16
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: DEVICE_ID : 3
[ INFO ]
[ INFO ] NPU
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                Immutable: AVAILABLE_DEVICES : 3720
[ INFO ]                Immutable: CACHE_DIR : ""
[ INFO ]                Immutable: CACHING_PROPERTIES : DEVICE_ARCHITECTURE NPU_COMPILATION_MODE_PARAMS NPU_DMA_ENGINES NPU_DPU_GROUPS NPU_COMPILATION_MODE NPU_DRIVER_VERSION NPU_COMPILER_TYPE NPU_USE_ELF_COMPILER_BACKEND
[ INFO ]                Immutable: DEVICE_ARCHITECTURE : 3720
[ INFO ]                Mutable: DEVICE_ID : ""
[ INFO ]                Immutable: DEVICE_UUID : 80d1d11eb73811eab3de0242ac130004
[ INFO ]                Mutable: ENABLE_CPU_PINNING : NO
[ INFO ]                Mutable: EXCLUSIVE_ASYNC_REQUESTS : NO
[ INFO ]                Immutable: FULL_DEVICE_NAME : Intel(R) AI Boost
[ INFO ]                Immutable: INTERNAL_SUPPORTED_PROPERTIES : CACHING_PROPERTIES
[ INFO ]                Mutable: LOG_LEVEL : LOG_NONE
[ INFO ]                Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ]                Immutable: NPU_BACKEND_NAME : LEVEL0
[ INFO ]                Mutable: NPU_COMPILATION_MODE : ""
[ INFO ]                Mutable: NPU_COMPILATION_MODE_PARAMS : ""
[ INFO ]                Mutable: NPU_COMPILER_TYPE : DRIVER
[ INFO ]                Immutable: NPU_DEVICE_ALLOC_MEM_SIZE : 0
[ INFO ]                Immutable: NPU_DEVICE_TOTAL_MEM_SIZE : 33554432
[ INFO ]                Mutable: NPU_DMA_ENGINES : -1
[ INFO ]                Mutable: NPU_DPU_GROUPS : -1
[ INFO ]                Immutable: NPU_DRIVER_VERSION : 2267
[ INFO ]                Mutable: NPU_MAX_TILES : -1
[ INFO ]                Mutable: NPU_PLATFORM : AUTO_DETECT
[ INFO ]                Mutable: NPU_PRINT_PROFILING : NONE
[ INFO ]                Mutable: NPU_PROFILING_OUTPUT_FILE : ""
[ INFO ]                Mutable: NPU_PROFILING_TYPE : MODEL
[ INFO ]                Mutable: NPU_STEPPING : -1
[ INFO ]                Mutable: NPU_USE_ELF_COMPILER_BACKEND : AUTO
[ INFO ]                Immutable: NUM_STREAMS : 1
[ INFO ]                Immutable: OPTIMAL_NUMBER_OF_INFER_REQUESTS : 1
[ INFO ]                Immutable: OPTIMIZATION_CAPABILITIES : FP16 INT8 EXPORT_IMPORT
[ INFO ]                Mutable: PERFORMANCE_HINT : LATENCY
[ INFO ]                Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 1
[ INFO ]                Mutable: PERF_COUNT : NO
[ INFO ]                Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 10 1
[ INFO ]                Immutable: RANGE_FOR_STREAMS : 1 4
[ INFO ]
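On the GPU.0 versus GPU.1 question, the log above gives some hints. GPU.0 reports a real DEVICE_UUID, GPU_UARCH_VERSION 785.192.4, 128 execution units and FP16/INT8 capabilities, while GPU.1 reports an all-zero UUID, an unknown architecture version, 1 execution unit and no FP16/INT8. One reading, which is an assumption and not confirmed anywhere in this thread, is that GPU.1 is the same hardware exposed through a second, less capable driver path, which would make GPU.0 the better choice. The sketch below encodes that heuristic; `gpu_entry` and `pick_fully_reported` are hypothetical names, and the values would have to be filled in from the queried properties.

```c
#include <stddef.h>

/* Subset of the properties printed by hello_query_device. */
typedef struct {
    const char* name;        /* e.g. "GPU.0" */
    const char* device_uuid; /* DEVICE_UUID as printed */
    int execution_units;     /* GPU_EXECUTION_UNITS_COUNT */
} gpu_entry;

/* Return 1 if the string is non-empty and consists only of '0' characters. */
static int is_all_zero(const char* s) {
    if (s[0] == '\0') return 0;
    for (; *s; s++) {
        if (*s != '0') return 0;
    }
    return 1;
}

/* Heuristic only: return the first entry with a non-zero UUID and more
   than one execution unit, or NULL if no entry qualifies. */
const gpu_entry* pick_fully_reported(const gpu_entry* entries, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (!is_all_zero(entries[i].device_uuid) &&
            entries[i].execution_units > 1) {
            return &entries[i];
        }
    }
    return NULL;
}
```

Applied to the two Intel entries in the log, this heuristic selects GPU.0. It is not an official selection rule, and the property values may look different with other driver versions.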

 

 

Wan_Intel
Moderator
1,732 Views

Hi Enlin,

Thanks for the information.

 

We'll investigate the issue further and update you as soon as possible. In the meantime, could you please use the following command to run inference on the FP16 face-detection-0204 model with the Benchmark C++ Tool and share the output with us?

 

To use the Benchmark C++ Tool, please follow the Build the Sample Applications instructions and set up paths and environment variables by following the Get Ready for Running the Sample Applications instructions.

 

 

Regards,

Wan

 

 

Enlin
New Contributor I
1,600 Views

Hi Wan,

Running inference on face-detection-0204 FP16 with the C++ version and the device set to "AUTO" works fine, but there is no NPU utilization.

With the device set to "NPU", I get an error:

[ INFO ] Build ................................. 2024.0.0-14509-34caeefd078-releases/2024/0
[ INFO ]
[ INFO ] Available devices:
[ INFO ] CPU
[ INFO ] GPU
[ INFO ] NPU
Exception from src\inference\src\cpp\infer_request.cpp:223:
Exception from C:\Jenkins\workspace\private-ci\ie\build-windows-vs2019@2\b\repos\npu\src\zero_backend\include\zero_utils.h:25:
L0 zeCommandQueueExecuteCommandLists result: ZE_RESULT_ERROR_DEVICE_LOST, code 0x70000001 - device hung, reset, was removed, or driver update occurred
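Until the ZE_RESULT_ERROR_DEVICE_LOST failure is understood, an application could at least keep running by trying devices in order and using the first one that works. This is a workaround sketch that does not address the underlying error. It assumes that a failed inference is reported as a non-zero status, as the OpenVINO C API does with its `ov_status_e` return values. `run_fn`, `first_working_device` and `fake_run_npu_lost` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 0 if inference on `device` succeeded,
   non-zero otherwise. A real version would compile the model for the
   device and run one inference. */
typedef int (*run_fn)(const char* device);

/* Try each device in order and return the first one on which `run`
   succeeds, or NULL if all of them fail. */
const char* first_working_device(const char* const* devices, size_t count,
                                 run_fn run) {
    for (size_t i = 0; i < count; i++) {
        if (run(devices[i]) == 0) return devices[i];
    }
    return NULL;
}

/* Stand-in for real inference, used only to exercise the sketch:
   pretends that "NPU" fails and every other device succeeds. */
int fake_run_npu_lost(const char* device) {
    return strcmp(device, "NPU") == 0 ? 1 : 0;
}
```

With the order "NPU", "GPU", "CPU" and the stand-in callback, the helper skips NPU and returns "GPU".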

 

Thanks.

 

Enlin

Wan_Intel
Moderator
1,389 Views

Hi Enlin,

Thanks for your patience. We received feedback from the relevant team.

 

NPU performance, and its ratio to CPU performance, is not fixed and depends heavily on the model itself. There are theoretical limits on the performance achievable on NPU hardware, and this estimate is currently made only for NPU POR (plan of record) models. face-detection-0204 is not a POR model, so we have no such estimate. You can expect better performance in upcoming releases, as our OpenVINO™ developers are working on optimization.

 

 

Regards,

Wan

 

Wan_Intel
Moderator
1,145 Views

Hello Enlin,

Thanks for your question.

 

If you need any additional information from Intel, please submit a new question as this thread will no longer be monitored.

 

 

Regards,

Wan

 
