Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

OpenVINO with 2 CPUs

Summer
Beginner

I have implemented inference with OpenVINO on my single-CPU laptop as usual, and all cores are utilized while the inference is running.
I then tried inference on a computer with two CPUs. Only half of the cores (20 of the 40 cores across the 2 CPUs) are utilized when the inference is done with OpenVINO, while all 40 cores are utilized when the inference is done with TensorFlow.
To achieve high CPU utilization, I searched for a way to run OpenVINO inference on 2 CPUs and found here that multiple devices or multiple instances are supported. I tried to follow the Myriad example below (copied from here):

Beyond the trivial “CPU”, “GPU”, “HDDL” and so on, when multiple instances of a device are available, the names are more qualified. For example, this is how two Intel® Movidius™ Myriad™ X sticks are listed with the hello_query_sample:

...
    Device: MYRIAD.1.2-ma2480
...
    Device: MYRIAD.1.4-ma2480

So the explicit configuration to use both would be “MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480”. Accordingly, the code that loops over all available devices of “MYRIAD” type only is below:

InferenceEngine::Core ie;
auto cnnNetwork = ie.ReadNetwork("sample.xml");
// Build a MULTI device string covering every available MYRIAD device
std::string allDevices = "MULTI:";
std::vector<std::string> myriadDevices = ie.GetMetric("MYRIAD", METRIC_KEY(AVAILABLE_DEVICES));
for (size_t i = 0; i < myriadDevices.size(); ++i) {
    allDevices += std::string("MYRIAD.")
                + myriadDevices[i]
                + std::string(i < (myriadDevices.size() - 1) ? "," : "");
}
InferenceEngine::ExecutableNetwork exeNetwork = ie.LoadNetwork(cnnNetwork, allDevices, {});
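For reference, the device-string construction in that loop can be sketched in Python as well. The device IDs below are hypothetical examples of what AVAILABLE_DEVICES might return, not output from a real machine:

```python
# Hypothetical device IDs, as AVAILABLE_DEVICES might list two MYRIAD sticks
myriad_ids = ["1.2-ma2480", "1.4-ma2480"]

# Prefix each ID with the device type and join them under the MULTI plugin
all_devices = "MULTI:" + ",".join("MYRIAD." + dev for dev in myriad_ids)
print(all_devices)  # MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480
```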

Unfortunately, the second CPU cannot be found with GetMetric, or even with the GetAvailableDevices API.

I have also tried hello_query_device.exe and the wmic CPU listing command to dump the CPU info.

Here is the result of hello_query_device.exe

Loading Inference Engine
Available devices:
CPU
        SUPPORTED_METRICS:
                AVAILABLE_DEVICES : [  ]
                FULL_DEVICE_NAME : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
                OPTIMIZATION_CAPABILITIES : [ FP32 FP16 INT8 BIN ]
                RANGE_FOR_ASYNC_INFER_REQUESTS : { 1, 1, 1 }
                RANGE_FOR_STREAMS : { 1, 32 }
        SUPPORTED_CONFIG_KEYS (default values):
                CPU_BIND_THREAD : NUMA
                CPU_THREADS_NUM : 0
                CPU_THROUGHPUT_STREAMS : 1
                DUMP_EXEC_GRAPH_AS_DOT : ""
                DYN_BATCH_ENABLED : NO
                DYN_BATCH_LIMIT : 0
                ENFORCE_BF16 : NO
                EXCLUSIVE_ASYNC_REQUESTS : NO
                PERF_COUNT : NO
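Note that the dump above shows CPU_THROUGHPUT_STREAMS defaulting to 1, i.e. a single execution stream, which will not span both sockets. A minimal sketch of asking the CPU plugin for one stream per NUMA node is below (this assumes the legacy 2021.x openvino.inference_engine Python API; the import is guarded in case OpenVINO is not installed):

```python
# One stream per NUMA node lets the CPU plugin spread work across both sockets.
config = {"CPU_THROUGHPUT_STREAMS": "CPU_THROUGHPUT_NUMA"}

try:
    from openvino.inference_engine import IECore  # legacy (2021.x) Python API
    ie = IECore()
    ie.set_config(config, "CPU")  # apply before load_network
except ImportError:
    pass  # OpenVINO not installed; the config dict above is the point
```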

 

Here is the result of wmic:root\cli> cpu list full:

AddressWidth=64
Architecture=9
Availability=3
Caption=Intel64 Family 6 Model 79 Stepping 1
ConfigManagerErrorCode=
ConfigManagerUserConfig=
CpuStatus=1
CreationClassName=Win32_Processor
CurrentClockSpeed=2101
CurrentVoltage=7
DataWidth=64
Description=Intel64 Family 6 Model 79 Stepping 1
DeviceID=CPU0
ErrorCleared=
ErrorDescription=
ExtClock=100
Family=179
InstallDate=
L2CacheSize=2048
L2CacheSpeed=
LastErrorCode=
Level=6
LoadPercentage=20
Manufacturer=GenuineIntel
MaxClockSpeed=2101
Name=Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
OtherFamilyDescription=
PNPDeviceID=
PowerManagementCapabilities=
PowerManagementSupported=FALSE
ProcessorId=BFEBFBFF000406F1
ProcessorType=3
Revision=20225
Role=CPU
SocketDesignation=CPU0
Status=OK
StatusInfo=3
Stepping=
SystemCreationClassName=Win32_ComputerSystem
SystemName=xxx
UniqueId=
UpgradeMethod=43
Version=
VoltageCaps=


AddressWidth=64
Architecture=9
Availability=3
Caption=Intel64 Family 6 Model 79 Stepping 1
ConfigManagerErrorCode=
ConfigManagerUserConfig=
CpuStatus=1
CreationClassName=Win32_Processor
CurrentClockSpeed=2101
CurrentVoltage=7
DataWidth=64
Description=Intel64 Family 6 Model 79 Stepping 1
DeviceID=CPU1
ErrorCleared=
ErrorDescription=
ExtClock=100
Family=179
InstallDate=
L2CacheSize=2048
L2CacheSpeed=
LastErrorCode=
Level=6
LoadPercentage=4
Manufacturer=GenuineIntel
MaxClockSpeed=2101
Name=Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
OtherFamilyDescription=
PNPDeviceID=
PowerManagementCapabilities=
PowerManagementSupported=FALSE
ProcessorId=BFEBFBFF000406F1
ProcessorType=3
Revision=20225
Role=CPU
SocketDesignation=CPU1
Status=OK
StatusInfo=3
Stepping=
SystemCreationClassName=Win32_ComputerSystem
SystemName=xxx
UniqueId=
UpgradeMethod=43
Version=
VoltageCaps=

My question is: how can I run inference with OpenVINO on 2 CPUs? Any help would be appreciated.

Iffa_Intel
Moderator

Greetings,


The optimal way to use the Multi-Device plugin with multiple devices is to configure each device individually and then create the Multi-Device plugin on top of them.


For example:

    # Configure each device individually
    # ("CPU1 NAME" / "CPU2 NAME" are placeholders for the actual device names)
    CPU1_config = {}
    CPU2_config = {}

    ie.set_config(config=CPU1_config, device_name="CPU1 NAME")
    ie.set_config(config=CPU2_config, device_name="CPU2 NAME")

    # Load the network to the Multi-Device plugin, specifying the priorities
    exec_net = ie.load_network(
        network=net,
        device_name="MULTI",
        config={"MULTI_DEVICE_PRIORITIES": "CPU1 NAME,CPU2 NAME"}
    )

    # Query the optimal number of requests
    nireq = exec_net.get_metric("OPTIMAL_NUMBER_OF_INFER_REQUESTS")



It is recommended to use -nireq 10, which should give the maximum FPS. You may refer to object_detection_demo.py to see how this -nireq argument is used.



The proper command to run inference should be:

python object_detection_demo.py -i 0 -d "MULTI:CPU1 NAME,CPU2 NAME" -m yolov4.xml -at yolo -nireq 10


This GitHub discussion might also help.


Note: the -at parameter depends on your model's architecture; use yolo if you are using a YOLO model.




Sincerely,

Iffa


Iffa_Intel
Moderator

Greetings,


Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question.


Sincerely,

Iffa


Iffa_Intel
Moderator

Hi @Summer ,

 

We received your feedback that you were unable to post your solution in the Community.

We have reopened this case, and appreciate it if you could share your solution for the benefit of our Community.

 

If you are still unable to post in this thread, do reach out to us by posting a new question.

 

 

Sincerely,

Iffa

Summer
Beginner

The solution is to use asynchronous inference. Both CPUs are occupied when the inference is run asynchronously.
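To illustrate why asynchronous requests keep more hardware busy, here is a pure-Python sketch (plain threads and a fake_infer stand-in, not the OpenVINO API): with synchronous inference the caller idles while each request completes, while several in-flight requests let the waits overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_infer(frame):
    """Stand-in for one infer request; the sleep simulates device latency."""
    time.sleep(0.05)
    return frame * 2

frames = list(range(8))

# Synchronous: submit one request, wait for it, then submit the next
start = time.time()
sync_results = [fake_infer(f) for f in frames]
sync_elapsed = time.time() - start

# Asynchronous: keep four requests in flight so the waits overlap
start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    async_results = list(pool.map(fake_infer, frames))
async_elapsed = time.time() - start

assert sync_results == async_results
print(async_elapsed < sync_elapsed)  # True: overlapped requests finish sooner
```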
