We have been testing the HeadDetector example on https://devcloud.intel.com/edge/advanced/licensed_applications/ and are happy with the results, so before making the final purchase we went ahead and did a quick integration into our SDK, following the example code.
Yet when we run an inference in our own application, it takes around 750 ms, while the Intel example takes only 60 ms. We also noticed that the Intel example uses all 6 cores (on a machine with 6 physical and 12 logical cores), but our integration uses only 1 core on the same machine, and when we launch around 100 inferences the work even jumps between cores. We have tried:
vas::hd::HeadDetector::Builder hd_builder;
hd_builder.ie_config["CPU_BIND_THREAD"] = "YES";
hd_builder.ie_config["CPU_THREADS_NUM"] = "6";
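For context, our full initialization is roughly the sketch below (the header name is an assumption on our side, and "CPU_BIND_THREAD" / "CPU_THREADS_NUM" are the legacy Inference Engine CPU plugin config keys; error handling omitted):

#include <memory>
#include <string>
#include "vas/hd.h"  // assumed header name for the licensed HeadDetector API

std::unique_ptr<vas::hd::HeadDetector> BuildDetector(const std::string& model) {
    vas::hd::HeadDetector::Builder hd_builder;
    hd_builder.ie_config["CPU_BIND_THREAD"] = "YES";  // pin inference threads to cores
    hd_builder.ie_config["CPU_THREADS_NUM"] = "6";    // one thread per physical core
    return hd_builder.Build(model.c_str());
}

Even with these keys set, our integration still runs on a single core.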
Hello, an update on this problem.
When we moved the model initialization (basically the vas::hd::HeadDetector::Builder::Build() call) into the same method that runs the inference on the vas::hd::HeadDetector it returns, inference time dropped from around 750 ms to 6 ms.
So we have come to the conclusion that the unique_ptr returned by Build() (which we keep as a class member) is actually being reset somewhere between our initialization method and the inference method, and that the excess time is caused by the model being reloaded on every inference (we measured the model load times, and the numbers match).
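This is roughly the kind of timing we did to separate the two costs (a minimal sketch; it is templated only so it compiles standalone, and the assumption that Detect() takes a single frame argument is ours):

#include <chrono>
#include <iostream>
#include <string>

// Splits the measurement: Build() = model load, Detect() = pure inference.
template <typename Builder, typename Frame>
void TimeLoadVsInfer(Builder& hd_builder, const std::string& model, Frame& frame) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::milliseconds;

    const auto t0 = clock::now();
    auto hd = hd_builder.Build(model.c_str());  // model load: ~750 ms for us
    const auto t1 = clock::now();
    hd->Detect(frame);                          // inference alone: ~6 ms
    const auto t2 = clock::now();

    std::cout << "load:  " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n"
              << "infer: " << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n";
}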
I tried the code below, yet it is still not working (the m_ prefix denotes class members):
// Inside our Init()
m_hd = std::move(m_hd_builder.Build(model.c_str()));

// Then inside our Inference()
m_hd->Detect(...)
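To make the setup concrete, here is a trimmed-down sketch of our wrapper class (the header name and the Detect() signature are assumptions, names simplified, error handling omitted):

#include <memory>
#include <string>
#include "vas/hd.h"  // assumed header name for the HeadDetector API

class HeadDetectorWrapper {
public:
    void Init(const std::string& model) {
        // Build() already returns a std::unique_ptr, so the std::move above
        // is redundant; plain assignment transfers ownership the same way.
        m_hd = m_hd_builder.Build(model.c_str());
    }

    template <typename Frame>
    void Inference(Frame& frame) {
        // In our runs, m_hd appears to be empty again by this point,
        // which matches the per-inference model reload we measured.
        m_hd->Detect(frame);
    }

private:
    vas::hd::HeadDetector::Builder m_hd_builder;
    std::unique_ptr<vas::hd::HeadDetector> m_hd;
};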
I even tried getting at the internal vas::hd::HeadDetector pointer via get() and release(), but still no luck.
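For reference, this is my understanding of what get() and release() do to the owning member (plain std::unique_ptr semantics, nothing VAS-specific; Model is a stand-in type):

#include <memory>

struct Model {};  // stand-in for vas::hd::HeadDetector

int main() {
    auto owner = std::make_unique<Model>();

    Model* observed = owner.get();      // non-owning view; owner still holds the object
    Model* detached = owner.release();  // owner is now empty (nullptr)!

    // After release(), any later `if (owner)` check fails, and the raw
    // pointer must be deleted manually to avoid a leak.
    delete detached;
    (void)observed;
}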
Any tips or ideas, please? Why would the model be disappearing between these two calls?