Authors: Sneha Chattopadhyay (Intel) and Markus Zarbock (Starwit Technologies)
About Starwit Technologies
Starwit Technologies was founded to build software products that help optimize and modernize traffic in cities. Founded by three ex-Volkswagen engineers, our hypothesis is that more, and more accurate, real-time data will help cities solve many of their traffic problems. It will also help extract more mobility from the same number of vehicles. Our main sensors for collecting data are optical cameras, as they are the most flexible and universal way to observe reality. However, image processing is still a computationally hard problem, especially when tight budget and competence constraints must be observed.
The Challenge
Almost all of today’s AI-powered algorithms for detecting and tracking objects are optimized to run fast on GPUs. In the smart city domain, however, servers and industrial PCs with GPUs create challenges from a number of perspectives, be it tight city budgets, energy consumption, or heat dissipation in embedded scenarios.
Running the necessary image processing fast enough on CPU-only machines will give our software products a great competitive advantage.
Building the Solution
Our core product, the Starwit Awareness Engine, runs on the latest GPU hardware. To prove that a CPU-only approach is realistic, a proving ground was necessary. So we did a series of measurement runs on the Intel® Developer Cloud using Intel’s fastest Xeon microprocessor. With the help of Intel engineers and the Intel® Extension for PyTorch* (IPEX) powered by oneAPI, object detection (using YOLOv8) was measured with various optimizations. See the deep dive below for detailed measurement results.
From a business perspective, the results were very encouraging, and Intel hardware and Intel AI tools will power the first field deployments in our two pilot cities, Carmel, Indiana (USA) and Wolfsburg (Germany).
Intel Developer Cloud is a great environment to test on the latest Intel hardware combined with Intel-optimized AI tools and to work with awesome engineers.
– Florian Stanek, Software Developer at Starwit Technologies
Test runs to find the optimal batch size
One of the many levers one can pull to achieve optimal performance is the number of images that are inferenced, i.e. run through object detection, in one go. The following diagram shows the results of seven separate benchmark runs.
Figure 1: Batch size measurement results
A benchmark run consists of detecting objects (using YOLOv8, model size nano) on 9,000 individual images; the benchmark runs differ in batch size only. The batch size, i.e. how many images are used as input per inference call, is shown next to each result. The results show that, with all other parameters fixed, a batch size of 8 is optimal, yielding roughly 120 fps on 12 cores (= threads) of an Intel® Xeon® Platinum 8480+ CPU using AMX through IPEX.
As an inferencing speed of 15 frames per second emerged from product trials as sufficient for traffic analysis, 12 cores can thus handle inferencing for eight cameras fast enough. These results, as well as other benchmark runs, are very encouraging for serving AI-powered computer vision with Intel® Xeon® CPUs.
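The camera-capacity claim follows directly from the two measured numbers. As a back-of-the-envelope check (all values are taken from the benchmark figures above, nothing else is assumed):

```python
# Back-of-the-envelope check using the measured numbers from the benchmark.
throughput_fps = 120            # measured: 12 cores, batch size 8, YOLOv8 nano
required_fps_per_camera = 15    # found sufficient for traffic analysis in trials

cameras_served = throughput_fps // required_fps_per_camera
print(cameras_served)  # → 8
```

So 12 cores of the tested Xeon CPU can serve eight camera streams at the required rate.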
As this is just one example from a larger number of benchmarks, you can find the code and results in this GitHub repository: https://github.com/starwit/model-benchmarks. A detailed list of the libraries used can be found here: https://github.com/starwit/model-benchmarks/blob/main/requirements.txt
Figure 2: Excerpt Performance Benchmark Script
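The actual script in the linked repository drives a YOLOv8 model through IPEX; the sketch below is only a simplified, hypothetical reconstruction of the measurement loop, with a placeholder standing in for the real model call. Function names and timings here are illustrative and not taken from the repository.

```python
import time

def benchmark(infer, num_images=9000, batch_size=8):
    """Run `infer` over num_images images in batches and return throughput (fps)."""
    n_batches = num_images // batch_size
    start = time.perf_counter()
    for _ in range(n_batches):
        infer(batch_size)  # one inference call on a batch of images
    elapsed = time.perf_counter() - start
    return (n_batches * batch_size) / elapsed

# Placeholder for the real detector; in the actual script this would be a
# YOLOv8 model prepared with intel_extension_for_pytorch (ipex.optimize).
def dummy_infer(batch_size):
    time.sleep(0.0005 * batch_size)  # simulated per-image cost

for bs in (1, 2, 4, 8, 16, 32, 64):
    fps = benchmark(dummy_infer, num_images=448, batch_size=bs)
    print(f"batch_size={bs:>2}: {fps:7.1f} fps")
```

Sweeping the batch size with everything else fixed, as in Figure 1, isolates the effect of batching on throughput; in the real runs the dummy call is replaced by batched YOLOv8 inference on the Xeon CPU.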
Conclusion
Real-time data plays a pivotal role in solving traffic issues and enhancing vehicle mobility. While optical cameras serve as the primary data collection tool due to their flexibility, the computational intensity of image processing remains a significant hurdle, particularly within budget constraints.
Addressing the prevalent reliance on GPU-optimized AI algorithms for object detection, Starwit Technologies is well on its way towards revolutionizing smart city applications by proving the feasibility of a pure CPU approach. Collaborating with Intel and leveraging their latest Xeon microprocessor and Intel software tools powered by oneAPI, the company conducted meticulous measurement runs on the Intel® Developer Cloud. With the assistance of Intel engineers and utilizing the Intel® Extension for PyTorch*, Starwit Technologies measured object detection using YOLOv8 with various optimizations.
The business outlook is highly promising, as evidenced by encouraging results. The decision to utilize Intel's Xeon CPUs was reinforced by benchmark runs showcasing optimal performance, with a batch size of 8 emerging as the most efficient, achieving approximately 120 frames per second on 12 cores of an Intel® Xeon® Platinum 8480+ CPU.
The practical implication of these results is significant, as the inferencing speed of 15 frames per second proves sufficient for traffic analysis. This means that 12 cores can efficiently handle inferencing for eight cameras simultaneously. Starwit Technologies’ journey underscores the transformative potential of Intel's latest Xeon CPUs in making AI-based computer vision accessible to a broader audience, paving the way for advancements in smart city solutions.
This project was implemented within the scope of Intel® Liftoff for Startups, a free, virtual, startup accelerator program designed to help early-stage tech startups on their path to innovation and growth. The primary mission is to inspire and empower startups with Intel’s leadership in technology.
Notes and Disclaimers
Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex.
Your costs and results may vary.
Intel technologies may require enabled hardware, software, or service activation.
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.
*Other names and brands may be claimed as the property of others.