Authors: Gor Hakobyan, Levon Budagyan, Rahul Unnikrishnan Nair, Desmond Grealy
Intel® Liftoff member Waveye developed a centralized safety solution for production environments based on high-resolution imaging radars. The project focused on training perception models to classify different object classes from high-density radar point clouds.
This article presents the results of performance acceleration tests conducted on Waveye's radar processing and learning pipeline using Intel® oneAPI libraries. The tests demonstrated performance improvements of 20-50x over the baseline CPU implementation, enabling faster AI lifecycle iteration for core radar perception algorithms.
Background on the project
Waveye is developing a radar-based centralized safety system for worker safety in automated factories. The project's goal is privacy-preserving detection and tracking of humans, automated guided vehicles (AGVs), and forklifts operating in the same environment.
The system covers large production areas, is robust to lighting conditions, and does not require objects to carry identification tags.
The aim was to develop a perception stack that achieves >99% human detection performance in under a second, along with similarly robust detection of robots and forklifts.
“For our collaboration with Intel, we wanted to validate how efficiently we can migrate our edge processing pipeline to x86 architecture using hardware abstraction tools such as Intel® oneAPI Toolkit on Intel® Tiber™ AI Cloud.” - Dr. Gor Hakobyan, CTO, Waveye.
The radar perception model detects and classifies objects and tracks them over time. In case of collision risk, it can send a signal to robots to slow down or choose a collision-free trajectory.
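To make the collision-risk logic concrete, here is a minimal sketch of a distance-based risk check over tracked objects. This is an illustrative assumption, not Waveye's actual implementation: the constant-velocity prediction, the `horizon_s` and `safety_radius_m` parameters, and the function name are all hypothetical.

```python
import numpy as np

def collision_risk(track_positions, track_velocities, robot_position,
                   horizon_s=1.0, safety_radius_m=2.0):
    """Return True if any tracked object is predicted to enter the
    safety radius around the robot within the time horizon.

    Hypothetical sketch: assumes 2D positions/velocities in meters and
    a simple constant-velocity motion model.
    """
    # Constant-velocity extrapolation of each track over the horizon.
    predicted = track_positions + track_velocities * horizon_s
    # Euclidean distance of each predicted position to the robot.
    distances = np.linalg.norm(predicted - robot_position, axis=1)
    return bool(np.any(distances < safety_radius_m))
```

A track 5 m away and closing at 4 m/s would trigger the signal within a 1 s horizon, while the same track moving away would not.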
The system consists of an end-to-end radar processing and learning pipeline originally implemented using:
- CUDA acceleration for edge processing
- CUDA or CPU processing in cloud environments
The primary goal of this testing was to evaluate the effectiveness of Intel® oneAPI libraries for accelerating their pipeline in Intel cloud environments, specifically focusing on the Math Kernel Library (MKL) acceleration capabilities.
Some steps in their processing pipeline could be directly mapped to oneMKL routines. However, the rest of the pipeline relies on custom CUDA kernels.
To generalize these in an easily embeddable way, the Waveye team chose Halide: its philosophy maps well to radar cube processing operations, and it embeds easily into existing C++ code. Halide, combined with MKL, provided a major speed-up over the naive CPU implementation without much manual optimization.
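To illustrate how radar cube processing decomposes into the linear algebra primitives that oneMKL accelerates, here is a toy NumPy sketch (not Waveye's kernels): range and Doppler processing are batched FFTs (oneMKL DFT), and digital beamforming is a complex matrix multiply (GEMM). The cube dimensions and the random steering matrix are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy radar cube: (channels, chirps, samples) of complex baseband data.
n_ch, n_chirps, n_samples = 8, 64, 256
cube = (rng.standard_normal((n_ch, n_chirps, n_samples))
        + 1j * rng.standard_normal((n_ch, n_chirps, n_samples)))

# Range FFT along fast time, Doppler FFT along slow time --
# these map directly onto batched oneMKL DFT (FFT) routines.
range_doppler = np.fft.fft(cube, axis=2)
range_doppler = np.fft.fft(range_doppler, axis=1)

# Digital beamforming: project the channel dimension onto n_beams
# steering vectors -- a complex matrix multiply (GEMM in oneMKL/BLAS).
n_beams = 16
steering = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_beams, n_ch)))
beams = np.einsum('bc,cds->bds', steering, range_doppler)

print(beams.shape)  # (16, 64, 256)
```

Steps that do not reduce to FFT or GEMM, such as detection and thresholding logic, are the kind of custom kernels the team expressed in Halide instead.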
"Working with oneMKL and the Intel® oneAPI suite on Intel® Tiber™ AI Cloud was a breeze. There is a certain beauty in breaking down our radar processing pipeline into linear algebra operations—and letting decades of numerical method advancements extract maximum performance from the hardware." - Levon Budagyan, CEO, Waveye.
Testing Methodology
They benchmarked key computational components of their radar processing pipeline, comparing:
- Raw CPU implementation (baseline)
- MKL-accelerated implementation
Each component was executed multiple times with representative workloads to ensure consistent measurement. Timing was captured with microsecond precision.
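The methodology described above (repeated runs of each component with microsecond-precision timing) can be sketched with a minimal harness like the following. The function name, warmup count, and run count are illustrative assumptions, not details from Waveye's test setup.

```python
import time
import statistics

def benchmark(fn, *args, warmup=3, runs=20):
    """Run fn repeatedly and report the median wall time in microseconds."""
    for _ in range(warmup):
        fn(*args)  # warm caches and any lazy initialization
    timings_us = []
    for _ in range(runs):
        t0 = time.perf_counter()  # high-resolution monotonic clock
        fn(*args)
        timings_us.append((time.perf_counter() - t0) * 1e6)
    # Median is robust to outliers from OS scheduling jitter.
    return statistics.median(timings_us)

median_us = benchmark(sum, range(10_000))
```

Reporting the median of many runs, after warmup, is what makes CPU-vs-MKL comparisons like the table below meaningful.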
Results
Overall Performance
The Intel® oneAPI libraries, particularly MKL, provided acceleration factors of 20x to 50x over the baseline CPU implementation across most computational kernels, with GEMM reaching 111x.
Detailed Performance Improvements
| Operation | CPU Runtime (ms) | MKL Runtime (ms) | Acceleration Factor |
| --- | --- | --- | --- |
| FFT | 1,700 | 50 | 34x |
| GEMM | 50,000 | 450 | 111x |
| Specialized kernel 1 | 500 | 12 | 42x |
| Specialized kernel 2 | 250 | 5 | 50x |
Impact on Workflow
This performance improvement provides several key benefits:
- Edge Simulation: Enables effective simulation of edge processing in the cloud environment
- Development Acceleration: Significantly faster AI lifecycle iteration for radar perception algorithms
- Resource Optimization: Reduced computational resource requirements and associated costs
- Iteration Speed: Faster experimentation cycles for algorithm development and testing
Conclusion: Accelerating Radar Processing with Intel® oneAPI
The integration of Intel® oneAPI libraries—most notably Intel® Math Kernel Library (MKL)—has delivered significant performance enhancements to Waveye’s radar processing pipeline. Achieving a 20-50x speedup over the baseline CPU implementation, this optimization empowers more efficient development and testing cycles while minimizing computational resource demands.
These outcomes affirm Intel® oneAPI as a robust and scalable acceleration platform, particularly well-suited for compute-intensive radar processing workloads deployed in Intel cloud environments.
Next Steps: Advancing Optimization and Performance
Building on these promising results, Waveye plans to pursue the following initiatives:
- Further Optimization of Specialized Kernels: Continue refining specific computational kernels to maximize performance gains.
- Extended Testing Across Varied Operational Scenarios: Validate performance improvements under diverse real-world conditions and workloads.
- Evaluation of Additional Intel® oneAPI Toolkit Components: Explore other libraries and tools within the oneAPI ecosystem to identify further acceleration opportunities.
- Exploration of Hybrid Acceleration Strategies: Investigate combining Intel® MKL with other specialized libraries to create hybrid solutions that optimize performance across different processing stages.
These next steps aim to deepen integration with Intel® technologies and unlock additional efficiency for Waveye’s radar processing applications.
About Waveye
Waveye is a technology startup based in Palo Alto and Stuttgart providing spatio-dynamics perception for robotics based on high-resolution imaging radar.
Levon is an exited founder who built a machine learning and software company into a successful global enterprise with over $115M in revenue. Gor is a radar expert with over 100 patent filings on sensing technologies.
Launch on your own clock
Intel® Liftoff is a free, fully online accelerator for early-stage AI startups everywhere. No cohorts, no equity, no barriers, just progress. Apply today.
Related resources
Intel® Tiber™ AI Cloud - Cloud platform for AI development and deployment
Intel® oneAPI Base Toolkit - Comprehensive development tools for high-performance applications