Unlock the Potential of Parallel Programming with SYCL*: Developer Spotlight

Nikita_Shiledarbaxi · 11-14-2023

Learn the A-to-Z of the SYCL Framework for Data Parallelism

SYCL* is an open, royalty-free, standards-based programming model from the Khronos Group for writing data-parallel C++ code. Its multi-vendor and multi-architecture support makes it easier to bring data parallelism to applications across heterogeneous hardware. In his ‘30 Days of SYCL Programming’ article series, Joel John Joseph covers SYCL concepts and their practical use cases to help you build your parallel programming skills. The tutorial series takes you through SYCL basics and advanced topics with code implementations, its advantages over other parallel programming models, and comparisons with CUDA*.

Major SYCL Concepts

The article series covers the following crucial topics about parallel programming with SYCL:

SYCL devices, device selectors, queues, and kernels
Buffer model and code anatomy
Unified Shared Memory (USM) and the concept of subgroups
Buffers and accessors
Task scheduling, data dependencies, and graphs in SYCL
Local memory and atomics in SYCL

Practical Applications of SYCL

The articles walk you through some real-world use cases of SYCL programming, such as:

Image processing and ray tracing
Scientific computations, including numerical solvers, Monte Carlo simulations, and sparse matrix operations
Accelerating machine learning algorithms and graph algorithms
Quantum computing, data visualization, and Virtual Reality (VR) applications
High Performance Computing (HPC) applications such as Computational Fluid Dynamics (CFD) simulations
Enhancing Edge AI applications such as smart surveillance and efficient health monitoring

Check out the complete series to explore the above and several other SYCL topics in detail.

CUDA to SYCL Code Migration

The major advantages of SYCL, such as interoperability, scalability, and support for multi-vendor heterogeneous architectures, give you more freedom to choose an execution platform than proprietary, vendor-locked CUDA solutions. In the day 28 article of the series, Joel compares the performance of CUDA and SYCL, backed by benchmarking results.

The day 27 article explains how to port CUDA code to SYCL manually, using a simple code illustration. Alternatively, two automated tools can perform the migration for you: the Intel® DPC++ Compatibility Tool and its open-source counterpart, SYCLomatic. They automatically migrate the majority (90%-95%)^ of the CUDA source code to C++ with SYCL. You only need to refine the tools’ output, where required, for functional correctness.
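To give a feel for what a manual port involves, here is a minimal sketch of one common step: translating the CUDA thread index into its SYCL equivalent. It is written in plain standard C++ so that it compiles without a SYCL toolchain, and it is not taken from the article series. The names WorkItem, global_id, group_count, and vector_add are hypothetical helpers introduced only for illustration. The comments name the real CUDA built-ins and SYCL nd_item member functions that each field stands in for, assuming a one-dimensional launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model of one work-item in a 1-D launch.
// Hypothetical type for illustration; not part of the CUDA or SYCL API.
struct WorkItem {
    std::size_t group_id;     // CUDA: blockIdx.x   | SYCL: item.get_group(0)
    std::size_t local_id;     // CUDA: threadIdx.x  | SYCL: item.get_local_id(0)
    std::size_t local_range;  // CUDA: blockDim.x   | SYCL: item.get_local_range(0)
};

// CUDA spelling: blockIdx.x * blockDim.x + threadIdx.x
// SYCL spelling: item.get_global_id(0)
constexpr std::size_t global_id(const WorkItem& w) {
    return w.group_id * w.local_range + w.local_id;
}

// Number of work-groups (CUDA: blocks) needed to cover n elements, rounded up.
constexpr std::size_t group_count(std::size_t n, std::size_t local_range) {
    return (n + local_range - 1) / local_range;
}

// Serial stand-in for a vector-add kernel. The two loops play the role of the
// launch over group_count * local_range work-items; on a device these
// iterations would run in parallel.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t local_range) {
    assert(a.size() == b.size());
    assert(local_range > 0);
    const std::size_t n = a.size();
    std::vector<float> c(n, 0.0f);
    for (std::size_t g = 0; g < group_count(n, local_range); ++g) {
        for (std::size_t l = 0; l < local_range; ++l) {
            const std::size_t i = global_id({g, l, local_range});
            if (i < n) {  // guard: the padded launch may overshoot n
                c[i] = a[i] + b[i];
            }
        }
    }
    return c;
}
```

Calling vector_add(a, b, 256) on two 1000-element vectors models a launch of four work-groups of 256 work-items each, with the last 24 work-items masked off by the bounds guard. The same guard carries over to a real port, because a SYCL nd_range requires the global size to be a multiple of the work-group size, so the launch is usually padded just as a CUDA grid is.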

Utilize Accelerated SYCL Kernels with Intel® oneAPI DPC++ Library (oneDPL)

The Intel® oneAPI DPC++ Library (oneDPL), an extension of the C++ Standard Template Library (STL), brings accelerated SYCL kernels to your C++ application across CPUs, GPUs, and FPGAs. It extends parallel computing libraries such as Parallel STL (PSTL) and Boost.Compute*. It also eases code migration from CUDA to SYCL by integrating with the Intel DPC++ Compatibility Tool. Check out the series’ day 7 and day 8 articles, which elaborate on the oneDPL Extension APIs for cross-architecture parallel programming.

Analyze Performance of SYCL Applications with Intel® VTune™ Profiler

The Intel® VTune™ Profiler helps you analyze, fine-tune, and maximize your application’s performance. Among other things, it helps you analyze hotspots, detect code anomalies, measure memory consumption and cache misses, and find performance issues in I/O-intensive applications. It also recommends ways to fix performance bottlenecks. The day 6 article of the series describes how to boost a SYCL application’s performance with Intel VTune Profiler.

What’s Next?

We encourage you to read through the 30 Days of SYCL series and exploit the parallel programming potential of the SYCL framework. Explore the resources in the following section to dive deeper into Intel’s tools and libraries that help you achieve data parallelism with SYCL.

Also, check out other AI, HPC, and Rendering tools in Intel’s oneAPI-powered software portfolio.

Additional Resources

About the Author of '30 Days of SYCL'

Joel John Joseph is an Intel Student Ambassador. He is a data analyst, a tech enthusiast, and an Augmented Reality developer pursuing a Master of Computer Applications (MCA) degree from Christ University, Bangalore (India).

^Intel estimates as of March 2023. Based on measurements on a set of 85 HPC benchmarks and samples, with examples like Rodinia, SHOC, and PENNANT. Results may vary.