Support for the Intel® Gaudi® AI Accelerator

Reproducible HPC Simulations on Habana Gaudi (AWS DL1) — Performance & Determinism Insights

EstPaul
Hello everyone,

I’m currently evaluating Habana Gaudi performance for a set of reproducible, non-AI algorithmic workloads.
These simulations exercise deterministic numerical kernels, mixed-precision solvers, and reproducibility validation of parallel runs across CPU, GPU, and Gaudi architectures.

The objective is to measure scaling behavior, reproducibility drift, and numerical stability under short, high-intensity runs of the kind typically used in algorithmic benchmarking and scientific test cascades.

I’d appreciate insights from Intel engineers or other users regarding:
• Recommended Gaudi SDK / PyTorch / driver combinations for maximum stability.
• Techniques to ensure deterministic tensor and communication behavior across multiple runs.
• Suitable profiling tools for memory throughput, inter-core latency, and reproducibility verification.
• Any known behavioral differences when running non-training computational kernels (e.g., mathematical solvers rather than AI models).
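For context on the second bullet, here is a minimal, framework-agnostic sketch of the kind of drift check I plan to run. The kernel here is a pure-Python stand-in (hypothetical, not Gaudi code); in the real harness it would be replaced by the solver under test, and the fingerprints would be compared across devices and runs:

```python
import hashlib
import random
import struct


def run_kernel(n: int, seed: int) -> list:
    """Stand-in for a deterministic numerical kernel (hypothetical;
    in practice this would be the device-side solver under test)."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    # Fixed-order sequential reduction, so the result is bitwise stable.
    acc = 0.0
    out = []
    for v in x:
        acc += v * v
        out.append(acc)
    return out


def fingerprint(values) -> str:
    """Bitwise fingerprint of the output: any reproducibility drift,
    even in the last ulp of one element, changes the digest."""
    h = hashlib.sha256()
    for v in values:
        h.update(struct.pack("<d", v))  # raw IEEE-754 double bits
    return h.hexdigest()


# Two runs with the same seed must yield identical fingerprints.
a = fingerprint(run_kernel(10_000, seed=42))
b = fingerprint(run_kernel(10_000, seed=42))
print("bitwise reproducible:", a == b)
```

The point of hashing the raw double bits, rather than comparing values with a tolerance, is that this methodology detects drift at the ulp level, which is exactly what reordered parallel reductions tend to introduce.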

My goal is to establish a reproducible baseline to compare Gaudi’s deterministic performance against other architectures in controlled HPC environments.

Any guidance or technical references would be highly appreciated.

Thanks in advance,
p.
