We are excited to announce that PyTorch* 2.5 has introduced support for the torch.compile feature on Windows* CPU, thanks to the collaborative efforts of Intel and Meta*. This enhancement speeds up PyTorch code execution over the default eager mode, delivering a significant performance boost on Windows.
Getting Started with torch.compile on Windows CPU
To utilize torch.compile on Windows CPU, a C++ development environment is required. We have prepared a comprehensive guide to help you set up the necessary environment. You can refer to the tutorial How to use TorchInductor on Windows CPU.
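Once the C++ toolchain is installed, it can be worth confirming that the Python environment itself is ready before building anything. A minimal sanity check (not from the tutorial, just a quick illustrative snippet):

```python
import torch

# torch.compile on Windows CPU was introduced in PyTorch 2.5,
# so verify the installed version and that the API is present.
print(torch.__version__)
print(hasattr(torch, "compile"))  # prints True on PyTorch 2.x
```

If `hasattr(torch, "compile")` is False or the version predates 2.5, reinstall PyTorch before continuing with the tutorial.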
Once you have installed the environment successfully, you can try running an example to ensure everything is set up correctly. Here is a simple example you can try:
(pytorch_2.5) D:\xuhan\build_pytorch>type mini_case.py
import torch

def foo(x, y):
    a = torch.sin(x)
    b = torch.cos(y)
    return a + b

opt_foo1 = torch.compile(foo)
print(opt_foo1(torch.randn(10, 10), torch.randn(10, 10)))

(pytorch_2.5) D:\xuhan\build_pytorch>python mini_case.py
tensor([[ 1.9812, -0.2463, -0.0087,  0.3254,  1.8775, -0.9243,  0.9942,  1.7817,  0.2984,  0.0758],
        [ 0.8973, -0.1397,  0.7298,  1.4530,  0.9452,  0.1929,  1.2594, -0.2231,  1.4836, -0.4684],
        [ 1.0682,  1.8216,  0.5263,  0.1197, -0.1380,  0.0245, -0.5553, -0.5178,  0.3169, -0.9332],
        [ 1.4115,  1.0956,  0.4083,  0.4683,  0.2366,  1.5851, -0.0679,  0.6405, -0.3479,  1.9406],
        [ 1.6804, -0.0512, -0.0929,  1.1394, -0.0552,  0.9306,  1.8272,  1.6940,  1.6041, -0.3670],
        [ 1.6425,  0.0930,  0.0385, -1.3875, -0.2351,  1.3414, -1.4208,  1.2336,  0.0098,  0.7412],
        [ 1.9461,  1.7850,  0.5771, -1.2778,  0.3964, -1.3073,  0.8085,  0.4738,  0.7596,  1.0792],
        [ 0.5872, -0.8935, -0.0047,  0.8921,  1.5168,  0.4271, -1.3082, -0.1474,  0.3418, -0.0677],
        [ 0.3057,  0.2031,  0.6457,  0.9431,  0.0145,  0.5779,  0.7415,  0.8415,  1.1008,  0.1977],
        [ 0.5103, -0.1149,  1.3592,  0.2531,  1.5663,  0.2729,  1.7606,  0.4289, -0.3515,  1.4577]])

(pytorch_2.5) D:\xuhan\build_pytorch>
Enhanced Performance with Additional Compilers
In addition to supporting Microsoft* Visual C++, we have also enabled support for the LLVM compiler and the Intel® oneAPI DPC++/C++ Compiler to further boost performance on Windows CPU. To learn more, you can read this article: Intel® oneAPI DPC++/C++ Compiler Boosts PyTorch* Inductor Performance on Windows* for CPU Devices, which includes a comprehensive step-by-step guide with performance data.
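One common way to point Inductor at a specific C++ compiler is the `CXX` environment variable. The snippet below is a hedged sketch, not the authoritative procedure from the article above: whether Inductor consults `CXX` in your build, and the exact driver name (`icx-cl` is the Intel oneAPI DPC++/C++ compiler driver on Windows), are assumptions here, so follow the linked article for the verified steps.

```python
import os

# Assumption: Inductor reads the CXX environment variable when selecting
# its C++ compiler. Set it BEFORE importing torch so the setting is seen
# when Inductor's configuration is initialized. "icx-cl" is hypothetical
# here; substitute the compiler driver you actually installed.
os.environ["CXX"] = "icx-cl"

import torch
# ... define your model and call torch.compile(...) as usual ...
```

If the variable has no effect in your setup, the article's step-by-step guide covers the supported way to switch compilers.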
Acknowledgements
This achievement would not have been possible without the hard work and dedication of the following engineers: Xu Han, Jiong Gong, Bin Bao, Jason Ansel, Chuanqi Wang, Henry Tsang, Xiao Wei, Weizhuo Zhang and Zhaoqiong Zheng.
We are thrilled to bring this powerful feature to the PyTorch community. We look forward to seeing the innovative ways you will leverage torch.compile to accelerate your PyTorch projects on Windows CPU.