Intel® oneAPI Math Kernel Library

Cache blocking techniques for element-wise math in large arrays - Fortran

art-croucher
I have a CFD code that does a lot of element-wise math (A(i,j)*B(i,j)) on large arrays: roughly 500x500 REAL*8 arrays, with most sections of the code using half a dozen of these (1.5MB) arrays at a time.
To clean up the code, and in a naive hope that IFC would figure out the best way to manage the work, we vectorized most of the code.
VTune still shows a lot of time wasted on various stores, even at higher optimization levels.
Is there a simple technique or a library that can block these operations so they run efficiently on a Xeon? Hopefully without rewriting all the code!
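For illustration, here is a minimal sketch of the kind of blocking being asked about. It is not taken from the actual CFD code. It assumes a hypothetical chain of three element-wise statements on arrays named a, b, c, t, d and e, and a hypothetical panel width jblk that would need tuning by measurement. Instead of running each whole-array statement over the full 500x500 arrays, the chain is applied one column panel at a time, so the intermediate results may still be in cache when the next statement reads them. Column panels are used because Fortran stores arrays in column-major order, which makes a(:, j0:j1) a contiguous piece of memory.

```fortran
program blocked_elementwise
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 500      ! array extent quoted in the post
  integer, parameter :: jblk = 16    ! hypothetical panel width, tune by measurement
  real(dp), parameter :: tol = 1.0e-12_dp
  real(dp), allocatable :: a(:,:), b(:,:), c(:,:)
  real(dp), allocatable :: t(:,:), d(:,:), e(:,:)
  real(dp), allocatable :: d_ref(:,:), e_ref(:,:)
  integer :: j0, j1

  allocate(a(n,n), b(n,n), c(n,n), t(n,n), d(n,n), e(n,n))
  allocate(d_ref(n,n), e_ref(n,n))
  call random_number(a)
  call random_number(b)
  call random_number(c)

  ! Unblocked reference: each statement sweeps the full arrays,
  ! so t and d_ref are written out and read back over their whole extent.
  t     = a*b
  d_ref = t + c
  e_ref = d_ref*a

  ! Blocked version: the same three statements, applied to one
  ! panel of columns at a time. min() handles the last, narrower panel
  ! (500 is not a multiple of 16).
  do j0 = 1, n, jblk
     j1 = min(j0 + jblk - 1, n)
     t(:, j0:j1) = a(:, j0:j1)*b(:, j0:j1)
     d(:, j0:j1) = t(:, j0:j1) + c(:, j0:j1)
     e(:, j0:j1) = d(:, j0:j1)*a(:, j0:j1)
  end do

  ! Compare with a small tolerance, since the compiler may contract
  ! multiply-add pairs differently in the two versions.
  if (maxval(abs(d - d_ref)) > tol .or. maxval(abs(e - e_ref)) > tol) then
     error stop 'blocked result differs from unblocked result'
  end if
  print *, 'blocked and unblocked results agree'
end program blocked_elementwise
```

The array-section form keeps the existing whole-array statements nearly unchanged, which fits the wish to avoid a rewrite. Whether it actually reduces the store traffic seen in VTune depends on the compiler, the cache sizes and the panel width, so it would need to be measured. A single element-wise operation with no follow-on statements reuses no data, so blocking alone is unlikely to help there.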
Thread moved to MKL from Fortran/Windows.
Thanks,
Art