Novice
3D transformations - matrix multiplication

Will Intel create a single asm instruction that packages all the instructions needed to do a 3D transformation around the X, Y, and Z axes?

An entire 4×4 single-precision matrix (16 floats × 32 bits) fits in one 512-bit register, so a single instruction operating on two 512-bit registers could do the matrix multiplication, for example:

__m512 _mm512_mul_mat(__m512 MatA, __m512 MatB);

And same goes for the matrix vector multiplication:

__m128 _mm512_mul_matvec(__m512 Mat, __m512 Vec);
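
For comparison, the matrix-vector case can already be composed from a handful of existing SSE intrinsics. The following is only a minimal sketch, assuming a column-major 4×4 float matrix; the name `mul_matvec_sse` is hypothetical and is not an Intel intrinsic.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_set1_ps, _mm_mul_ps, _mm_add_ps

// Hypothetical helper (not an Intel intrinsic): out = M * v.
// Assumption: M is column-major, so m[4*c + r] holds row r of column c.
inline void mul_matvec_sse(const float* m, const float* v, float* out)
{
    assert(m && v && out);

    // Load the four columns of M (unaligned loads, so any float array works).
    const __m128 c0 = _mm_loadu_ps(m + 0);
    const __m128 c1 = _mm_loadu_ps(m + 4);
    const __m128 c2 = _mm_loadu_ps(m + 8);
    const __m128 c3 = _mm_loadu_ps(m + 12);

    // Broadcast each component of v across a full register.
    const __m128 x = _mm_set1_ps(v[0]);
    const __m128 y = _mm_set1_ps(v[1]);
    const __m128 z = _mm_set1_ps(v[2]);
    const __m128 w = _mm_set1_ps(v[3]);

    // out = c0*x + c1*y + c2*z + c3*w: four multiplies and three vertical adds.
    const __m128 r = _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(c0, x), _mm_mul_ps(c1, y)),
        _mm_add_ps(_mm_mul_ps(c2, z), _mm_mul_ps(c3, w)));

    _mm_storeu_ps(out, r);
}
```

With a column-major layout the product becomes a weighted sum of columns, so the whole result is produced with vertical operations and a single vector store.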

This is how I currently do it:

[Four attached screenshots of the poster's code: wqaxs36_0-1605360364910.png, wqaxs36_1-1605365886254.png, wqaxs36_2-1605360498861.png, wqaxs36_3-1605360518396.png]


Accepted Solutions
Moderator

Hi,

Thanks for reaching out to us.

To my knowledge, there is no built-in functionality for matrix multiplication in the intrinsics. I will get back to you after checking with the internal team whether an existing function covers this or whether a new feature would need to be implemented.


Meanwhile, could you please try MKL (Intel Math Kernel Library), which supports many math operations and is highly optimized?

To get started with MKL, please refer to the following article: https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-mkl-for-windows/...



Regards

Prasanth



5 Replies
Novice

Correction: the second signature should be __m128 _mm512_mul_matvec(__m512 Mat, __m128 Vec); instead.


3D modeling usually performs better with a Structure of Arrays (SoA) layout than with an Array of Structures (AoS).

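using Single = float;      // assumed: Single is a single-precision float

struct SoA_t               // Structure of Arrays: one array per component
{
  int N;                   // number of bodies
  Single *X, *Y, *Z, *W;   // cache-aligned allocations, N elements each
  bool Alloc(int _N);      // allocates the four arrays for _N bodies
};
...
SoA_t Bodies;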

Then, depending on the CPU instruction set, the operations (*, /, +, -) are performed on the individual properties of 4, 8, or 16 bodies at a time. This generally gives higher performance when operating on a large number of bodies (particles).
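
As an illustration of the SoA approach, here is a minimal sketch that transforms 4 bodies per iteration with SSE (the 4-wide case). The names `PointsSoA`, `row4`, and `transform_soa` are hypothetical, the matrix is assumed to be row-major, and `std::vector` with unaligned loads stands in for the cache-aligned allocation mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical SoA container: one contiguous array per component.
struct PointsSoA {
    std::vector<float> x, y, z, w;
    explicit PointsSoA(std::size_t n) : x(n), y(n), z(n), w(n) {}
    std::size_t size() const { return x.size(); }
};

// One output component for 4 bodies at once: r[0]*x + r[1]*y + r[2]*z + r[3]*w.
static inline __m128 row4(const float* r, __m128 x, __m128 y, __m128 z, __m128 w)
{
    return _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(_mm_set1_ps(r[0]), x), _mm_mul_ps(_mm_set1_ps(r[1]), y)),
        _mm_add_ps(_mm_mul_ps(_mm_set1_ps(r[2]), z), _mm_mul_ps(_mm_set1_ps(r[3]), w)));
}

// out = M * in for every body. Assumption: M is row-major (m[4*r + c]).
inline void transform_soa(const float* m, const PointsSoA& in, PointsSoA& out)
{
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    std::size_t i = 0;

    // Vector part: each iteration handles 4 bodies with vertical operations only.
    for (; i + 4 <= n; i += 4) {
        const __m128 x = _mm_loadu_ps(&in.x[i]);
        const __m128 y = _mm_loadu_ps(&in.y[i]);
        const __m128 z = _mm_loadu_ps(&in.z[i]);
        const __m128 w = _mm_loadu_ps(&in.w[i]);
        _mm_storeu_ps(&out.x[i], row4(m + 0,  x, y, z, w));
        _mm_storeu_ps(&out.y[i], row4(m + 4,  x, y, z, w));
        _mm_storeu_ps(&out.z[i], row4(m + 8,  x, y, z, w));
        _mm_storeu_ps(&out.w[i], row4(m + 12, x, y, z, w));
    }

    // Scalar tail for the last n % 4 bodies.
    for (; i < n; ++i) {
        const float x = in.x[i], y = in.y[i], z = in.z[i], w = in.w[i];
        out.x[i] = (m[0]  * x + m[1]  * y) + (m[2]  * z + m[3]  * w);
        out.y[i] = (m[4]  * x + m[5]  * y) + (m[6]  * z + m[7]  * w);
        out.z[i] = (m[8]  * x + m[9]  * y) + (m[10] * z + m[11] * w);
        out.w[i] = (m[12] * x + m[13] * y) + (m[14] * z + m[15] * w);
    }
}
```

The same loop structure would extend to 8 or 16 bodies per iteration with wider registers; only the register type and load/store calls change.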

Jim Dempsey


I forgot to mention...

In your sample code in post #1 (AoS), you have two horizontal add instructions followed by a scalar store (for a single particle). With the SoA layout, you would eliminate those two horizontal adds and finish with a vector store instead (for 4, 8, or 16 particles).
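
The code from post #1 is not reproduced in this thread text, so the following is only an assumed reconstruction of the AoS pattern described: one row-times-vector dot product per output component, each needing a horizontal reduction and a scalar store. The names `dot4_aos` and `mul_matvec_rows` are hypothetical, and SSE1 shuffles are used here in place of the horizontal add instruction so the sketch needs no extra compiler flags.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Dot product of two 4-float vectors: one vertical multiply, then a
// horizontal reduction that collapses the register to a single scalar.
inline float dot4_aos(const float* row, const float* v)
{
    assert(row && v);
    const __m128 p = _mm_mul_ps(_mm_loadu_ps(row), _mm_loadu_ps(v));

    // Horizontal step 1: add neighbouring lanes (p0+p1, ..., p2+p3, ...).
    __m128 shuf = _mm_shuffle_ps(p, p, _MM_SHUFFLE(2, 3, 0, 1));
    __m128 sums = _mm_add_ps(p, shuf);

    // Horizontal step 2: bring the upper pair down and add it to lane 0.
    shuf = _mm_movehl_ps(shuf, sums);
    sums = _mm_add_ss(sums, shuf);

    return _mm_cvtss_f32(sums);  // scalar result for one output component
}

// out = M * v for a single particle. Assumption: M is row-major (m[4*r + c]).
inline void mul_matvec_rows(const float* m, const float* v, float* out)
{
    for (int r = 0; r < 4; ++r) {
        out[r] = dot4_aos(m + 4 * r, v);  // one scalar store per component
    }
}
```

Every output component here costs two horizontal steps and yields one float, which is the overhead the SoA layout avoids.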

Jim Dempsey

Moderator

As Prasanth mentioned, we currently don't have an intrinsic that does what you are looking for. I've opened a feature request with our development team.

Thanks,