3D transformations - matrix multiplication

wqaxs36

Novice

11-14-2020
06:59 AM

Will Intel create a single asm instruction that packages all the instructions needed to do a 3D transformation around the X, Y and Z axes?

An entire 4×4 matrix of 32-bit floats fits inside a single 512-bit register (16 × 32 bits), so a single instruction operating on two 512-bit registers could do the matrix multiplication as:

__m512 _mm512_mul_mat(__m512 MatA, __m512 MatB);

The same goes for the matrix-vector multiplication:

__m128 _mm512_mul_matvec(__m512 Mat, __m512 Vec);

This is how I currently do it:

Accepted Solutions

PrasanthD_intel

Moderator

11-17-2020
04:29 AM

Hi,

Thanks for reaching out to us.

To my knowledge, there is no built-in intrinsic for matrix multiplication. I will contact the internal team to check whether an existing function covers this or whether a new feature would need to be implemented, and then get back to you.

Meanwhile, could you please try MKL (Intel Math Kernel Library)? It supports a wide range of math operations and is highly optimized.

To get started with MKL, please refer to the following article: https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-mkl-for-windows/...

Regards

Prasanth

5 Replies

wqaxs36

Novice

11-14-2020
07:02 AM

Correction: the second signature should be __m128 _mm512_mul_matvec(__m512 Mat, __m128 Vec); instead.
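For readers wondering what such a matrix-vector product looks like with intrinsics that already exist, here is a minimal sketch using plain SSE. It is not an Intel-provided function: the name `mul_matvec_sse` is hypothetical, and the sketch assumes the 4×4 matrix is stored column-major in a `float[16]` array rather than in a single `__m512` register.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (_mm_loadu_ps, _mm_shuffle_ps, ...)

// Hypothetical helper (not an Intel intrinsic): returns M * v for a 4x4 float
// matrix stored column-major in m[16] and a 4-component vector v.
static inline __m128 mul_matvec_sse(const float* m, __m128 v)
{
    // Load the four columns of the matrix.
    __m128 c0 = _mm_loadu_ps(m + 0);
    __m128 c1 = _mm_loadu_ps(m + 4);
    __m128 c2 = _mm_loadu_ps(m + 8);
    __m128 c3 = _mm_loadu_ps(m + 12);

    // Broadcast each component of v across a full register.
    __m128 x = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));
    __m128 y = _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1));
    __m128 z = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2));
    __m128 w = _mm_shuffle_ps(v, v, _MM_SHUFFLE(3, 3, 3, 3));

    // result = c0*x + c1*y + c2*z + c3*w (no horizontal adds needed).
    return _mm_add_ps(_mm_add_ps(_mm_mul_ps(c0, x), _mm_mul_ps(c1, y)),
                      _mm_add_ps(_mm_mul_ps(c2, z), _mm_mul_ps(c3, w)));
}
```

With column-major storage the product is a sum of scaled columns, which is why this version needs only shuffles, multiplies and vertical adds.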

jimdempseyatthecove

Black Belt

11-17-2020
05:27 AM

3D modeling usually performs better with a Structure of Arrays (SoA) layout than with an Array of Structures (AoS) layout.

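struct SoA_t                 // Structure of Arrays: one array per component
{
    int N;                   // number of bodies
    float *X, *Y, *Z, *W;    // cache-aligned allocations, N elements each
    bool Alloc(int n);       // allocates the four arrays for n bodies
};

...

SoA_t Bodies;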

Then, depending on the CPU instruction set, the operations (*, /, +, -) are performed on the individual properties of 4, 8 or 16 bodies at a time (128-, 256- or 512-bit registers of 32-bit floats). This generally gives higher performance when operating on a large number of bodies (particles).
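To illustrate the layout described above, here is a minimal sketch of an SoA container with one element-wise operation. The names `BodiesSoA` and `translate` are hypothetical, and `std::vector` is used for brevity, so the cache-aligned allocation mentioned above is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical SoA container: one contiguous array per component.
struct BodiesSoA {
    std::vector<float> X, Y, Z, W;

    // W is initialised to 1 so each body is a homogeneous point.
    explicit BodiesSoA(std::size_t n) : X(n), Y(n), Z(n), W(n, 1.0f) {}

    std::size_t size() const { return X.size(); }
};

// Translate every body by (dx, dy, dz). Each loop walks one contiguous
// array, so a vectorizing compiler can process several bodies per
// instruction (how many depends on the target instruction set).
inline void translate(BodiesSoA& b, float dx, float dy, float dz)
{
    const std::size_t n = b.size();
    assert(b.Y.size() == n && b.Z.size() == n);
    for (std::size_t i = 0; i < n; ++i) b.X[i] += dx;
    for (std::size_t i = 0; i < n; ++i) b.Y[i] += dy;
    for (std::size_t i = 0; i < n; ++i) b.Z[i] += dz;
}
```

Whether the loops are actually vectorized depends on the compiler and its optimization flags; the layout only makes that possible.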

Jim Dempsey

jimdempseyatthecove

Black Belt

11-17-2020
05:52 AM

I forgot to mention...

In your sample code in post #1 (AoS), you have two horizontal add instructions followed by a scalar store (for a single particle). With the SoA layout, you would eliminate those two horizontal adds and finish with a vector store instead (for 4, 8 or 16 particles).
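As a rough sketch of that idea (not the code from post #1), the following applies a 4×4 matrix to points stored as separate x/y/z/w arrays, four particles per iteration with SSE. The function name `transform_soa`, the row-major `float[16]` matrix layout, and the non-aliasing output arrays are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical helper: applies a row-major 4x4 matrix m[16] to n points held
// in separate x/y/z/w arrays and writes the results to ox/oy/oz/ow.
// The output arrays are assumed not to alias the inputs.
inline void transform_soa(const float* m,
                          const float* x, const float* y,
                          const float* z, const float* w,
                          float* ox, float* oy, float* oz, float* ow,
                          std::size_t n)
{
    float* out[4] = {ox, oy, oz, ow};
    std::size_t i = 0;

    // Main loop: 4 particles per iteration.
    for (; i + 4 <= n; i += 4) {
        const __m128 vx = _mm_loadu_ps(x + i);
        const __m128 vy = _mm_loadu_ps(y + i);
        const __m128 vz = _mm_loadu_ps(z + i);
        const __m128 vw = _mm_loadu_ps(w + i);

        for (int r = 0; r < 4; ++r) {
            const float* row = m + 4 * r;
            // Each lane accumulates one particle's dot product with the row,
            // so only vertical multiplies and adds are needed.
            __m128 acc = _mm_mul_ps(_mm_set1_ps(row[0]), vx);
            acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(row[1]), vy));
            acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(row[2]), vz));
            acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(row[3]), vw));
            _mm_storeu_ps(out[r] + i, acc);  // vector store for 4 particles
        }
    }

    // Scalar tail for the remaining n % 4 particles.
    for (; i < n; ++i) {
        for (int r = 0; r < 4; ++r) {
            const float* row = m + 4 * r;
            out[r][i] = row[0] * x[i] + row[1] * y[i]
                      + row[2] * z[i] + row[3] * w[i];
        }
    }
}
```

The same structure extends to 8 or 16 particles per iteration by switching to the 256-bit or 512-bit equivalents of these intrinsics.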

Jim Dempsey

Viet_H_Intel

Moderator

11-18-2020
10:01 AM

As Prasanth mentioned, we currently don't have an intrinsic that does what you are looking for. I've opened a feature request with our development team.

Thanks,
