Is there an easy way to extract component 0 from an __m512 vector?
Looking at the assembly generated for _mm512_reduce_gmin_ps, it really computes an __m512 (of course), which is then passed to scalar operations.
I tried doing
static inline float _mm512_get_first_ps(__m512 v)
{
    return v.__m512_f32[0];
}
but this does not work.
v is of a struct type, and in your return statement I cannot see a structure dereference operator (this may be bad rendering on my screen). Have you tried accessing the member array through a pointer?
You can store the result to an array of 16 floats and then load the desired value from the array.
If this is not in the middle of a really important loop, then the overhead of the store and load should be negligible. Of course, it depends on what you are going to do with the value next: the store/load approach is convenient when you just want to print it, for example. If it is going to feed into more arithmetic, then you would probably want to use the vector mask/shift/merge/broadcast operations instead.
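The store-then-load approach described above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: lane_via_store is a hypothetical name, and it assumes a compiler targeting AVX-512F, where _mm512_loadu_ps and _mm512_store_ps are available. The coprocessor toolchain discussed here may spell some of these differently. A scalar fallback is included only so the sketch builds on machines without AVX-512.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical demo (not from the thread): load 16 floats into a vector,
   spill the vector to a 64-byte aligned array, then read back lane i. */
static float lane_via_store(const float *src, size_t i)
{
#if defined(__AVX512F__)
    __m512 v = _mm512_loadu_ps(src);  /* stands in for a computed vector */
    _Alignas(64) float tmp[16];       /* aligned scratch space on the stack */
    _mm512_store_ps(tmp, v);          /* aligned store of all 16 lanes */
    return tmp[i];                    /* ordinary scalar load of one lane */
#else
    return src[i];                    /* assumption: plain fallback without AVX-512 */
#endif
}
```

Calling lane_via_store(data, 0) on an array of 16 floats returns the first component. The scratch array lives on the stack, so no heap allocation is involved.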
Yes, I am trying to find a way without using pointers. There are many applications, from custom reduction functions to a custom reciprocal for scalar code.
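For the scalar-reciprocal use case mentioned above, one pointer-free route on AVX-512F targets might look like the sketch below. It assumes the compiler provides _mm512_cvtss_f32 (returns the lowest lane as a float) and _mm512_rcp14_ps (approximate reciprocal); neither is confirmed for the coprocessor toolchain discussed in this thread. approx_recip_lane0 is a hypothetical name, and the non-AVX-512 branch is only a stand-in so the sketch builds everywhere.

```c
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical demo (not from the thread): approximate 1/x for a scalar by
   broadcasting x, taking the vector reciprocal, and extracting lane 0. */
static float approx_recip_lane0(float x)
{
#if defined(__AVX512F__)
    __m512 v = _mm512_set1_ps(x);    /* x in every lane */
    __m512 r = _mm512_rcp14_ps(v);   /* approximate reciprocal of each lane */
    return _mm512_cvtss_f32(r);      /* lane 0 as a scalar, no store to memory */
#else
    return 1.0f / x;                 /* assumption: exact fallback without AVX-512 */
#endif
}
```

The result of the vector path is only an approximation, so a custom reciprocal would normally refine it further before use.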
Hello vladimir,
I also want to know how to convert from float* to __m512. Can you post here if you have found the answer?
I also want to write a function that takes a float value and returns an __m512 vector in which every element holds that value.
One way I thought of doing it is shown below.
__m512 returnM512(float a)
{
    __m512 b;
    float *arr = (float *) malloc(sizeof(float) * 16);
    /* fill arr with a, then load the 16 elements of arr into b */
    free(arr);
    return b;
}
But is there an intrinsic that does this? I don't want to use malloc because my program is multithreaded and every thread calls this function, so malloc introduces a lock in the code. Can someone help me with this?
shiva rama krishna bharadwaj I. wrote:
I also want to know how to convert from float* to __m512. Can you post here if you have found the answer?
If your pointer is aligned, just use a load intrinsic:
__m512 r = _mm512_load_ps(p);
For unaligned data, you will need to split the load:
r = _mm512_loadunpacklo_ps(r, p);
r = _mm512_loadunpackhi_ps(r, p+16);
Also, I want to write a function that takes a float value and returns an __m512 vector in which every element holds that value.
This intrinsic is _mm512_set1_ps.
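As a rough illustration of the broadcast, the sketch below fills 16 floats with one value via _mm512_set1_ps, with no malloc and no shared state between threads. fill16 is a hypothetical helper, and _mm512_storeu_ps is assumed to be available (it is an AVX-512F intrinsic; the coprocessor toolchain in this thread may need a different store). The loop branch is only a stand-in for builds without AVX-512.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the thread): write the value a into
   all 16 floats at dst using a single broadcast. */
static void fill16(float *dst, float a)
{
#if defined(__AVX512F__)
    __m512 b = _mm512_set1_ps(a);   /* every lane of b holds a */
    _mm512_storeu_ps(dst, b);       /* unaligned store, dst needs no special alignment */
#else
    for (size_t k = 0; k < 16; ++k) /* assumption: plain fallback without AVX-512 */
        dst[k] = a;
#endif
}
```

In real code the function would more likely just return _mm512_set1_ps(a) directly; the store is here only to make the result visible from scalar code.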
Thank you, sylvain.
_mm512_set1_ps solved my problem.
