I'm trying to understand whether I can index directly into a block of __m128i memory without issues, or whether the load/store intrinsics are required. The emphasis is on 'without issues': my code works correctly in most cases, but I start to see odd behavior when my system is starved for resources. I don't see runtime exceptions or errors, but I am also not checking any intrinsic return values or status. Please ignore syntax errors; the code here is only for the sake of discussion.
First, a task allocates multiple __m128i memory segments and I save the returned values:
for (int i = 0; i < SOME_N; i++)
{
    __m128i *pFrame = (__m128i *)_mm_malloc(sizeof(__m128i) * SOME_LENGTH, sizeof(__m128i));
    someList[i] = pFrame;
}
Some other task will extract pointers from that list and copy (8-bit non-intrinsic-type) data into that memory:
for (int i = 0; i < SOME_LENGTH; i++)
{
    // Will this work?
    pFrame[i] = _mm_insert_epi8(pFrame[i], pData[i], 0);
    // Or do I need to do something like this?
    __m128i p128i = _mm_load_si128(&pFrame[i]);
    _mm_store_si128(&pFrame[i], _mm_insert_epi8(p128i, pData[i], 0));
}
Similarly I will need to pull the data out at the end:
for (int i = 0; i < SOME_LENGTH; i++)
{
    // Does this work?
    pData[i] = _mm_extract_epi8(pFrame[i], 0);
    // Or do I need to do something like this?
    pData[i] = _mm_extract_epi8(_mm_load_si128(&pFrame[i]), 0);
}
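Since the odd behavior shows up under resource starvation, one thing that may be worth ruling out is the allocation itself: _mm_malloc returns NULL when it cannot satisfy a request, and it does not initialize the memory it returns, so the unchecked loop above could store a NULL pointer and the first _mm_insert_epi8 would read 15 uninitialized lanes. The following is only a sketch under those assumptions; alloc_frame and free_frame are hypothetical helper names, not part of any Intel API.

```c
#include <immintrin.h>  /* __m128i, _mm_setzero_si128, _mm_malloc, _mm_free */
#include <stddef.h>
#include <stdint.h>     /* SIZE_MAX */

/* Hypothetical helper: allocate `length` zeroed __m128i elements on a
 * 16-byte boundary. Returns NULL if length is 0, the byte count would
 * overflow, or _mm_malloc fails. */
static __m128i *alloc_frame(size_t length)
{
    if (length == 0 || length > SIZE_MAX / sizeof(__m128i))
        return NULL;

    __m128i *frame = (__m128i *)_mm_malloc(sizeof(__m128i) * length,
                                           sizeof(__m128i));
    if (frame == NULL)
        return NULL;  /* caller must handle this, e.g. under memory pressure */

    /* _mm_malloc does not clear memory; give every lane a defined value. */
    for (size_t i = 0; i < length; i++)
        frame[i] = _mm_setzero_si128();
    return frame;
}

/* Hypothetical helper: memory from _mm_malloc must be released with _mm_free. */
static void free_frame(__m128i *frame)
{
    _mm_free(frame);
}
```

With a helper like this, the allocation loop would check the result before storing it in someList[i] and skip or retry on NULL.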
Hi,
Thank you for posting in Intel communities.
>> "I begin to see odd behavior when my system is starved for resources"
Could you please elaborate on the difficulty you are facing, and provide a complete sample reproducer along with the steps to reproduce the issue at our end?
Please also share your platform details, operating system, Intel compiler version, and oneAPI toolkit version.
Please refer to the provided URL for more information about Intel intrinsics.
URL: intel.com/content/www/us/en/docs/intrinsics-guide/index.html
Best regards,
Madhu
Unfortunately, I have a proprietary decoder/system that I cannot share. This question is more about proper use of the API than about what is wrong with my system. My code works under optimal conditions without using mm_load/mm_store, but I am trying to understand the undefined behavior I see under non-optimal conditions. I can rewrite my code to use load/store, but that would take several days of work and testing. My hope was that someone could tell me either 'the epi insert/extract methods should be sufficient' (in which case I would look elsewhere for the problem) or 'yes, the intended use of the API is to access the __m128i memory blocks through the load/store methods', before I spend several days on what might be a dead end.
Is that enough information to answer yes or no on whether the mm_load/mm_store operations should be used in the provided example code?
To my knowledge, you don't need mm_load/mm_store here.
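A minimal sketch of why this is generally the case, assuming GCC, Clang, or MSVC on x86: a __m128i element in a 16-byte-aligned block is an ordinary aligned object, so indexing pFrame[i] is expected to compile to the same aligned move that _mm_load_si128/_mm_store_si128 produce. The function names and the SSE41_FN macro below are hypothetical and exist only for illustration; the macro lets the SSE4.1 intrinsics compile on GCC/Clang without -msse4.1.

```c
#include <immintrin.h>  /* SSE2 load/store, SSE4.1 insert/extract */
#include <stddef.h>
#include <stdint.h>
#include <string.h>     /* memcmp */

#if defined(__GNUC__)
#define SSE41_FN __attribute__((target("sse4.1")))
#else
#define SSE41_FN
#endif

/* Variant A: plain indexing; the compiler emits the aligned load and store. */
SSE41_FN static void put_bytes_deref(__m128i *frame, const uint8_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        frame[i] = _mm_insert_epi8(frame[i], data[i], 0);
}

/* Variant B: the same operation with explicit aligned load/store intrinsics. */
SSE41_FN static void put_bytes_loadstore(__m128i *frame, const uint8_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        __m128i v = _mm_load_si128(&frame[i]);
        _mm_store_si128(&frame[i], _mm_insert_epi8(v, data[i], 0));
    }
}

/* Read byte lane 0 of each element back out. */
SSE41_FN static void get_bytes(const __m128i *frame, uint8_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] = (uint8_t)_mm_extract_epi8(frame[i], 0);
}

/* Byte-wise comparison of two frames; returns 1 when identical. */
static int frames_equal(const __m128i *a, const __m128i *b, size_t n)
{
    return memcmp(a, b, n * sizeof(__m128i)) == 0;
}
```

Running both variants on identically initialized frames and comparing with frames_equal is a cheap way to confirm they agree on a given toolchain. Both forms require 16-byte alignment, which _mm_malloc(..., sizeof(__m128i)) provides; only unaligned memory would call for _mm_loadu_si128/_mm_storeu_si128.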
Thanks,
Viet
Hi,
Let's close this thread. If you have any other questions/concerns, please create a new one.
Thanks,