I'm trying to write code that uses VNNI.
There is a data type, __m512i,
which I think maps to registers on the CPU.
I'd like to allocate an array of registers.
Here is the code:
#include <immintrin.h>

int main() {
    const int size = 5;
    __m512i zero[size];
    for (int i = 0; i < size; i++) {
        zero[i] = _mm512_setzero_si512();
    }
    return 0;
}
It works.
But when I try to allocate the memory dynamically,
it doesn't work:
#include <immintrin.h>

int main() {
    const int size = 5;
    __m512i *zero = new __m512i[size];
    for (int i = 0; i < size; i++) {
        zero[i] = _mm512_setzero_si512();
    }
    delete [] zero;
    return 0;
}
Is there any way to create registers dynamically?
Many thanks,
BR,
chiungliang
>>There is a data type - __m512i...which I think is mapping to registers on CPU....I'd like to locate an array of registers
__m512i is a data type, not a register. Whether a variable declared with this type can stay in a register for its whole lifetime depends on the optimization options and on the source code. Each hardware thread on a CPU that supports AVX512 has 32 of these registers. In your sample code, the array of 5 such values uses stack-local storage, so the compiler can determine at compile time that it may place them in registers rather than on the stack (provided the optimization level permits this). The code using operator new ensures that the data is located in memory, rather than being allowed to live solely in registers.
Consider coding this way:
#include <immintrin.h>

int main() {
    // ... some code here before the performance-critical section
    { // create nested scope
        const int size = 5;
        __m512i zero[size];
        for (int i = 0; i < size; i++) {
            zero[i] = _mm512_setzero_si512();
        }
        // ... code using the (hopefully) registered values
    } // end scope
    // ... remainder of the code
    return 0;
}
*** If your array zero is intended to always contain zero vectors, then do not create such an array. Instead, use _mm512_setzero_si512() directly. This is not a function call; rather, it inserts an instruction that zeroes the targeted variable (of __m512i type).
IOW, your array zero might instead be __m512i sum[size], which you pre-zero before accumulating a sum (of other __m512i values).
Jim Dempsey
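The accumulator pattern described above could look roughly like the following. This is a minimal sketch, not code from the thread: the function name sum_i32, the accumulator count kAcc, the 32-bit integer input, and the choice of _mm512_add_epi32 are all illustrative assumptions. The target attribute (GCC/Clang) only lets the function compile without -march flags and is unnecessary when building with -march=cascadelake. Whether sum[] really stays in registers still depends on the compiler and the optimization level.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sums n 32-bit integers using a small local array of
// vector accumulators. Precondition (assumed for brevity): n is a multiple
// of kAcc * 16.
__attribute__((target("avx512f")))
std::int32_t sum_i32(const std::int32_t* data, std::size_t n) {
    constexpr int kAcc = 4;             // compile-time size, well under 32 registers
    constexpr std::size_t kLanes = 16;  // 16 x 32-bit lanes per __m512i
    assert(n % (kAcc * kLanes) == 0);

    __m512i sum[kAcc];                  // local array; the optimizer may keep it in registers
    for (int i = 0; i < kAcc; ++i) {
        sum[i] = _mm512_setzero_si512();  // pre-zero each accumulator
    }

    // Each accumulator handles its own 16-element slice of every block.
    for (std::size_t j = 0; j < n; j += kAcc * kLanes) {
        for (int i = 0; i < kAcc; ++i) {
            // Unaligned load, so data needs no special alignment.
            __m512i v = _mm512_loadu_si512(data + j + i * kLanes);
            sum[i] = _mm512_add_epi32(sum[i], v);
        }
    }

    // Combine the accumulators, then add up the 16 lanes.
    __m512i total = sum[0];
    for (int i = 1; i < kAcc; ++i) {
        total = _mm512_add_epi32(total, sum[i]);
    }
    return _mm512_reduce_add_epi32(total);
}
```

Calling sum_i32(data, 128) on an array holding 0..127 should return their sum on AVX-512 hardware. Keeping the array size a small compile-time constant is what gives the compiler the chance to map each element to its own register.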
Hi Chiungliang,
Could you please elaborate on the issue you are facing? Please also attach the logs and the steps to reproduce it, so that we can investigate.
Regards
Goutham
Hi,
I wrote the code,
compiled it with
"g++-9 *.cpp -march=cascadelake"
and executed it.
I get the error
"Segmentation fault (core dumped)"
There is no other message.
Thanks,
chiungliang
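The thread never confirms the cause of this segmentation fault. One plausible explanation, stated here as an assumption: g++ 9 defaults to -std=gnu++14, where plain operator new does not take the 64-byte alignment of __m512i into account (support for over-aligned new arrived in C++17), so an aligned vector store can land on a misaligned address. A minimal sketch of a heap allocation that requests the alignment explicitly, using _mm_malloc and _mm_free from immintrin.h, is shown below. The function name zero_heap_vectors is hypothetical, and the target attribute (GCC/Clang) is only there so the function compiles without -march flags.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: allocates `size` vectors on the heap with 64-byte
// alignment, zeroes them, and reports whether they read back as all zero.
__attribute__((target("avx512f")))
bool zero_heap_vectors(std::size_t size) {
    // Request 64-byte alignment explicitly instead of relying on operator new.
    __m512i* zero = static_cast<__m512i*>(_mm_malloc(size * sizeof(__m512i), 64));
    if (zero == nullptr) {
        return false;
    }
    assert(reinterpret_cast<std::uintptr_t>(zero) % 64 == 0);

    for (std::size_t i = 0; i < size; ++i) {
        zero[i] = _mm512_setzero_si512();  // store to aligned heap memory
    }

    // Read back: OR everything together; any set bit means a non-zero lane.
    __m512i acc = _mm512_setzero_si512();
    for (std::size_t i = 0; i < size; ++i) {
        acc = _mm512_or_si512(acc, zero[i]);
    }
    bool all_zero = (_mm512_test_epi32_mask(acc, acc) == 0);

    _mm_free(zero);  // memory from _mm_malloc must be released with _mm_free
    return all_zero;
}
```

Under the same assumption, compiling the original code with -std=c++17 would be another way to get suitably aligned storage from new. Either way, heap storage is ordinary memory: the values are loaded into registers only while instructions operate on them.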
Hi Chiungliang,
Could you please try compiling your code with the Intel compiler and let us know whether the problem persists?
Thanks
Goutham
Hi,
Would you please let me know where I can download the Intel compiler?
My OS version is Ubuntu 18.04.
Many thanks,
chiungliang
Hi Chiungliang,
Please install the oneAPI Base Toolkit and the oneAPI HPC Toolkit so that you can use the Intel C++ compiler.
The links below point to the download page and the installation guide for both toolkits.
Download link: https://software.intel.com/en-us/oneapi
Installation Guide: https://software.intel.com/en-us/articles/installation-guide-for-intel-oneapi-toolkits
Let us know if you face any further issues.
Regards
Goutham
Hi Chiungliang,
Please confirm if your issue is resolved.
Thanks
Goutham
Hi Chiungliang,
We are closing this thread.
Please feel free to raise a new thread in case of any further issues.
Thanks
Goutham