How can I run _mm512_popcnt_epi32 on the Colfax KNL 7210? Does the vpopcntd instruction need to be enabled? How can I do this?
I also need to include <iostream> in the C file; however, I am using icc. How can I do this?
    #include <stdio.h>
    #include <mkl.h>
    #include <immintrin.h>
    #include <zmmintrin.h>

    int main(){
        __m512i k, b, c;
        c = _mm512_and_epi32(k, b);
        printf("%d", c);
        int len;
        len = _mm512_popcnt_epi32(c);
        printf("%d\n", len);
    }
I have another question. Does the function _mm512_popcnt_epi32 return a pointer to an array of 16 integers, each holding the population count of one of the 16 integers packed into the __m512i data type? Is assigning the result to an integer incorrect? I thought that the function returned the total popcnt of all 512 bits in the __m512i value.
I am running the code above using the following submission script:
    cd ~/benchmarking/
    icc matmul.c -o mat.out -xMIC-AVX512
    ./mat.out
Is there something I am missing here?
Thank you!
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
According to the table on the AVX-512 Wikipedia page (scroll right to the bottom), which I believe is accurate, the vpopcnt instructions are not implemented in the Xeon Phi 72xx series. So, unless you have a very small soldering iron and an electron microscope to re-engineer the chip :-), you can't do what you want to. Indeed, there seem to be no currently shipping cores that have those instructions.
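On the return-type question from the original post: as far as the Intel Intrinsics Guide describes it, _mm512_popcnt_epi32 takes a __m512i and returns a __m512i holding sixteen per-lane bit counts. It returns neither a pointer nor a single int, so assigning the result to `int len` would not give a total; a total needs a horizontal add such as _mm512_reduce_add_epi32. Since the KNL 7210 cannot execute the instruction anyway, here is a rough scalar sketch of those semantics in plain C. The function names (popcnt_epi32_model, reduce_add_epi32_model, total_popcnt_model) are made up for illustration and are not part of any Intel header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* a __m512i holds sixteen 32-bit elements */

/* Scalar model of _mm512_popcnt_epi32: one bit count per 32-bit lane.
 * (Hypothetical helper, for illustration only.) */
void popcnt_epi32_model(const uint32_t in[LANES], uint32_t out[LANES])
{
    assert(in != NULL && out != NULL);
    for (int i = 0; i < LANES; ++i) {
        uint32_t v = in[i];
        uint32_t count = 0;
        while (v != 0) {      /* each iteration clears the lowest set bit */
            v &= v - 1;
            ++count;
        }
        out[i] = count;
    }
}

/* Scalar model of _mm512_reduce_add_epi32: horizontal sum of all lanes. */
uint32_t reduce_add_epi32_model(const uint32_t v[LANES])
{
    uint32_t sum = 0;
    for (int i = 0; i < LANES; ++i)
        sum += v[i];
    return sum;
}

/* Total number of set bits in the whole 512-bit value.
 * On hardware with AVX512_VPOPCNTDQ the equivalent would be roughly:
 *     __m512i counts = _mm512_popcnt_epi32(c);
 *     int total      = _mm512_reduce_add_epi32(counts);
 */
uint32_t total_popcnt_model(const uint32_t in[LANES])
{
    uint32_t counts[LANES];
    popcnt_epi32_model(in, counts);
    return reduce_add_epi32_model(counts);
}
```

The same per-lane-then-reduce shape also works as a fallback on CPUs without the instruction: store the vector to a 16-element array and run the scalar loop over it.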
Cownie, James H (Intel) wrote:
According to the table on the AVX-512 Wikipedia page (scroll right to the bottom), which I believe is accurate, the vpopcnt instructions are not implemented in the Xeon Phi 72xx series. So, unless you have a very small soldering iron and an electron microscope to re-engineer the chip :-), you can't do what you want to. Indeed, there seem to be no currently shipping cores that have those instructions.
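I was looking at the wiki page, and it says that the VPOPCNTD instruction is in the VPOPCNTDQ extension set. The Skylake-SP and Skylake-X processors (2017) support AVX-512 F, CD, BW, DQ, and VL.
The Xeon Platinum 8180 lists AVX512DQ (AVX-512 Double and Quad) as a supported extension.
Does that mean that the latest Xeon lineup based on the Skylake architecture can run the intrinsic (_mm512_popcnt_epi32)? Doesn't the DevCloud's Xeon Scalable processor family include Xeon Platinum 8180/Skylake CPUs, which support AVX512DQ?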
I was looking at the wiki page, and it says that the VPOPCNTD instruction is in the VPOPCNTDQ extension set. The Skylake-SP and Skylake-X processors (2017) support AVX-512 F, CD, BW, DQ, and VL.
The Xeon Platinum 8180 lists AVX512DQ (AVX-512 Double and Quad) as a supported extension.
Does that mean that the latest Xeon lineup based on the Skylake architecture can run the intrinsic (_mm512_popcnt_epi32)? Doesn't the DevCloud's Xeon Scalable processor family include Xeon Platinum 8180/Skylake CPUs, which support AVX512DQ?
You are confusing the DQ extensions with the VPOPCNTDQ extensions. If you look at that table again, you'll see that it has separate columns for DQ and VPOPCNTDQ.
Cownie, James H (Intel) wrote:
You are confusing the DQ extensions with the VPOPCNTDQ extensions. If you look at that table again, you'll see that it has separate columns for DQ and VPOPCNTDQ.
I apologize; I seem to have missed that in my hurry. I see that VPOPCNTDQ will be included in Knights Mill and Ice Lake. I suppose I will have to wait until Knights Mill arrives later this year.
Thank you for the assistance!
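For anyone who wants to check whether a particular machine reports the VPOPCNTDQ extension discussed above, one option is to query CPUID leaf 7 (sub-leaf 0), where ECX bit 14 is the AVX512_VPOPCNTDQ flag. The sketch below assumes GCC or Clang on x86, since it relies on the __get_cpuid_count helper from their <cpuid.h>; whether icc picks up that header is an assumption to verify. The function names are hypothetical, and the code does not check that the OS has enabled ZMM register state, which a complete check would also need.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper header (toolchain assumption) */
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 14 reports AVX512_VPOPCNTDQ. */
#define VPOPCNTDQ_ECX_BIT 14u

/* Returns true if the given bit of a 32-bit CPUID register value is set.
 * (Hypothetical helper, for illustration only.) */
bool cpuid_bit_set(unsigned int reg, unsigned int bit)
{
    assert(bit < 32u);  /* shifting by 32 or more would be undefined */
    return ((reg >> bit) & 1u) != 0;
}

/* Returns true if the CPU advertises AVX512_VPOPCNTDQ in CPUID.
 * Does NOT verify OS support for ZMM state; on non-x86 builds it
 * simply returns false. */
bool cpu_reports_vpopcntdq(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;   /* leaf 7 not available on this CPU */
    return cpuid_bit_set(ecx, VPOPCNTDQ_ECX_BIT);
#else
    return false;
#endif
}
```

Based on the replies in this thread, a KNL 7210 would be expected to report false here, so code using _mm512_popcnt_epi32 would need a fallback path on that machine.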