Intel® Fortran Compiler

PDEP/PEXT in Fortran

Steven_V_
Beginner

Hi,

Is there any extension available to use the PDEP/PEXT bit operations in Fortran? Is there any plan to add bit manipulations like that to the Fortran standard?

Greetings,

Steven

Steven_L_Intel1

The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?

mecej4
Honored Contributor III

Steve Lionel (Intel) wrote:

The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?

Google yielded this article about these bit-level scatter/gather instructions, made available on Haswell CPUs: http://www.randombit.net/bitbashing/2012/06/22/haswell_bit_permutations.html

Perhaps one should expect the Intel C compiler or the IPP library to support these instructions rather than the Fortran compiler? 
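For readers who, like the poster above, have not met these operations: the following is a minimal portable sketch of what "bit scatter" (PDEP) and "bit gather" (PEXT) compute. The function names `pdep_ref` and `pext_ref` are made up for illustration, and the code is a plain reference loop, not the code from the linked article.

```c
#include <stdint.h>

/* Scatter ("deposit"): the low-order bits of src are placed, in order,
   into the positions where mask has a 1 bit. All other result bits are 0. */
static uint64_t pdep_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    int k = 0;                          /* index of the next source bit to use */
    for (int i = 0; i < 64; ++i) {
        if ((mask >> i) & 1u) {
            result |= ((src >> k) & 1u) << i;
            ++k;
        }
    }
    return result;
}

/* Gather ("extract"): the bits of src at the positions where mask has
   a 1 bit are packed together into the low-order bits of the result. */
static uint64_t pext_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    int k = 0;                          /* index of the next result bit to fill */
    for (int i = 0; i < 64; ++i) {
        if ((mask >> i) & 1u) {
            result |= ((src >> i) & 1u) << k;
            ++k;
        }
    }
    return result;
}
```

With the mask 0xD2 (bits 1, 4, 6 and 7 set), scattering 0xB gives 0x92, and gathering 0x92 with the same mask gives 0xB back.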

Steven_V_
Beginner

I found those pdep/pext instructions when looking for ways to scatter bits. Apparently they are available on certain Intel architectures (https://software.intel.com/en-us/node/514045). However, I did not find any equivalent Fortran intrinsic that does this bit operation.

I'm using bit scattering operations inside an inner loop of a performance-critical section in my Fortran code, and I was wondering if using these instructions might speed up that section.

Steven_L_Intel1

Are you running the program on a Haswell processor? Fortran doesn't have bit gather/scatter intrinsics. You could call out to an Intel C++ routine and use its instruction intrinsics (again for supported processors only.) 

Steven_V_
Beginner

I'm not running the program on a Haswell processor myself, but large calculations are usually run on computer clusters, so I should take into account that it might be the case, and then the possible speed-up is definitely important.

So, if I understand correctly, I should declare a Fortran interface to a C function (using bind(C)), write the C function, compile it with the Intel C compiler, and then link the two together. What I'm wondering is whether the function will be inlined in that case.
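A minimal sketch of the C side of that arrangement, under some assumptions: the routine name `scatter_bits` is hypothetical, arguments are passed by value, and signed 64-bit integers are used in the signature because Fortran has no unsigned integer kind. The Fortran interface in the comment is one way the caller could declare it, not something taken from this thread.

```c
#include <stdint.h>

/*
 * A matching Fortran interface might look like this (an assumption about
 * how the calling code is written):
 *
 *   interface
 *     function scatter_bits(src, mask) bind(C, name="scatter_bits") result(r)
 *       use, intrinsic :: iso_c_binding, only: c_int64_t
 *       integer(c_int64_t), value :: src, mask
 *       integer(c_int64_t) :: r
 *     end function scatter_bits
 *   end interface
 */
int64_t scatter_bits(int64_t src, int64_t mask)
{
    /* Work on unsigned copies so shifts and bit tricks are well defined. */
    uint64_t s = (uint64_t)src, m = (uint64_t)mask, result = 0;

    /* Walk the set bits of the mask from lowest to highest; "bit" selects
       the next source bit to deposit. */
    for (uint64_t bit = 1; m != 0; bit <<= 1) {
        uint64_t lowest = m & (~m + 1);   /* isolate the lowest set bit of m */
        if (s & bit)
            result |= lowest;
        m &= m - 1;                       /* clear that bit of m */
    }
    /* Reinterpret the bit pattern as signed (two's complement assumed). */
    return (int64_t)result;
}
```

Because the body here is generic C, it gives the same result on any processor; whether the call gets inlined into the Fortran loop depends on the compilers and options used.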

Steven_L_Intel1

The only reason to use a C routine is if you want to use the Haswell instruction intrinsics on a Haswell processor. If you aren't running on a processor with that instruction set, you'd just get a runtime error trying to use these intrinsics.

One could write a C routine that used "manual CPU dispatch", executing the Haswell instruction(s) if running on a capable processor or generic code if not. If you built this using Intel C++ and the /Qipo option, the optimizer might inline the call. It IS possible, through an undocumented feature, to use many (but not all) instruction intrinsics from Intel Fortran, but I have not studied these bit intrinsics to see if that's possible. You'd still have to detect the instruction set and execute generic code (which you would have to write.) Given that the randombit.net article provided generic C code, doing this bit (!) in C would make more sense to me.
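As a rough illustration of the "manual CPU dispatch" idea, here is a sketch that assumes a GCC- or Clang-compatible compiler on x86-64; it uses the `__builtin_cpu_supports` builtin and the `target` function attribute from those compilers, which is an assumption on my part and not the Intel C++ mechanism referred to above. On any other platform the sketch simply compiles to the generic path. The names `pdep_dispatch`, `pdep_hw` and `pdep_generic` are hypothetical.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_BMI2_PATH 1

/* Built with BMI2 enabled for this one function only.
   Must be called only after the run-time CPU check below. */
__attribute__((target("bmi2")))
static uint64_t pdep_hw(uint64_t src, uint64_t mask)
{
    return _pdep_u64(src, mask);
}
#endif

/* Generic fallback: deposit the low bits of src into the set bits of mask. */
static uint64_t pdep_generic(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        if (src & bit)
            result |= mask & (~mask + 1);  /* lowest set bit of mask */
        mask &= mask - 1;                  /* clear that bit */
    }
    return result;
}

/* Use the hardware instruction when the running CPU reports BMI2,
   otherwise fall back to the generic loop. */
uint64_t pdep_dispatch(uint64_t src, uint64_t mask)
{
#ifdef HAVE_BMI2_PATH
    if (__builtin_cpu_supports("bmi2"))
        return pdep_hw(src, mask);
#endif
    return pdep_generic(src, mask);
}
```

In a hot inner loop, checking the CPU once at start-up and storing a function pointer would likely be preferable to testing on every call; the per-call check is kept here only to keep the sketch short.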
