Hi,
Is there any extension available for using the PDEP/PEXT bit operations in Fortran? Is there any plan to add bit manipulations like these to the Fortran standard?
Greetings,
Steven
The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?
Steve Lionel (Intel) wrote: "The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?"
Google yielded this article about these bit-level scatter/gather instructions, made available on Haswell CPUs: http://www.randombit.net/bitbashing/2012/06/22/haswell_bit_permutations.html
Perhaps one should expect the Intel C compiler or the IPP library to support these instructions rather than the Fortran compiler?
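For readers who, like Steve, have not met these instructions: the following is a minimal portable C sketch of what bit scatter (PDEP) and bit gather (PEXT) compute. It is not the code from the randombit.net article; the function names `pdep_generic` and `pext_generic` are made up for illustration, and the loop is written for clarity rather than speed.

```c
#include <stdint.h>

/* Scatter (PDEP semantics): take the low-order bits of src, one at a time,
   and deposit them at the positions of the set bits of mask, lowest first. */
uint64_t pdep_generic(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set bit of mask */
        if (src & bit)
            result |= lowest;
        mask &= mask - 1;                      /* clear that bit and move on */
    }
    return result;
}

/* Gather (PEXT semantics): collect the bits of src found at the set
   positions of mask and pack them into the low-order bits of the result. */
uint64_t pext_generic(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & lowest)
            result |= bit;
        mask &= mask - 1;
    }
    return result;
}
```

For example, scattering the value 0x5 (binary 101) into the mask 0x1A (binary 11010) places its three bits at positions 1, 3 and 4, giving 0x12; gathering 0x12 with the same mask returns 0x5.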
I found the pdep/pext instructions when looking for ways to scatter bits. Apparently they are available on certain Intel architectures (https://software.intel.com/en-us/node/514045). However, I did not find any equivalent Fortran intrinsic that performs this bit operation.
I'm using bit scattering operations inside an inner loop of a performance-critical section in my Fortran code, and I was wondering if using these instructions might speed up that section.
Are you running the program on a Haswell processor? Fortran doesn't have bit gather/scatter intrinsics. You could call out to an Intel C++ routine and use its instruction intrinsics (again for supported processors only.)
I'm not running the program on a Haswell processor myself, but large calculations are usually run on computer clusters, so I should take into account that it might be the case, and then the possible speed-up is definitely important.
So, if I understand correctly, I should declare a Fortran interface to a C function (using bind(C)), write the C function, compile it with the Intel C compiler, and then link. My only question is whether the function would be inlined in that case.
The only reason to use a C routine is if you want to use the Haswell instruction intrinsics on a Haswell processor. If you aren't running on a processor with that instruction set, you'd just get a runtime error (typically an illegal-instruction fault) trying to use these intrinsics.
One could write a C routine that used "manual CPU dispatch", executing the Haswell instruction(s) if running on a capable processor or generic code if not. If you built this using Intel C++ and the /Qipo option, the optimizer might inline the call. It IS possible, through an undocumented feature, to use many (but not all) instruction intrinsics from Intel Fortran, but I have not studied these bit intrinsics to see if that's possible. You'd still have to detect the instruction set and execute generic code (which you would have to write). Given that the randombit.net article provided generic C code, doing this bit (!) in C would make more sense to me.
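The "manual CPU dispatch" idea could look roughly like the sketch below. This is only one way to do it and rests on several assumptions: it relies on the GCC/Clang extensions `__attribute__((target("bmi2")))` and `__builtin_cpu_supports` rather than anything specific to Intel C++, it is limited to x86-64, and the names `scatter_bits`, `scatter_bits_bmi2` and `scatter_bits_generic` are hypothetical. On other compilers or architectures it simply falls back to the generic loop.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_BMI2_DISPATCH 1

/* Built with BMI2 enabled for this one function only, so the rest of the
   file stays runnable on older CPUs. Call it only after the CPU check. */
__attribute__((target("bmi2")))
static uint64_t scatter_bits_bmi2(uint64_t src, uint64_t mask)
{
    return _pdep_u64(src, mask);   /* PDEP instruction intrinsic */
}
#endif

/* Generic fallback: deposit the low bits of src at the set bits of mask. */
static uint64_t scatter_bits_generic(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        if (src & bit)
            result |= mask & (~mask + 1);  /* lowest set bit of mask */
        mask &= mask - 1;                  /* clear that bit */
    }
    return result;
}

/* Entry point with external linkage and by-value arguments, so that a
   Fortran bind(C) interface with integer(c_int64_t), value arguments
   could refer to it (assumption: the bit pattern is what matters, not
   the signedness). */
uint64_t scatter_bits(uint64_t src, uint64_t mask)
{
#ifdef HAVE_BMI2_DISPATCH
    if (__builtin_cpu_supports("bmi2"))    /* runtime check of the CPU */
        return scatter_bits_bmi2(src, mask);
#endif
    return scatter_bits_generic(src, mask);
}
```

Checking the CPU on every call adds a branch to the inner loop; if that matters, the check could be done once and the result cached in a function pointer. Whether the call is inlined into the Fortran caller still depends on the compiler and on cross-file optimization, as noted above.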
