Hi,
Is there any extension available for using the PDEP/PEXT bit operations in Fortran? Are there any plans to add bit manipulations like these to the Fortran standard?
Greetings,
Steven
The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?
Steve Lionel (Intel) wrote: "The Fortran standard has lots of intrinsics for bit manipulation. I haven't heard of PDEP/PEXT - what are these?"
Google yielded this article about these bit-level scatter/gather instructions, made available on Haswell CPUs: http://www.randombit.net/bitbashing/2012/06/22/haswell_bit_permutations.html
Perhaps one should expect the Intel C compiler or the IPP library to support these instructions rather than the Fortran compiler?
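For readers who have not met these instructions: PDEP scatters the low-order bits of a source word to the positions of the set bits in a mask, and PEXT gathers the bits at the mask positions into the low-order bits of the result. Below is a minimal portable sketch of both in C. It is not the code from the linked article; the names `pdep64_soft` and `pext64_soft` are made up for illustration.

```c
#include <stdint.h>

/* Software PDEP: bit i of src goes to the position of the i-th set bit of mask. */
uint64_t pdep64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* isolate lowest set bit of mask */
        if (src & bit)
            result |= lowest;
        mask &= mask - 1;                     /* clear that bit and move on */
    }
    return result;
}

/* Software PEXT: the bit of src at the i-th set bit of mask goes to bit i. */
uint64_t pext64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & lowest)
            result |= bit;
        mask &= mask - 1;
    }
    return result;
}
```

For example, with mask 0x1A (bits 1, 3 and 4 set), `pdep64_soft(0x5, 0x1A)` gives 0x12 and `pext64_soft(0x12, 0x1A)` gives 0x5 back. Each loop runs once per set bit in the mask, so the cost depends on the mask, whereas the hardware instruction is a single operation.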
I found those pdep/pext instructions when looking for ways to scatter bits. Apparently they are available on certain Intel architectures (https://software.intel.com/en-us/node/514045). However, I did not find any equivalent Fortran intrinsic that performs this bit operation.
I'm using bit scattering operations inside an inner loop of a performance-critical section in my Fortran code, and I was wondering if using these instructions might speed up that section.
Are you running the program on a Haswell processor? Fortran doesn't have bit gather/scatter intrinsics. You could call out to an Intel C++ routine and use its instruction intrinsics (again, for supported processors only).
I'm not running the program on a Haswell processor myself, but large calculations are usually run on computer clusters, so I should take into account that it might be the case, and then the possible speed-up is definitely important.
So, if I understand correctly, I should create a Fortran function which calls a C function (using bind), then write the C function, compile it with the Intel C compiler, and link. My only question is whether the function would be inlined in that case.
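The C side of that arrangement might look roughly like the sketch below. The name `scatter_bits` and the Fortran interface in the comment are hypothetical. Fortran has no unsigned integers, so the sketch takes signed 64-bit arguments and works on their bit patterns; the body is a plain portable loop rather than the hardware instruction.

```c
#include <stdint.h>

/* Hypothetical matching Fortran interface:
 *
 *   interface
 *     function scatter_bits(src, mask) bind(C, name="scatter_bits") result(r)
 *       use, intrinsic :: iso_c_binding, only: c_int64_t
 *       integer(c_int64_t), value :: src, mask
 *       integer(c_int64_t) :: r
 *     end function scatter_bits
 *   end interface
 */
int64_t scatter_bits(int64_t src, int64_t mask)
{
    uint64_t s = (uint64_t)src;   /* reinterpret as unsigned bit patterns */
    uint64_t m = (uint64_t)mask;
    uint64_t result = 0;

    for (uint64_t bit = 1; m != 0; bit <<= 1) {
        uint64_t lowest = m & (~m + 1); /* lowest set bit of the mask */
        if (s & bit)
            result |= lowest;
        m &= m - 1;                     /* clear it */
    }
    /* Conversion back to signed is implementation-defined before C23 for
     * values with the top bit set; common compilers simply wrap. */
    return (int64_t)result;
}
```

The `value` attribute in the interface matters: without it Fortran passes arguments by reference, and the C function would need pointer parameters instead.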
The only reason to use a C routine is if you want to use the Haswell instruction intrinsics on a Haswell processor. If you aren't running on a processor with that instruction set, you'd just get a runtime error trying to use these intrinsics.
One could write a C routine that used "manual CPU dispatch", executing the Haswell instruction(s) if running on a capable processor or generic code if not. If you built this using Intel C++ and the /Qipo option, the optimizer might inline the call. It IS possible, through an undocumented feature, to use many (but not all) instruction intrinsics from Intel Fortran, but I have not studied these bit intrinsics to see if that's possible. You'd still have to detect the instruction set and execute generic code (which you would have to write). Given that the randombit.net article provided generic C code, doing this bit (!) in C would make more sense to me.
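A minimal sketch of what such manual CPU dispatch could look like is given below. It assumes GCC or Clang on x86-64 and uses their `__builtin_cpu_supports` check and `target` attribute; the Intel C++ compiler has its own mechanisms, which are not shown here. The function names are hypothetical, and on any other compiler or architecture the sketch simply falls back to the generic loop.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_BMI2_DISPATCH 1

/* Built with BMI2 enabled for this one function only.
 * Must not be called unless the running CPU supports BMI2. */
__attribute__((target("bmi2")))
static uint64_t pdep64_hw(uint64_t src, uint64_t mask)
{
    return _pdep_u64(src, mask);
}
#endif

/* Generic fallback: one loop iteration per set bit in the mask. */
static uint64_t pdep64_generic(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* lowest set bit of mask */
        if (src & bit)
            result |= lowest;
        mask &= mask - 1;
    }
    return result;
}

/* Entry point: pick the hardware path at run time when it is available. */
uint64_t pdep64(uint64_t src, uint64_t mask)
{
#ifdef HAVE_BMI2_DISPATCH
    if (__builtin_cpu_supports("bmi2"))
        return pdep64_hw(src, mask);
#endif
    return pdep64_generic(src, mask);
}
```

In a hot inner loop it would probably be better to do the feature check once up front (for example, by storing a function pointer) rather than on every call, since the per-call branch may cost a noticeable fraction of the time saved by the instruction.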
