cpu_dispatch is a nice feature.
How can I use cpu_dispatch to differentiate architectures where the shld instruction is faster (1 cycle on SNB only) from those where rotl is faster (all previous and following CPU types, like NHM and IVB)?
How can I get the list of supported identifiers, which seem to follow an exotic naming convention ("core_2nd_gen_avx", "core_i7_sse4_2", ...)?
Is it possible to use documented CPUID flags in the cpu_dispatch clause (like sse4.2, aes, pclmul, rtm) rather than undocumented names?
There are several new intrinsics:
extern void __ICL_INTRINCC _allow_cpu_features(unsigned __int64);
extern int __ICL_INTRINCC _may_i_use_cpu_feature(unsigned __int64);
Please see immintrin.h for details about the possible parameter values, and see whether it is detailed enough for your needs. Also see this article about "_allow_cpu_features()".
Jennifer
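A minimal sketch of how the second intrinsic might be used to pick a code path by feature rather than by CPU name. It assumes the `_FEATURE_SSE4_2` and `_FEATURE_AES` constants from the Intel compiler's immintrin.h. The helper name `has_sse42_and_aes` is hypothetical, and the non-Intel branch (GCC/Clang `__builtin_cpu_supports`) is a stand-in added only so the sketch builds elsewhere.

```c
#if defined(__INTEL_COMPILER)
#include <immintrin.h>
#endif

/* Hypothetical helper: returns 1 if both SSE4.2 and AES-NI are usable, else 0. */
int has_sse42_and_aes(void)
{
#if defined(__INTEL_COMPILER)
    /* Intel compiler path: feature bits are OR-ed into a single mask. */
    return _may_i_use_cpu_feature(_FEATURE_SSE4_2 | _FEATURE_AES) ? 1 : 0;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    /* Stand-in for other compilers (an assumption, not the Intel API). */
    return (__builtin_cpu_supports("sse4.2") &&
            __builtin_cpu_supports("aes")) ? 1 : 0;
#else
    return 0; /* not an x86 target: report the features as absent */
#endif
}
```

A caller would test `has_sse42_and_aes()` once and branch to the tuned or the generic routine accordingly.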
+1 from me
+1 for implementation under Windows too.
iliyapolak wrote:
Does it help?
Thank you for the great article and for your effort. Just out of curiosity, how did you find this? Which Google keywords did you use, or was it from memory?
You are welcome.
I simply entered the keyword __declspec(cpu_dispatch()) in the Google search bar.
By the way, I have never used the __declspec() keyword with the cpu_dispatch() modifier. I simply query CPUID for the existence of a specific technology (SSEn, AVX).
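A minimal sketch of querying CPUID directly for documented feature flags, as described above. It assumes the GCC/Clang `<cpuid.h>` helper `__get_cpuid` (other compilers expose CPUID differently). The helper names `cpuid_leaf1_ecx` and `cpu_has` are hypothetical. The bit positions are those of CPUID leaf 1, register ECX.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper header (assumption) */
#endif

/* Feature bits in CPUID leaf 1, register ECX. */
#define CPUID1_ECX_PCLMUL (1u << 1)
#define CPUID1_ECX_SSE4_2 (1u << 20)
#define CPUID1_ECX_AES    (1u << 25)

/* Hypothetical helper: returns leaf-1 ECX, or 0 if CPUID is unavailable. */
uint32_t cpuid_leaf1_ecx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx;
#endif
    return 0;
}

/* Hypothetical helper: 1 if every bit in mask is set in leaf-1 ECX, else 0. */
int cpu_has(uint32_t mask)
{
    return (cpuid_leaf1_ecx() & mask) == mask;
}
```

For example, `cpu_has(CPUID1_ECX_SSE4_2 | CPUID1_ECX_AES)` checks both flags in one call. AVX and later extensions also need an operating-system check (OSXSAVE plus XGETBV), which this sketch leaves out.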
Thanks, this is the right link for cpu_dispatch(figure_out_the_list).
It does not help to figure out at run time whether the processor has a fast shld or a fast rotl (word rotation is heavily used in cryptography).
It does not help with 4th-generation processors where AES has been disabled for export: no way to use AVX2 instructions there.
More generally, there is a complete lack of granularity for options not enabled in virtualized environments, options fused out, options disabled in the BIOS, and so on.
Using CPUID every time is a killer for performance. When we use highly optimized architecture-specific code to save cycles, it is not a good idea to waste them again on a serializing instruction.
Looking at the implementation, the code generated by the ICC compiler is suboptimal: it is based on testing a constant that is initialized once, and when there are 5 or 6 versions for the IA architecture (generic C, assembler, SSE2, AVX, AVX2, with and without AES-NI), the last architecture in the list bears the cost of all the tests. Something like the following, written in C, generates "better" logic.
// In a C file
fn_sse(args) {
    // SSE version
}
fn_generic(args) {
    // generic version
}
fn_check_and_generic(args)
{
    // select the right pointer on first use
    if (cpuid(sse)) fn_pointer = fn_sse;
    else fn_pointer = fn_generic;
    fn_pointer(args);   // run the version just selected
}
fn_pointer = fn_check_and_generic;   // initial value of the function pointer
And in a header file:
extern fn_pointer;
inline fn(args) {
    fn_pointer(args);   // dispatch here; the first call changes the pointer
}
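The pointer-swapping idea above can be sketched as compilable C. This is only an illustration under assumptions: the names `rotl_fn`, `rotl_generic`, `rotl_sse42`, `rotl_resolve`, `rotl_ptr` and `rotl32` are hypothetical, the "tuned" version is a placeholder that reuses the generic body, and the feature test uses GCC/Clang's `__builtin_cpu_supports` rather than anything from the Intel compiler.

```c
#include <stdint.h>

typedef uint32_t (*rotl_fn)(uint32_t, unsigned);

/* Portable 32-bit rotate left; n is reduced modulo 32. */
uint32_t rotl_generic(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Placeholder for an architecture-specific version. */
uint32_t rotl_sse42(uint32_t x, unsigned n)
{
    return rotl_generic(x, n);
}

uint32_t rotl_resolve(uint32_t x, unsigned n);

/* Starts at the resolver; the first call overwrites it. */
rotl_fn rotl_ptr = rotl_resolve;

uint32_t rotl_resolve(uint32_t x, unsigned n)
{
    rotl_fn chosen = rotl_generic;
#if (defined(__GNUC__) || defined(__clang__)) && !defined(__INTEL_COMPILER) && \
    (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("sse4.2"))
        chosen = rotl_sse42;
#endif
    rotl_ptr = chosen;   /* later calls skip the feature test entirely */
    return chosen(x, n);
}

/* What the header would expose: one indirect call, no per-call test. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    return rotl_ptr(x, n);
}
```

After the first call, every version costs the same single indirect call, whatever its position in the list. The pointer update is not synchronized, so multithreaded code would want an atomic store or a one-time initialization at startup.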
Thanks Jennifer for this valuable information.
iliyapolak wrote:
You are welcome.
I simply entered the keyword __declspec(cpu_dispatch()) in the Google search bar.
By the way, I have never used the __declspec() keyword with the cpu_dispatch() modifier. I simply query CPUID for the existence of a specific technology (SSEn, AVX).
Could you share how you do so?
Thank You.
Jennifer J. (Intel) wrote:
There are several new intrinsics:
extern void __ICL_INTRINCC _allow_cpu_features(unsigned __int64);
extern int __ICL_INTRINCC _may_i_use_cpu_feature(unsigned __int64);
Please see immintrin.h for details about the possible parameter values, and see whether it is detailed enough for your needs. Also see this article about "_allow_cpu_features()".
Jennifer
Hi Jennifer,
Could it be used for run-time optimization?
Namely, will the compiler create a few code paths and choose one by features (not by specific CPU, like the other attribute) at run time?