Is there any way we could target specific instruction sets instead of specific micro-architectures using the same dispatching technique (__declspec(cpu_specific) and __declspec(cpu_dispatch)) which already exists in the compiler?
For example:
__declspec(cpu_specific(SSE2))
void foo(void)
{
// SSE2 code here
}
And
__declspec(cpu_specific(SSE4.2))
void foo(void)
{
// SSE4.2 code here
}
And then
__declspec(cpu_dispatch(SSE2, SSE4.2))
void foo(void)
{
}
I understand that SSE4.2 is not (yet) supported by non-Intel CPUs, but SSE2 is.
I can always write my own dispatcher based on CPUID testing, but native support in the compiler would be much easier to work with.
Am I missing something here?
Maybe I am mistaken, but if I understand manual dispatch correctly, when I write __declspec(cpu_specific(pentium_4)) the function will only execute on Pentium 4 CPUs, because manual dispatch targets the micro-architecture (0Fxx) and not the instruction set extension (SSE2).
If I only used SSE2 code to write the function, I would prefer that the function be executed on any CPU which supports SSE2.
Currently, the only way I am aware of to accomplish that is to write my own CPUID checking and dispatching code. Is there any better solution you could suggest?
cpu_specific seems to have been dropped from the current docs (at least, I can't find it there).
Perhaps your point is that it was never updated to use the -xSSExxx notations rather than specific early CPU models.
It is confusing to me as well, so it is perhaps better to rephrase it.
- I would like to create an application which will have SSE2, SSSE3, and SSE4.1 code paths for critical functions.
- I will write each of those code paths manually using inline assembler.
- I would like to use manual dispatch provided by the compiler.
The question is:
How can I manually dispatch SSE2 code so that it executes on every CPU that supports SSE2, including, for example, the AMD Athlon 64?
Note that I do not want to execute the SSE2 code on an SSSE3-capable CPU.
It is true that cpu_specific was not updated to the new notation. My guess is that this is because cpu_specific still targets a micro-architecture rather than a specific instruction set extension.
Hope I explained it now.
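The requirement described above (every CPU runs only its best available path, so SSE2 code never runs on an SSSE3-capable CPU) amounts to testing features from newest to oldest and stopping at the first match. A minimal sketch of that selection rule in C follows; `code_path` and `pick_path` are hypothetical names used only for illustration, and the three flags are assumed to come from whatever CPUID check you trust.

```c
/* Hypothetical names for illustration; not part of any compiler API. */
typedef enum {
    PATH_GENERIC,   /* plain C fallback, no SIMD assumed */
    PATH_SSE2,
    PATH_SSSE3,
    PATH_SSE41
} code_path;

/* Pick exactly one path: the newest instruction set the CPU reports.
 * Each argument is nonzero if the corresponding feature is available. */
static code_path pick_path(int has_sse2, int has_ssse3, int has_sse41)
{
    if (has_sse41) return PATH_SSE41;
    if (has_ssse3) return PATH_SSSE3;
    if (has_sse2)  return PATH_SSE2;   /* e.g. an Athlon 64: SSE2 but no SSSE3 */
    return PATH_GENERIC;
}
```

With this ordering a CPU that reports SSE2 only gets `PATH_SSE2`, while a CPU that also reports SSSE3 gets `PATH_SSSE3` and never the SSE2 path.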
I think it may be best to create global variables or a global bit mask, then write a CPU-specific filter. Use your own CPUID code (or one you trust) to poll the system for supported features.
Then use a functor (here, a plain function pointer) to dispatch to your functions.
Initialize the functor for each function to point to a platform determination filter:
typedef void void_fn_void(void);
void fooSelect(void);
void_fn_void* foo = fooSelect;
...
void fooSelect(void)
{
    if (HasAVX())
        foo = fooAVX;        // assign the function's address; do not call it here
    else if (HasSSE4_1())
        foo = fooSSE4_1;
    ...
    else if (HasMMX())
        foo = fooMMX;
    // the last branch should be an unconditional fallback so foo never stays fooSelect
    foo();                   // forward this first call to the selected version
}
...
foo(); // 1st time calls fooSelect()
foo(); // remaining calls go to the appropriate foo without feature filtering
It will mean some work on your part, at least until the compiler writers convert from "which processor" to "which feature".
Maybe you need to push for
__declspec(feature_specific(SSE3)) // ...
Jim Dempsey
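A compilable sketch of the pattern above, assuming GCC or Clang on x86. The post leaves the HasAVX()-style helpers undefined, so `__builtin_cpu_supports` stands in for them here; that substitution is an assumption, not part of the original suggestion. The fooXxx bodies are placeholders that only record which path ran, and `foo_path` is a hypothetical variable added for illustration.

```c
typedef void void_fn_void(void);

/* Records which implementation ran (illustration only). */
static const char *foo_path = "none";

/* Placeholder implementations; real ones would hold the hand-written code. */
static void fooAVX(void)     { foo_path = "AVX"; }
static void fooSSE4_1(void)  { foo_path = "SSE4.1"; }
static void fooSSE2(void)    { foo_path = "SSE2"; }
static void fooGeneric(void) { foo_path = "generic"; }

static void fooSelect(void);

/* Callers always go through this pointer. It starts at the selector. */
static void_fn_void *foo = fooSelect;

static void fooSelect(void)
{
    __builtin_cpu_init();   /* GCC/Clang builtin: initialize CPU feature data */

    /* Newest feature first, so each CPU gets exactly one path. */
    if (__builtin_cpu_supports("avx"))
        foo = fooAVX;
    else if (__builtin_cpu_supports("sse4.1"))
        foo = fooSSE4_1;
    else if (__builtin_cpu_supports("sse2"))
        foo = fooSSE2;
    else
        foo = fooGeneric;   /* unconditional fallback: foo never stays fooSelect */

    foo();                  /* forward the first call to the chosen version */
}
```

The first `foo();` runs the selector and repoints `foo`; every later `foo();` is a single indirect call with no feature test. If several threads may make the first call at once, the pointer update would need to be made safe, which this sketch does not attempt.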
Indeed, I think it is time for a __declspec(feature_specific(SSE3)) dispatcher. An architecture-specific dispatcher doesn't make sense anymore, and it should be deprecated in favor of this one.
I will make a feature request for it on Premier Support because I really do not feel like writing and maintaining that code for God knows how many functions when the tool could do that for me.
Jim,
Is there a simple header which has functions like 'hasSSE4()', 'hasAVX()', 'hasAVX2()', etc.?
A header which is portable (Windows, macOS, Linux)?
Thank you.
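One possible shape for such a header, as a hedged sketch rather than an existing library: it assumes an x86 target and wraps the two common ways of issuing CPUID (`__cpuidex` from `<intrin.h>` on MSVC-compatible compilers, the `__cpuid_count` macro from `<cpuid.h>` on GCC and Clang, which covers Linux and macOS). The file name `cpu_features.h` and the helpers `cpuid_query` and `read_xcr0` are hypothetical. AVX needs an extra check that the operating system saves the wider registers, which is why `hasAVX()` also reads XCR0.

```c
/* cpu_features.h: hypothetical header name; a sketch, x86 targets only. */
#ifndef CPU_FEATURES_H
#define CPU_FEATURES_H

#if defined(_MSC_VER)
#include <intrin.h>
#include <immintrin.h>
static void cpuid_query(unsigned leaf, unsigned subleaf, unsigned r[4])
{
    int regs[4];
    __cpuidex(regs, (int)leaf, (int)subleaf);
    r[0] = (unsigned)regs[0]; r[1] = (unsigned)regs[1];
    r[2] = (unsigned)regs[2]; r[3] = (unsigned)regs[3];
}
static unsigned long long read_xcr0(void) { return _xgetbv(0); }
#else
#include <cpuid.h>
static void cpuid_query(unsigned leaf, unsigned subleaf, unsigned r[4])
{
    __cpuid_count(leaf, subleaf, r[0], r[1], r[2], r[3]);
}
static unsigned long long read_xcr0(void)
{
    unsigned lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((unsigned long long)hi << 32) | lo;
}
#endif

/* r[0..3] hold EAX, EBX, ECX, EDX. Leaf 1 carries the SSE-family bits. */
static int hasSSE2(void)  { unsigned r[4]; cpuid_query(1, 0, r); return (r[3] >> 26) & 1; }
static int hasSSSE3(void) { unsigned r[4]; cpuid_query(1, 0, r); return (r[2] >> 9)  & 1; }
static int hasSSE41(void) { unsigned r[4]; cpuid_query(1, 0, r); return (r[2] >> 19) & 1; }
static int hasSSE42(void) { unsigned r[4]; cpuid_query(1, 0, r); return (r[2] >> 20) & 1; }

static int hasAVX(void)
{
    unsigned r[4];
    cpuid_query(1, 0, r);
    /* Need OSXSAVE (ECX bit 27) and AVX (ECX bit 28) from the CPU... */
    if (((r[2] >> 27) & 1) == 0 || ((r[2] >> 28) & 1) == 0)
        return 0;
    /* ...and XCR0 bits 1 and 2, meaning the OS saves XMM and YMM state. */
    return (read_xcr0() & 0x6) == 0x6;
}

static int hasAVX2(void)
{
    unsigned r[4];
    if (!hasAVX())
        return 0;
    cpuid_query(0, 0, r);           /* EAX = highest supported basic leaf */
    if (r[0] < 7)
        return 0;
    cpuid_query(7, 0, r);
    return (r[1] >> 5) & 1;         /* leaf 7, subleaf 0, EBX bit 5 */
}

#endif /* CPU_FEATURES_H */
```

Each function re-runs CPUID on every call, which keeps the sketch short; in a dispatcher like the one discussed earlier in the thread the results would normally be read once at startup and cached.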
