Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

__declspec(cpu_specific)

levicki
Valued Contributor I
Hi Intel compiler engineers :)

Is there any way we could target specific instruction sets instead of specific micro-architectures using the same dispatching technique (__declspec(cpu_specific) and __declspec(cpu_dispatch)) which already exists in the compiler?

For example:

__declspec(cpu_specific(SSE2))
void foo(void)
{
// SSE2 code here
}

And

__declspec(cpu_specific(SSE4.2))
void foo(void)
{
// SSE4.2 code here
}


And then

__declspec(cpu_dispatch(SSE2, SSE4.2))
void foo(void)
{
}

I understand that SSE4.2 is not (yet) supported by non-Intel CPUs, but SSE2 is.

I can always write my own dispatcher based on CPUID testing, but native support in the compiler would be much easier to work with.
Om_S_Intel
Employee
The micro-architecture enhancements are exposed through the -xSSE2, -xSSE3, -xSSE4.2, and -xAVX compiler options. Your proposed feature request is the same as manual processor dispatch, which you already know about.

Am I missing something here?

levicki
Valued Contributor I
Hi Om,

Maybe I am mistaken, but if I understand manual dispatch correctly, when I write __declspec(cpu_specific(pentium_4)) the function will execute only on Pentium 4 CPUs, because manual dispatch targets a micro-architecture (0Fxx) and not an instruction set extension (SSE2).

If I only used SSE2 code to write the function, I would prefer that the function gets executed on any CPU which supports SSE2.

Currently, the only way I am aware of to accomplish that is to write my own CPUID checking and dispatching code. Is there any better solution you could suggest?
TimP
Honored Contributor III
I'm confused also as to what the goal might be. Do you want to supply, for example, both SSE2 and SSSE3 code, but execute SSE2 on a CPU which can accept either but runs better with SSE2?
cpu_specific seems to have been dropped from the current docs (at least, I can't find it there).
Perhaps your point is that it was never updated to use the -xSSExxx notations rather than specific early CPU models.
levicki
Valued Contributor I
Hi Tim,

It is confusing to me as well, so it is perhaps better to rephrase it.

- I would like to create an application which will have SSE2, SSSE3, and SSE4.1 code paths for critical functions.
- I will write each of those code paths manually using inline assembler.
- I would like to use manual dispatch provided by the compiler.

The question is:

How can I manually dispatch SSE2 code to be executed on every CPU that supports SSE2 including for example AMD Athlon 64?

Note that I do not want to execute SSE2 code on SSSE3 capable CPU.
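The requirement above (run the SSE2 path on any SSE2-capable CPU, but never on one that also supports SSSE3) comes down to testing features from newest to oldest. Here is a minimal sketch of that selection logic. The names `Feature`, `Path`, and `selectPath` are hypothetical and chosen only for illustration, and the feature mask is assumed to have been filled in elsewhere from CPUID.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; a real program would fill the mask from CPUID.
enum Feature : std::uint32_t {
    kSSE2   = 1u << 0,
    kSSSE3  = 1u << 1,
    kSSE4_1 = 1u << 2,
};

enum class Path { Generic, SSE2, SSSE3, SSE4_1 };

// Pick the most capable path the mask allows, testing the newest feature
// first, so an SSSE3-capable CPU never falls back to the SSE2 version.
constexpr Path selectPath(std::uint32_t features) {
    if (features & kSSE4_1) return Path::SSE4_1;
    if (features & kSSSE3)  return Path::SSSE3;
    if (features & kSSE2)   return Path::SSE2;
    return Path::Generic;
}

// A CPU that reports SSE2 but not SSSE3 (the Athlon 64 case from the
// question; the mask value is assumed here) takes the SSE2 path.
static_assert(selectPath(kSSE2) == Path::SSE2, "SSE2-only CPU uses SSE2 path");
```

Because the choice depends only on feature bits, the CPU vendor and micro-architecture never enter into it.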

It is true that cpu_specific was not updated to the new notation. My guess is that this is because cpu_specific still targets a micro-architecture rather than a specific instruction set extension.

Hope I explained it now.
levicki
Valued Contributor I
I have explained what I need to do. Does anyone have any ideas on how to accomplish it using the Intel Compiler, or do I really have to write my own dispatcher?
jimdempseyatthecove
Honored Contributor III
Igor,

I think it may be best to create global variables or a global bit mask, then write a CPU-specific filter. Use your own CPUID code (or one you trust) to poll the system for supported features.

Then use a functor (here, a plain function pointer) to dispatch to your functions.
Initialize the functor for each function to point to a platform determination filter:

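typedef void void_fn_void();
void fooSelect();
void_fn_void* foo = fooSelect; // starts out pointing at the selector
...
void fooSelect()
{
    // Test the most capable feature first. Assign the function itself
    // (no parentheses); do not call it here.
    if (HasAVX())
        foo = fooAVX;
    else if (HasSSE4_1())
        foo = fooSSE4_1;
    ...
    else if (HasMMX())
        foo = fooMMX;
    foo(); // forward this first call to the selected version
}
...
foo(); // 1st time calls fooSelect()
foo(); // remaining calls go straight to the selected foo, without feature filtering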

It will mean some work on your part, at least until the compiler writers convert from "which processor" to "which feature".

Maybe you need to push for

__declspec(feature_specific(SSE3)) // ...

Jim Dempsey
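For reference, the self-replacing function pointer described in this post can be written out as a small compilable sketch. Everything beyond the `foo` / `fooSelect` pattern is an assumption made for illustration: the `HasSSE4_1()` and `HasSSE2()` probes are hard-coded stand-ins for real CPUID checks, and `lastPath` and `selectorCalls` exist only to make the behavior observable.

```cpp
#include <cassert>

// Hypothetical feature probes: hard-coded stand-ins for real CPUID checks.
bool HasSSE4_1() { return false; }
bool HasSSE2()   { return true; }

int lastPath = 0;      // records which version ran (illustration only)
int selectorCalls = 0; // counts how often the selector ran

void fooSSE4_1()  { lastPath = 41; }
void fooSSE2()    { lastPath = 2; }
void fooGeneric() { lastPath = 1; }

void fooSelect();

using void_fn_void = void();
void_fn_void* foo = fooSelect; // the first call lands in the selector

void fooSelect() {
    ++selectorCalls;
    // Newest feature first; assign the function, do not call it.
    if (HasSSE4_1())    foo = fooSSE4_1;
    else if (HasSSE2()) foo = fooSSE2;
    else                foo = fooGeneric;
    foo(); // forward this first call to the chosen version
}
```

With these stand-ins, the first `foo()` runs the selector once and then `fooSSE2`; every later `foo()` is a plain indirect call with no feature test. The sketch does not guard against concurrent first calls, so a multithreaded program would probably want to resolve the pointer before starting threads or make it atomic.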
levicki
Valued Contributor I
Hello Jim,

Indeed, I think it is time to make a __declspec(feature_specific(SSE3)) dispatcher. An architecture-specific dispatcher doesn't make sense anymore, and it should be deprecated in favor of this one.

I will make a feature request for it on Premier Support because I really do not feel like writing and maintaining that code for God knows how many functions when the tool could do that for me.
Royi
Novice
Jim,

Is there a simple header that provides functions like 'hasSSE4()', 'hasAVX()', 'hasAVX2()', etc.?
Ideally one that is portable (Windows, macOS, Linux)?

Thank You.
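The thread does not name an existing header, so what follows is only a rough sketch of what one could look like, not a tested library. It assumes an x86 target and uses `__cpuidex` / `_xgetbv` from the MSVC-style intrinsics headers and `__cpuid_count` from the `<cpuid.h>` header shipped with GCC and Clang. The function names follow the question and are otherwise hypothetical. On non-x86 targets (for example Apple Silicon) every query simply returns false.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #include <immintrin.h>
  #define CPUFEAT_X86 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  #include <cpuid.h>
  #define CPUFEAT_X86 1
#else
  #define CPUFEAT_X86 0
#endif

struct CpuidRegs { std::uint32_t eax, ebx, ecx, edx; };

// Run CPUID for one leaf/subleaf; returns all zeros on non-x86 targets.
inline CpuidRegs cpuidQuery(std::uint32_t leaf, std::uint32_t subleaf) {
    CpuidRegs r = {0, 0, 0, 0};
#if CPUFEAT_X86 && defined(_MSC_VER)
    int v[4];
    __cpuidex(v, static_cast<int>(leaf), static_cast<int>(subleaf));
    r.eax = static_cast<std::uint32_t>(v[0]);
    r.ebx = static_cast<std::uint32_t>(v[1]);
    r.ecx = static_cast<std::uint32_t>(v[2]);
    r.edx = static_cast<std::uint32_t>(v[3]);
#elif CPUFEAT_X86
    __cpuid_count(leaf, subleaf, r.eax, r.ebx, r.ecx, r.edx);
#else
    (void)leaf; (void)subleaf;
#endif
    return r;
}

// Read XCR0. Only call this after CPUID has reported OSXSAVE.
inline std::uint64_t readXcr0() {
#if CPUFEAT_X86 && defined(_MSC_VER)
    return _xgetbv(0);
#elif CPUFEAT_X86
    std::uint32_t lo, hi;
    // Raw encoding of XGETBV, so no extra compiler flag is required.
    __asm__ volatile(".byte 0x0f, 0x01, 0xd0" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    return 0;
#endif
}

inline std::uint32_t maxLeaf() { return cpuidQuery(0, 0).eax; }
inline CpuidRegs leaf1() {
    return maxLeaf() >= 1 ? cpuidQuery(1, 0) : CpuidRegs{0, 0, 0, 0};
}

// Leaf 1 feature bits: EDX[26] SSE2, ECX[9] SSSE3, ECX[19] SSE4.1, ECX[20] SSE4.2.
inline bool hasSSE2()   { return (leaf1().edx >> 26) & 1u; }
inline bool hasSSSE3()  { return (leaf1().ecx >> 9) & 1u; }
inline bool hasSSE4_1() { return (leaf1().ecx >> 19) & 1u; }
inline bool hasSSE4_2() { return (leaf1().ecx >> 20) & 1u; }

// AVX needs the CPU bit (ECX[28]), OSXSAVE (ECX[27]), and the OS saving
// XMM and YMM state (XCR0 bits 1 and 2).
inline bool hasAVX() {
    const std::uint32_t ecx = leaf1().ecx;
    const bool osxsave = (ecx >> 27) & 1u;
    const bool avx = (ecx >> 28) & 1u;
    return osxsave && avx && (readXcr0() & 0x6u) == 0x6u;
}

// AVX2 is reported in leaf 7, subleaf 0, EBX[5].
inline bool hasAVX2() {
    return hasAVX() && maxLeaf() >= 7 && ((cpuidQuery(7, 0).ebx >> 5) & 1u);
}
```

Each call re-runs CPUID, which is slow, so a real program would likely cache the results once at startup. GCC and Clang also offer `__builtin_cpu_supports("avx2")` and similar, which may be simpler when MSVC support is not needed.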
