Offload one function with two versions (1 CPU, 1 MIC)

vincent_b_
Beginner

Hello,

I'm working on a project with a function that has:
- an optimized version for MIC with MIC intrinsics
- an optimized version for CPU with AVX intrinsics

I'd like to make my project compatible with:
- CPU run only
- offload run with CPU+MIC.

To do so, I tried to use "__attribute__ ((target(mic))) function" so that:
- when offloading, the function is compiled for MIC
- when not offloading (-no-offload), the CPU version is compiled.

However, as mentioned earlier, my function has two different versions: one with MIC intrinsics and one with AVX intrinsics. How can I declare a function for offload when there are two versions of that function?

Thanks in advance,

Vince

Kevin_D_Intel
Employee

You can use the predefined __MIC__ macro, as shown in the Xeon Phi™ sample sampleC06.c (product samples are found under the default install location: /opt/intel/composer_xe /Samples/en_US/C++/mic_samples/intro_sampleC/).
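
As a rough illustration of that idea (not the contents of sampleC06.c), one function name can carry two bodies selected by __MIC__. This is a minimal sketch under some assumptions: the names TARGET_MIC, scale_sum and demo_scale_sum are hypothetical, plain loops stand in for the MIC and AVX intrinsics, and the Intel-specific attribute and pragma are guarded by __INTEL_OFFLOAD (which, to my knowledge, the Intel compiler defines when offload is enabled) so the file also builds with other compilers.

```c
#include <assert.h>

/* Hypothetical helper macro: mark a function for the coprocessor only
   when the Intel compiler is building with offload enabled. */
#ifdef __INTEL_OFFLOAD
#  define TARGET_MIC __attribute__((target(mic)))
#else
#  define TARGET_MIC
#endif

/* One name and one signature, two bodies chosen at compile time. */
TARGET_MIC float scale_sum(const float *x, int n, float factor)
{
    float sum = 0.0f;
#ifdef __MIC__
    /* Coprocessor pass: the MIC-intrinsics version would go here. */
    for (int i = 0; i < n; ++i)
        sum += x[i] * factor;
#else
    /* Host pass: the AVX-intrinsics version would go here. */
    for (int i = 0; i < n; ++i)
        sum += x[i] * factor;
#endif
    return sum;
}

/* Calls scale_sum inside an offload construct when offload is available,
   and as an ordinary host call otherwise. */
float demo_scale_sum(void)
{
    float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float result = 0.0f;
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic) in(x) out(result)
#endif
    result = scale_sum(x, 4, 2.0f);
    assert(result == 20.0f);   /* (1 + 2 + 3 + 4) * 2 */
    return result;
}
```

Both bodies must produce the same result for the same inputs, since the caller cannot know at compile time which one will run.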

vincent_b_
Beginner

Hello Kevin,

Thank you for your quick answer.
sampleC06.c answers our question pretty well.
However, we forgot to mention that the types of the arguments we pass to our function differ between MIC and CPU because of the vector size.

Do you know how we can define different types for the function's arguments depending on whether it runs on MIC or CPU?

Thank you.

Kevin_D_Intel
Employee

When building with offload enabled, calling a function from within a #pragma offload construct does not guarantee that the function executes on the coprocessor. The decision to offload is made at run-time, not compile-time.

With the offload OPTIONAL clause (default w/13.x compilers) the function call within an offload construct executes on the CPU when no co-processor is available/present. With the offload MANDATORY clause (default w/14.0 compilers) execution of the offload construct aborts when no co-processor is available/present (w/14.0 the STATUS clause returns control to the user app for action vs. aborting).

You noted compiling your code for CPU and CPU+MIC. You can compile your code with offload enabled and disable offload at run-time with OFFLOAD_DEVICES= (an empty value) so compiling with -no-offload might not be necessary.
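
To see which side actually executed an offload construct, a common idiom is to set a flag under __MIC__ inside the construct and copy it back. The following is a minimal sketch, assuming a hypothetical function name offload_ran_on_mic and guarding the pragma with __INTEL_OFFLOAD (to my knowledge defined by the Intel compiler when offload is enabled) so that other compilers simply run the block on the host. The optional clause is written out explicitly here; with compilers where it is already the default, it can be dropped.

```c
#include <assert.h>

/* Returns 1 if the offload block ran on the coprocessor, 0 if it ran
   on the host (no coprocessor, offload disabled, or a non-Intel build). */
int offload_ran_on_mic(void)
{
    int on_mic = 0;
#ifdef __INTEL_OFFLOAD
    /* optional: fall back to the host when no coprocessor is available. */
#pragma offload target(mic) optional inout(on_mic)
#endif
    {
#ifdef __MIC__
        on_mic = 1;   /* present only in the coprocessor-side code */
#endif
    }
    assert(on_mic == 0 || on_mic == 1);
    return on_mic;
}
```

Running the same binary with OFFLOAD_DEVICES set to an empty value should make this return 0, which is one way to check the CPU-only path without rebuilding.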

I do not know how to structure the code so that the compiler makes a compile-time decision to call a function with different arguments when the code path that executes in the function body depends on a run-time decision. I’ll check w/our C/C++ experts about possibilities.

Kevin_D_Intel
Employee

I need to test this method; however, I’m running short on time today and wanted to share at least the outline offered by our developers for a method that may work for you.
 
Their suggestion is for using a “gateway method”; one “where the entry into MIC is through a function that does not have different parameters, and then the special function is called.”

The outline looks like this:

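[cpp]#include <immintrin.h>

// Pick the vector type for the side currently being compiled.
#ifdef __MIC__
    typedef __m512 VEC_TYPE;   // coprocessor pass: 512-bit vectors
#else
    typedef __m256 VEC_TYPE;   // host pass: 256-bit AVX vectors
#endif

// The specialized function; its parameter type differs per side.
__declspec (target(mic))
void vect_func (VEC_TYPE arg1)
{
    // side-specific intrinsics code goes here
}

// Gateway: same (empty) parameter list on both sides.
__declspec (target(mic))
void gw_func()
{
    VEC_TYPE arg1;
#ifdef __MIC__
    arg1 = _mm512_set1_ps(1.0f);   // placeholder: prepare arg1 as __m512 type
#else
    arg1 = _mm256_set1_ps(1.0f);   // placeholder: prepare arg1 as __m256 type
#endif

    vect_func(arg1);    // invoke function
}

int main()
{
#pragma offload target(mic)
    gw_func();
    return 0;
}[/cpp]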

I will test this myself and post an update.
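
In practice the gateway usually needs to receive data, so here is a minimal sketch of the same pattern with a gateway that takes a plain array and packs it into the side-specific type internally. It rests on several assumptions: a struct of floats (vec_t, with a hypothetical VEC_LANES of 16 or 8) stands in for __m512/__m256 so that the sketch builds without intrinsics, the names TARGET_MIC and offload_sum are hypothetical, and the Intel-specific attribute and pragma are guarded by __INTEL_OFFLOAD so other compilers take the host path.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __INTEL_OFFLOAD
#  define TARGET_MIC __attribute__((target(mic)))
#else
#  define TARGET_MIC
#endif

/* Stand-in for __m512 / __m256: the lane count differs per side. */
#ifdef __MIC__
#  define VEC_LANES 16
#else
#  define VEC_LANES 8
#endif
typedef struct { float lane[VEC_LANES]; } vec_t;

/* Specialized function: its parameter type differs between the sides. */
TARGET_MIC float vect_func(vec_t v)
{
    float sum = 0.0f;
    for (int i = 0; i < VEC_LANES; ++i)
        sum += v.lane[i];
    return sum;
}

/* Gateway: identical signature on both sides; packs into vec_t inside. */
TARGET_MIC float gw_func(const float *data, int n)
{
    float total = 0.0f;
    for (int base = 0; base < n; base += VEC_LANES) {
        vec_t v;
        for (int i = 0; i < VEC_LANES; ++i)   /* zero-pad the last chunk */
            v.lane[i] = (base + i < n) ? data[base + i] : 0.0f;
        total += vect_func(v);
    }
    return total;
}

/* Host entry point: only plain arrays and scalars cross the boundary. */
float offload_sum(float *data, int n)
{
    float result = 0.0f;
    assert(data != NULL && n >= 0);
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic) in(data : length(n)) out(result)
#endif
    result = gw_func(data, n);
    return result;
}
```

The point of the design is that nothing whose type depends on __MIC__ appears in the gateway's signature or in the offload clauses, so the host and coprocessor builds agree on the interface.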

Nathanael_S_
Beginner

Kevin Davis (Intel) wrote:

I need to test this method; however, I’m running short on time today and wanted to share at least the outline offered by our developers for a method that may work for you.

Surprisingly*, this seems to work.
Thanks!

* I'm surprised because the typedef appears outside the function, so I expected __MIC__ not to be defined there...

vincent_b_
Beginner

Thank you Kevin.

Kevin_D_Intel
Employee

You’re welcome, Vincent.

Nathanael, the __MIC__ define is active during the MIC-side compilation only (not the host-side). Unless excluded by the macro, all source in the file is visible to the MIC compiler; however, the compiler will only emit those entities marked with __declspec(target(mic)) -or- __attribute__ ((target(mic))) into the MIC object file.
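
A small way to observe this is to let a file-scope typedef depend on __MIC__ and have a marked function report its size. The sketch below makes a few assumptions: a struct of floats (vec_t) stands in for the intrinsic vector types, the names TARGET_MIC, vec_bytes and offloaded_vec_bytes are hypothetical, and the Intel-specific attribute and pragma are guarded by __INTEL_OFFLOAD so that other compilers only ever see the host variant.

```c
#include <assert.h>

#ifdef __INTEL_OFFLOAD
#  define TARGET_MIC __attribute__((target(mic)))
#else
#  define TARGET_MIC
#endif

/* File-scope typedef: each compilation pass sees only one variant. */
#ifdef __MIC__
typedef struct { float lane[16]; } vec_t;   /* MIC-side pass: 64 bytes */
#else
typedef struct { float lane[8]; } vec_t;    /* host-side pass: 32 bytes */
#endif

/* Emitted into both object files; sizeof is fixed per pass. */
TARGET_MIC int vec_bytes(void)
{
    return (int)sizeof(vec_t);
}

/* Reports the size seen by whichever side executed the call. */
int offloaded_vec_bytes(void)
{
    int bytes = 0;
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic) out(bytes)
#endif
    bytes = vec_bytes();
    assert(bytes == 32 || bytes == 64);
    return bytes;
}
```

If the construct runs on the coprocessor, the reported size should be that of the 16-lane variant, because that copy of vec_bytes was compiled in the pass where __MIC__ was defined.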
