Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

MIC - Many Integrated Core Offloading

Eric_B_1
New Contributor I

I am having trouble compiling a program that offloads an IPP function to the MIC. I just need something simple like the function shown below.

What linking and compiling options need to be enabled?

Currently the build just fails with:

error: undefined reference to `ippsMul_32f'

<pre class="brush:cpp">

double mult_mic_ipp(float* a,float* b, float *c, int nsize)
{
    double start = 0;
    double stop = 0;
    int numt = 0;

#pragma offload target (mic) in(a,b:length(nsize)) out(c:length(nsize)) inout(start,stop,numt)
{
    start = omp_get_wtime();
    ippsMul_32f(a,b,c,nsize);
    stop = omp_get_wtime();
}
    return stop-start;
}

</pre>

 

1 Solution
Eric_B_1
New Contributor I

Thank you very much, I got it working. The final code is listed below. I was using a two-step compile and link in Qt; here are the final commands:

icpc -c -fopenmp -qoffload-option,mic -O2 -falign-functions=16 -ansi-alias -fstrict-aliasing -w1 -Wall -Wcheck -wd1572,873,2259,2261 -fPIE  -I/opt/Qt5.3.2/5.3/gcc_64/mkspecs/linux-icc-64 -I/home/eric-burke/Development/c++/Mic_IPP_Final -I. -o main.o /home/eric-burke/Development/c++/Mic_IPP_Final/main.cpp

icpc -fopenmp -Wl,-rpath,/opt/Qt5.3.2/5.3/gcc_64 -o Mic_IPP_Final main.o   -qoffload-option,mic,link,-L/opt/intel/composer_xe_2015.0.090/ipp/lib/mic/\ -lipps\ -lippcore -L/opt/intel/composer_xe_2015.0.090/ipp/lib/intel64 -lipps -lippcore 

#include <iostream>

using namespace std;

// Everything between push and pop is also compiled for the coprocessor.
#pragma offload_attribute(push, target(mic))

#include <iostream>
#include <omp.h>
#include <ipp.h>
#include <ipps.h>

void ipp_mic();
#pragma offload_attribute(pop)

__declspec(target(mic)) void ipp_mic(){
    int thread_count = 1;
    int nsize = 10000;
    Ipp32f* a = ippsMalloc_32f(nsize);
    Ipp32f* b = ippsMalloc_32f(nsize);
    Ipp32f* c = ippsMalloc_32f(nsize);
    ippsVectorRamp_32f(a,nsize,0,1);
    ippsVectorRamp_32f(b,nsize,0,1);

    // Single call over the whole vector.
    double start = omp_get_wtime();
    ippsMul_32f(a,b,c,nsize);
    double stop = omp_get_wtime();
    cout << "Vector Call: " << stop - start<< endl;

    // Same multiply split across the OpenMP threads, one contiguous chunk each.
    start = omp_get_wtime();
    #pragma omp parallel
    {
        // omp_get_num_threads() only reports the team size inside a parallel region.
        int nthreads = omp_get_num_threads();
        int tid = omp_get_thread_num();
        int chunk = nsize/nthreads;
        int offset = tid*chunk;
        // The last thread also takes the remainder.
        int nleng = (tid==nthreads-1) ? nsize-offset : chunk;
        if(nleng > 0)
            ippsMul_32f(a+offset,b+offset,c+offset,nleng);
    }
    stop = omp_get_wtime();
    cout << "Threaded Call: " << stop - start<< endl;

    #pragma omp parallel
    {
        #pragma omp single
        thread_count = omp_get_num_threads();
    }
    cout << "Thread count: " << thread_count << endl;

    cout<<c[nsize-1]<<endl;

    // Release the IPP buffers.
    ippsFree(a);
    ippsFree(b);
    ippsFree(c);
}


int main()
{
    #pragma offload target(mic)
    ipp_mic();
}
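
A note on the per-thread split in the threaded timing above: the idea is that each thread handles nsize/thread_count elements and the last thread also picks up the remainder. A minimal sketch of just that arithmetic in plain C++ follows; chunk_for_thread is a hypothetical helper written for illustration, not part of IPP or OpenMP.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not an IPP or OpenMP API): returns {offset, length}
// of the slice that thread `tid` out of `nthreads` should process.
std::pair<int, int> chunk_for_thread(int nsize, int nthreads, int tid) {
    assert(nthreads > 0 && tid >= 0 && tid < nthreads);
    int chunk = nsize / nthreads;   // base slice length
    int offset = tid * chunk;       // index where this thread starts
    // The last thread also takes the remainder left by the integer division.
    int length = (tid == nthreads - 1) ? nsize - offset : chunk;
    return std::make_pair(offset, length);
}
```

Inside a parallel region one would presumably call it with omp_get_num_threads() and omp_get_thread_num(), then pass a+offset, b+offset, c+offset and the length to ippsMul_32f.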

 


12 Replies
Eric_B_1
New Contributor I

So I found this: https://software.intel.com/en-us/node/503902

But it does not work; I keep getting bash: IPPROOT: command not found...

 

Anton_S_Intel
Employee

Hi,

Please take a look at the ipp_thread_mic example (there are both Linux and Windows versions). It has a makefile inside where you can find the real command lines for the compiler and linker. Short answer: you should use the -qoffload-option,mic,link,"$(IPP_LIBS)" compiler option. This adds the IPP libraries to the link stage for the MIC side and resolves the undefined reference error.

Eric_B_1
New Contributor I

I have looked at that example and it does not compile on my machine either, again failing at the linking stage. I am on Linux, Composer XE 2015.0.090.

Here is the output:

mkdir -p build
make -C ../common -f Makefile_mic.lin
make[1]: Entering directory `/opt/intel/composer_xe_2015.0.090/ipp/examples/examples/common'
mkdir -p build_mic
ar -rcs build_mic/common.a build_mic/vm_thread.o build_mic/vm_base.o build_mic/base_renderer.o build_mic/base.o build_mic/base_window_win.o build_mic/base_window_glx.o build_mic/base_image.o build_mic/base_image_bmp.o
make[1]: Leaving directory `/opt/intel/composer_xe_2015.0.090/ipp/examples/examples/common'
icpc -Qoption,link,"--no-undefined" build/ipp_thread_mic.o -o build/ipp_thread_mic ../common/build_mic/common.a -qoffload-option,mic,link,"-L/opt/intel/composer_xe_2015.0.090/ipp/lib/mic -lippcore -lippi -lipps -lippvm -lpthread"
/tmp/icpcMICpIBpIp: In function `main':
src/ipp_thread_mic.cpp:(.text+0xa52): undefined reference to `Image::~Image()'
src/ipp_thread_mic.cpp:(.text+0xa5f): undefined reference to `Image::~Image()'
src/ipp_thread_mic.cpp:(.text+0xa69): undefined reference to `DString::~DString()'

 

Anton_S_Intel
Employee

Confirmed: this package has a known issue in the MIC example, sorry! The example was fixed in the next version, IPP 8.2.1.

Please try this workaround: in the file ipp_thread_mic/src/ipp_thread_mic.cpp, wrap the function main() like this:

#if !defined(__MIC__)
int main(int argc, char *argv[])
…
}
#endif
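
As a side note on the guard used in this workaround: __MIC__ is a macro that the Intel compiler predefines only while it builds the coprocessor side of an offload program, so anything under #if !defined(__MIC__) exists only in the host binary and is simply absent from the coprocessor build. A minimal sketch of the mechanism, where build_side is a hypothetical name used purely for illustration:

```cpp
#include <string>

// Hypothetical illustration: reports which compilation pass produced this code.
// __MIC__ is defined only during the coprocessor-side pass of the Intel
// compiler; with any other compiler, or on the host pass, it is undefined.
std::string build_side() {
#if defined(__MIC__)
    return "mic";
#else
    return "host";
#endif
}
```

Built with an ordinary host compiler, the function takes the #else branch.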

 

Eric_B_1
New Contributor I

That fixed it, thanks. I was going to update to 8.1, but I'm on CentOS 7 and the installer reported an unsupported OS. Is there a workaround for the installer?

Eric_B_1
New Contributor I

Sorry, not fixed. Running the program reports:

./ipp_thread_mic 
offload error: cannot find offload entry __offload_entry_ipp_thread_mic_cpp_921mainicpc259572409Z5Tmie
offload error: process on the device 0 unexpectedly exited with code 1

 

Sergey_K_Intel
Employee

Eric B. wrote:

I have looked at that example and it does not compile on my machine either, again failing at the linking stage. I am on Linux, Composer XE 2015.0.090.

Here is the output:

mkdir -p build
make -C ../common -f Makefile_mic.lin
make[1]: Entering directory `/opt/intel/composer_xe_2015.0.090/ipp/examples/examples/common'
mkdir -p build_mic
ar -rcs build_mic/common.a build_mic/vm_thread.o build_mic/vm_base.o build_mic/base_renderer.o build_mic/base.o build_mic/base_window_win.o build_mic/base_window_glx.o build_mic/base_image.o build_mic/base_image_bmp.o
make[1]: Leaving directory `/opt/intel/composer_xe_2015.0.090/ipp/examples/examples/common'
icpc -Qoption,link,"--no-undefined" build/ipp_thread_mic.o -o build/ipp_thread_mic ../common/build_mic/common.a -qoffload-option,mic,link,"-L/opt/intel/composer_xe_2015.0.090/ipp/lib/mic -lippcore -lippi -lipps -lippvm -lpthread"
/tmp/icpcMICpIBpIp: In function `main':
src/ipp_thread_mic.cpp:(.text+0xa52): undefined reference to `Image::~Image()'
src/ipp_thread_mic.cpp:(.text+0xa5f): undefined reference to `Image::~Image()'
src/ipp_thread_mic.cpp:(.text+0xa69): undefined reference to `DString::~DString()'

 

Hi Eric,

This was a problem in the icl compiler, version 14.x. In the version 15 compiler it should be fixed. Could you provide the exact version of the icl compiler that you use (just the "icl -V" output)? I'll check.

Regarding your last issue, the offload run-time error message: it means that the offload run-time library can't find the function "main".

Eric_B_1
New Contributor I

I'm on Linux, so I assume you mean icc; here is the output. Yes, it is compiling now, but not running.

Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.0.090 Build 20140723

 

Sergey_K_Intel
Employee

Hi Eric,

In fact, in package 090 there is a compiler problem with the generation of C++ destructors in the offload part. I found a workaround in internal e-mail. Add the "-openmp" option to the compilation command:

CXXFLAGS   := -c -O2 -DUSE_MIC -qopt-report-phase:offload -openmp

in the Makefile. I have just checked this workaround and it works. Below is a small test to see whether the problem is still present in the compiler. Build and run it with the commands:
$ icpc test.cpp
$ ./a.out
or, if it fails during link:
$ icpc -openmp test.cpp
$ ./a.out

//------------------ 
#include <stdio.h> 

class TestObject { 
public: 
    TestObject() { printf("Local object created!\n"); } 
    ~TestObject() { printf("Local object destroyed!\n"); } 
}; 

#pragma offload_attribute(push, target(mic)) 
#include <stdio.h> 
class MicObject { 
public: 
    MicObject() { printf("MIC object created!\n"); fflush(0); } 
    ~MicObject() { printf("MIC object destroyed!\n"); fflush(0); } 
}; 
#pragma offload_attribute(pop) 

__declspec(target(mic)) MicObject* pMicObject = 0; 

template<class T> 
__declspec(target(mic)) void Create() 
{ 
    pMicObject = new T; 
} 

__declspec(target(mic)) void Create() 
{ 
    Create<MicObject>(); 
} 

__declspec(target(mic)) void Destroy() 
{ 
    delete pMicObject; 
} 

int main() 
{ 
    TestObject cpu_object; 
#pragma offload target(mic) 
    Create(); 
#pragma offload target(mic) 
    Destroy(); 
} 
//----------------- 

Eric_B_1
New Contributor I

Yes, this code compiles and works. We are getting closer to what I want, which is to offload IPP. But the program below fails to build, reporting undefined references to everything IPP. Do I need to upgrade to Update 1? Will that fix the issue?

I tried to compile with these options, but it didn't help:

-c -O2 -DUSE_MIC -qopt-report-phase:offload -openmp -lippcore -lippi -lipps -lippvm -lpthread

#include <iostream>

using namespace std;


#pragma offload_attribute(push, target(mic))

#include <iostream>
#include <omp.h>
#include <ipp.h>
#include <ipps.h>


void ipp_mic();
#pragma offload_attribute(pop)

__declspec(target(mic)) void ipp_mic(){
    int thread_count;

    #pragma omp parallel
    {
        #pragma omp single
        thread_count = omp_get_num_threads();
    }
    cout << "Thread count: " << thread_count << endl;

    Ipp32f* f = ippsMalloc_32f(100);
    ippsVectorRamp_32f(f,100,0,1);
}


int main()
{
    #pragma offload target(mic)
    ipp_mic();
}

 

Sergey_K_Intel
Employee

Your example compiles and works with the following command line:

$ echo $IPPROOT
/opt/intel/composer_xe_2015.0.090/ipp
$ icc -I$IPPROOT/include  -O2 -openmp test.cpp -qoffload-option,mic,link,"-L$IPPROOT/lib/mic -lipps -lippcore" -L$IPPROOT/lib/intel64 -lipps -lippcore
$ ./a.out
Thread count: 240
$

It looks like the compiler needs both the MIC libs and the host CPU libs, probably to use the CPU path as a fallback if the MIC is not available.
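
Whatever the fallback policy is, a function marked with target(mic) is compiled for the host as well as for the coprocessor, which is consistent with the host link step wanting the intel64 IPP libraries. For anyone who wants to check at run time whether a coprocessor is visible at all, here is a minimal sketch. It assumes the Intel offload runtime's offload.h header, its _Offload_number_of_devices() call, and the __INTEL_OFFLOAD macro that the Intel compiler defines when offload is enabled; coprocessor_count and will_run_on_host are hypothetical names, and with any other compiler the sketch just reports zero devices.

```cpp
#ifdef __INTEL_OFFLOAD
#include <offload.h>   // Intel offload runtime header (assumed available with icc/icpc)
#endif

// Hypothetical helper: number of coprocessors the offload runtime can see.
// Without Intel offload support this compiles to a stub that returns 0.
int coprocessor_count() {
#ifdef __INTEL_OFFLOAD
    return _Offload_number_of_devices();
#else
    return 0;
#endif
}

// Hypothetical helper: true when there is no coprocessor to offload to.
bool will_run_on_host() {
    return coprocessor_count() == 0;
}
```

Calling will_run_on_host() before the offload pragma gives a simple way to log which path a run is expected to take.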
