
_Cilk_shared and STL

Georg_V_
Beginner
1,296 Views

I am porting a large C++ application to MIC, and I would like to use _Cilk_shared to transport the data between host and Xeon Phi. Of course, I would like to avoid rewriting the whole code, so what I do is this in the .cpp file that implements myClass:

[cpp]#pragma offload_attribute(push, _Cilk_shared)

#include <myClass.h>

#pragma offload_attribute(pop, _Cilk_shared)

...

void myClass::myMethod(){

std::vector<bool> localData;

....[/cpp]

myClass.h contains this:

[cpp]#include <vector>

class myClass {

....

void myMethod();

std::vector<float> m_floatData;

};[/cpp]

Now I have a dilemma:

Is there any simple solution to this?

Georg

0 Kudos
11 Replies
Sumedh_N_Intel
Employee
1,296 Views

Hi,

I am curious as to why you would want localData to be _Cilk_shared? Could you please share how you intend to use the localData?

-Sumedh

0 Kudos
Andrey_Vladimirov
New Contributor III
1,296 Views

Hi Georg,

I had a similar problem in a code that I was porting recently, only my code was using valarrays rather than vectors to hold the data. It was my understanding that it is impossible to make a class _Cilk_shared if some of its members are non-_Cilk_shared objects, such as std::vector. However, even if you made this happen (e.g., in a native application), there are practical disadvantages to using vectors or valarrays in Xeon Phi code:

a) With data in vectors or valarrays, you have no control over the alignment of data in memory. This can result in moderate to severe performance penalties on Xeon Phi coprocessors.

b) When you do seemingly harmless operations with vectors, such as creating a temporary vector on a function stack or using push_back, you occasionally trigger dynamic memory allocation in the vector class. This operation is inherently sequential and may have a really bad effect on performance on the coprocessor (it was quite severe in my application).

So, the way that I see it, there are three possible ways to port this code:

1) Implement your own class MyVector analogous to std::vector, and ensure that it allocates data on a 64-byte boundary, and that it does not use _mm_malloc() when you don't need it.

2) Create a derived class myClassPort : public myClass. The constructor of myClassPort should copy all data from the vectors of myClass into arrays of float. This gives you control over alignment and, at the same time, eliminates the overhead of abstraction in the performance-critical part. After that, you can use the explicit offload model (with "#pragma offload") to launch calculations on the coprocessor. This is the method that I chose for my code, because I wanted the best control over data allocation and transport.

3) Of course, you can also compile a native application without any code changes, but I don't know if this is a good option for your application. It was not for mine.

Andrey

0 Kudos
Georg_V_
Beginner
1,296 Views

Sumedh Naik (Intel) wrote:

I am curious as to why you would want localData to be _Cilk_shared? Could you please share how you intend to use the localData?

I don't want localData to be _Cilk_shared. But because the member m_floatData is part of a _Cilk_shared class, std::vector also needs to be _Cilk_shared, and therefore I cannot avoid localData being _Cilk_shared. Or am I missing something here?

Georg 

0 Kudos
Sumedh_N_Intel
Employee
1,296 Views

Hi Georg,

I now understand the issue. In this case, instead of marking the entire class as shared, you can use a shared allocator (defined in offload.h) to create a shared vector object. Here is an example: 

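[cpp]#include <vector>
#include <offload.h>

_Cilk_shared class myClass {

....

void myMethod();

// The allocator's element type must match the vector's element type (float)
_Cilk_shared std::vector<float, __offload::shared_allocator<float> > _Cilk_shared m_floatData;

};[/cpp]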

This is another example I found that instantiates and manipulates shared versions of C++ STL vectors. 

[cpp]#include <vector>
#include <offload.h>
#include <stdio.h>

using namespace std;

// Vector type whose element storage comes from the shared allocator
typedef vector<int, __offload::shared_allocator<int> > shared_vec_int;

// Shared pointer to a shared vector
_Cilk_shared shared_vec_int * _Cilk_shared v;

// Called through _Cilk_offload: checks that element i holds the value i
_Cilk_shared int test_result() {
   int result = 1;

   for (int i = 0; i < 5; i++) {
      if ((*v)[i] != i) {
         result = 0;
      }
   }

   return result;
}

int main() {
   int result;

   // Construct the vector object itself in shared memory, with 5 elements
   v = new (_Offload_shared_malloc(sizeof(shared_vec_int))) _Cilk_shared vector<int, __offload::shared_allocator<int> >(5);

   for (int i = 0; i < 5; i++) {
      (*v)[i] = i;
   }

   result = _Cilk_offload test_result();

   if (result != 1)
      printf("Failed\n");
   else
      printf("Passed\n");

   return 0;
}[/cpp]

I hope this helps. 

-Sumedh

0 Kudos
Georg_V_
Beginner
1,296 Views

Sumedh Naik (Intel) wrote:

...I now understand the issue. In this case, instead of marking the entire class as shared, you can use a shared allocator (defined in offload.h) to create a shared vector object. Here is an example: 

....

Okay, I think I understand. Let me do a few tests...

Georg

0 Kudos
Dave_O_
Beginner
1,296 Views

Where is _Cilk_shared defined? What header file do I need to include to use it? I can see other Cilk constructs when I begin to type in Visual Studio (IntelliSense), but not _Cilk_shared.

0 Kudos
Kevin_D_Intel
Employee
1,296 Views

There is no header needed for the keywords. You might include <offload.h> to use other aspects of the shared offload model; however, I believe we may be lacking defines in the <cilk/cilk.h> to enable the intellisense. I'm checking w/others about this.

0 Kudos
Kevin_D_Intel
Employee
1,296 Views

Our IDE integration Developer clarified the keyword highlighting and IntelliSense support.

Currently, _Cilk_for, _Cilk_spawn, _Cilk_sync are highlighted as compiler keywords in the C++ editor and that is the extent of the support that we can provide for Intel C++-specific keywords in the Visual Studio editor. There is no auto-completion or any other IntelliSense support for  _Cilk_for, _Cilk_spawn, _Cilk_sync because the Visual C++ IntelliSense is not extensible, unfortunately. The contents of <cilk/cilk.h> are not relevant to IntelliSense either; however, this header triggers auto-completion for cilk_spawn, cilk_sync, cilk_for when it is included.

Further, we do not currently highlight the keywords from the offload Virtual shared model (_Cilk_shared, _Cilk_offload, _Cilk_offload_to) so I submitted a feature enhancement (see internal tracking id noted below) to have those highlighted similar to the other _Cilk keywords.

(Internal tracking id: DPD200255317)

0 Kudos
Jun
Beginner
1,296 Views

Hi. I tried Sumedh Naik's method of allocating a std::vector, and I am getting a link error.

Here is the compile command (it is similar for the other .cpp files):

 icpc -openmp -std=c++11 -xhost -Wno-unknown-pragmas -no-offload  -opt-report-phase=offload -vec-report3 -O3 -g -o haar.o -c haar.cpp

Here is my linker command:

icpc -o vj main.o haar.o image.o stdio-wrapper.o rectangles.o  -lm -mkl

Here is the error I get:

main.o: In function `main':
/work/02645/pan19/project/xeon_offload/main.cpp:49: undefined reference to `__intel_new_feature_proc_init'
haar.o: In function `detectObjects':
/work/02645/pan19/project/xeon_offload/haar.cpp:135: undefined reference to `_Offload_shared_malloc'
haar.o: In function `__offload::shared_allocator<MyRect>::allocate(unsigned long, void const*)':
/opt/apps/intel/13/composer_xe_2013.2.146/compiler/include/offload.h:417: undefined reference to `_Offload_shared_malloc'
haar.o: In function `__offload::shared_allocator<MyRect>::deallocate(MyRect*, unsigned long)':
/opt/apps/intel/13/composer_xe_2013.2.146/compiler/include/offload.h:426: undefined reference to `_Offload_shared_free'
/opt/apps/intel/13/composer_xe_2013.2.146/compiler/include/offload.h:426: undefined reference to `_Offload_shared_free'
rectangles.o: In function `MyRect* std::__copy_move<true, true, std::random_access_iterator_tag>::__copy_m<MyRect>(MyRect const*, MyRect const*, MyRect*)':
/usr/include/c++/4.4.7/bits/stl_algobase.h:378: undefined reference to `__intel_ssse3_rep_memmove'
rectangles.o: In function `MyRect* std::__uninitialized_move_a<MyRect*, MyRect*, std::allocator<MyRect> >(MyRect*, MyRect*, MyRect*, std::allocator<MyRect>&)':
/usr/include/c++/4.4.7/bits/stl_algobase.h:378: undefined reference to `__intel_ssse3_rep_memmove'
rectangles.o: In function `int* std::__copy_move<true, true, std::random_access_iterator_tag>::__copy_m<int>(int const*, int const*, int*)':
/usr/include/c++/4.4.7/bits/stl_algobase.h:378: undefined reference to `__intel_ssse3_rep_memmove'
/usr/include/c++/4.4.7/bits/stl_algobase.h:378: undefined reference to `__intel_ssse3_rep_memmove'
/usr/include/c++/4.4.7/bits/stl_algobase.h:378: undefined reference to `__intel_ssse3_rep_memmove'
/usr/bin/ld: link errors found, deleting executable `vj'
make: *** [vj] Error 1

I'm wondering if anyone can help me. I am using the TACC Stampede supercomputer.

0 Kudos
Jun
Beginner
1,296 Views

Following up on my last comment:

My icc version: Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.1.0.146 Build 20130121

I tried 14.0.1 as well, with the same error.

I saw in some posts that people are using #include "offload.h" instead of #include <offload.h>. I am wondering where they get that offload.h from?

0 Kudos