Is it possible to offload a string array to xeon phi

Po_Chang_W_
Beginner
1,416 Views

I want to get all substrings of a string on the Xeon Phi.
First, I read a text file named on the command line and store its lines in an array of pointers like this:

char *temp_string[N_ELEMENT];

Second, I want to use #pragma offload to copy this array to the Xeon Phi like this:

#pragma offload target(mic: 0) in(temp_string:length(N_ELEMENT))

The compiler gives me the following message:

error: variable "temp_string" used in in/out/inout clause is an array whose underlying type is "char *"

Is there any solution to my problem?

The offload section looks like this:

#pragma offload target(mic: 0) in(temp_string:length(N_ELEMENT))
#pragma omp parallel for private(c, i, length, sub)  /* sub is written by every iteration, so it must be private too */
for(n = 0; n < N_ELEMENT; ++n) 
{
    length = strlen(temp_string[n]);

    for( c = 0 ; c < length ; c++ )
        for( i = 1 ; i <= length - c ; i++ )
        {
            sub = substring(temp_string[n], c+1, i);
            printf("%s \n", sub);
        }

}
0 Kudos
15 Replies
jimdempseyatthecove
Honored Contributor III
1,416 Views

The problem is that you are attempting to copy the array of char* pointers but not the data they point to.

If these are substring pointers then consider:

char *big_string;
int tokenIndex[N_ELEMENT];
..

Then offload the big_string and tokenIndex arrays (don't forget that the length of big_string needs the +1 for the terminating null character).

Jim Dempsey
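
A minimal sketch of the flattening Jim describes, assuming the index array stores the starting offset of each string. The names FlatStrings, flatten_strings, and flat_get are hypothetical and do not come from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the flattened data (not from the thread). */
typedef struct {
    char *big_string;   /* all strings back to back, each NUL-terminated */
    int  *token_index;  /* token_index[i] = offset of string i in big_string */
    int   count;        /* number of strings */
    int   total_bytes;  /* bytes used in big_string, including every NUL */
} FlatStrings;

/* Pack an array of C strings into one contiguous buffer.
   Returns 0 on success, -1 on allocation failure. */
static int flatten_strings(char *const *strings, int count, FlatStrings *out)
{
    size_t total = 0;
    for (int i = 0; i < count; ++i)
        total += strlen(strings[i]) + 1;          /* +1 keeps the terminator */

    out->big_string  = malloc(total ? total : 1);
    out->token_index = malloc((size_t)(count ? count : 1) * sizeof(int));
    if (!out->big_string || !out->token_index) {
        free(out->big_string);
        free(out->token_index);
        return -1;
    }

    size_t pos = 0;
    for (int i = 0; i < count; ++i) {
        size_t len = strlen(strings[i]) + 1;
        memcpy(out->big_string + pos, strings[i], len);
        out->token_index[i] = (int)pos;           /* offset, not a pointer */
        pos += len;
    }
    out->count = count;
    out->total_bytes = (int)total;
    return 0;
}

/* String i is recovered from the two arrays alone, with no host pointers. */
static const char *flat_get(const FlatStrings *f, int i)
{
    return f->big_string + f->token_index[i];
}
```

Because the two buffers contain only characters and integers, they could then be named in ordinary in(... : length(...)) clauses, with total_bytes and count as the lengths. Offsets stay valid on the coprocessor, whereas host pointer values would not.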

0 Kudos
Kevin_D_Intel
Employee
1,416 Views

I am seeing different treatment between our 14.0 and 15.0 compilers so I expect this is an intentional restriction imposed by the Intel Composer XE 2013 SP1 release (14.0 compiler) that may have been eased/lifted in the Intel Parallel Studio XE 2015 (15.0) compiler.

The error is reproducible with the 14.0 compiler but not the 15.0 compiler, so if it is convenient for you to upgrade to the newer 15.0 release, you may find it supports your needs. Using 14.0 may require a different approach, perhaps as Jim suggested.

I will inquire with our Developers about this error and share what else I learn.

0 Kudos
Po_Chang_W_
Beginner
1,416 Views

Hi, jimdempseyatthecove
I tried your method and it worked, thank you very much.

By the way, how do I offload an array of structs?

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
1,416 Views

If the struct is POD (Plain Old Data, i.e. just data), then passing the pointer and the size of the struct should work.

However, structs can contain pointers to other objects, member functions (in addition to ctor/dtor), and virtual functions, and these complicate the offload. Some of this management is aided by the 15.0 compiler; however, I am unable to provide a working example because I do not have the 15.0 compiler.

Look at a few of the other threads on this forum. One showed a simple example where a member function was offloaded and the member variable was an array. The fix was to remember to offload the this pointer with nocopy. I did not see a reply saying whether this worked. There should be some examples that you can find with a Google search of this forum. Use the following search:

    C++ class offload site:https://software.intel.com/en-us/forums

Jim Dempsey
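
As a rough illustration of the POD case, the sketch below transfers an array of plain structs with one inout clause. The Sample type and both function names are hypothetical. The pragma is guarded by __INTEL_OFFLOAD, which the Intel compiler defines when offload is enabled, so other compilers simply run the block on the host.

```c
/* Hypothetical POD struct: only plain data, no pointers or virtual functions. */
typedef struct {
    float x, y;
    float scale;
    int   label;
} Sample;

/* Compiled for both host and coprocessor when offload is enabled. */
#ifdef __INTEL_OFFLOAD
__attribute__((target(mic)))
#endif
static void scale_samples(Sample *s, int n, float factor)
{
    for (int i = 0; i < n; ++i) {
        s[i].x *= factor;
        s[i].y *= factor;
    }
}

static void scale_on_coprocessor(Sample *samples, int n, float factor)
{
    /* length() counts elements of the pointed-to type, so n Samples
       are copied in before the block and copied back out after it. */
#ifdef __INTEL_OFFLOAD
    #pragma offload target(mic:0) inout(samples : length(n))
#endif
    {
        scale_samples(samples, n, factor);
    }
}
```

A struct that held a pointer member would copy only the pointer value this way, not the data behind it, which is the complication Jim mentions.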

0 Kudos
Kevin_D_Intel
Employee
1,417 Views

The Intel® Parallel Studio XE 2015 (15.0 compiler) release contains new feature support for passing an array of pointers as discussed in this article:

Data transfer of an “array of pointers” using the Intel® Language Extensions for Offload (LEO) for the Intel® Xeon Phi™ coprocessor

This slipped my mind but it is behind the different treatment I noted between the two compilers earlier. With this feature support I was able to mock-up your sample and get it working rather easily. For the prototype you posted, with 15.0 you only need to change your #pragma offload to this:

#pragma offload target(mic: 0) in(temp_string[0:N_ELEMENT])

The offload run-time dynamically determines the length of each string element and transfers each element accordingly. The article mentions but does not detail use with char data types specifically so we'll look to update that soon.
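
For context, here is a small sketch of how the changed clause might sit in a complete function, assuming the 15.0 behaviour described above. The function name and the substring-counting body are illustrative stand-ins for the original loop, and the pragma is guarded by __INTEL_OFFLOAD so the code also builds with compilers that lack offload support.

```c
#include <string.h>

#define N_ELEMENT 3   /* illustrative size, not from the thread */

/* Counts the non-empty substrings of every input string. */
static long count_substrings(char *const *input)
{
    /* A true array of pointers, as in the original post. */
    char *temp_string[N_ELEMENT];
    long total = 0;

    for (int n = 0; n < N_ELEMENT; ++n)
        temp_string[n] = input[n];

#ifdef __INTEL_OFFLOAD
    /* Array-section syntax from the reply above; requires the 15.0 compiler. */
    #pragma offload target(mic:0) in(temp_string[0:N_ELEMENT]) inout(total)
#endif
    {
        for (int n = 0; n < N_ELEMENT; ++n) {
            long length = (long)strlen(temp_string[n]);
            total += length * (length + 1) / 2;   /* substrings of one string */
        }
    }
    return total;
}
```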

 

0 Kudos
Po_Chang_W_
Beginner
1,416 Views

Thanks, everyone, for helping me. I tried using one big char array (string) and an index array (int array) and offloading them so that I can recover all of the strings from my string array. When the array is small, it works great. But when I increase the number of elements in the string array, the big char array grows larger and larger, and the offload fails with "process on the device 0 was terminated by signal 11 (SIGSEGV)". I have no idea how to fix it. Can someone help me and give me some tips?

My second question: I use a variable in the offload block without writing an "in", "out", or "inout" clause for it, and the block can still read the variable. Why? If I only want to read my data and not modify it, can I use it without a copy?

Thanks everyone.

My code looks like this:

    __attribute__((target(mic))) char offload_pattern_string[30 * PATTERN_ELEMENT];
    __attribute__((target(mic))) int offload_pattern_index[PATTERN_ELEMENT], n=0;
    __attribute__((target(mic))) char offload_content_string[60 * CONTENT_ELEMENT];
    __attribute__((target(mic))) int offload_content_index[CONTENT_ELEMENT];

    while(fgets(line, 1024, fp)) {
        if((p=strchr(line, '\r'))) *p='\0';
        if((p=strchr(line, '\n'))) *p='\0';

        /*bloom_add(bloom, line);*/

        p=strtok(line, " \t,.;:\r\n?!-/()");
        while(p) {
            strcat(offload_pattern_string, p);
            offload_pattern_index[n++] = strlen(p);
            p=strtok(NULL, " \t,.;:\r\n?!-/()");
        }
    }

and then I transfer offload_pattern_string to the MIC.

0 Kudos
Kevin_D_Intel
Employee
1,416 Views

By default, any variable the compiler can see within the lexical scope of the offload construct is transferred implicitly as INOUT. So the variable's value by default is available and that's why you can read it without listing the variable in any explicit data transfer clause.

If you only want to use a variable's host value on the coprocessor, you must transfer the value at least once using IN. Subsequent offloads can then reuse the variable's value on the coprocessor by using NOCOPY for that variable.

It is difficult to know for certain the cause of the SIGSEGV without a reproducer.

What are the values for PATTERN_ELEMENT and CONTENT_ELEMENT that cause the failure?

0 Kudos
Po_Chang_W_
Beginner
1,416 Views

PATTERN_ELEMENT is the number of my patterns, which are read from a file line by line.

So I need to transfer them using IN and then reuse them using NOCOPY, right?

I also want to ask how large a variable I can transfer to the Xeon Phi. Is there any restriction?

thank you very much!

0 Kudos
Kevin_D_Intel
Employee
1,416 Views

The guidance I received is that in theory there is no limit/restriction on size of any single variable transferred or the aggregate size of transferring multiple variables with a single offload/offload_transfer.

For the IN and NOCOPY usage, you could perhaps do something like this:


…       // some host code

// offload_transfer can be used to only upload a variable's value.
// This could also be a #pragma offload with other offload code if desired.
#pragma offload_transfer IN(<variables>)

…       // some more host code

// This might be where your earlier offload/omp code snippet could appear.
#pragma offload NOCOPY(<variables>)
{
    // offload code to use the variable's value.
}

…       // some more host code
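
A more concrete sketch of the same IN-then-NOCOPY pattern for heap data follows. It adds the alloc_if and free_if modifiers, which the thread does not mention, on the assumption that the data is reached through a pointer and the coprocessor copy must therefore be kept allocated between offloads. The function names are hypothetical, and the pragmas are guarded by __INTEL_OFFLOAD so that a host-only build still runs.

```c
/* Upload once and keep the coprocessor copy allocated afterwards. */
static void upload_once(int *data, int n)
{
#ifdef __INTEL_OFFLOAD
    #pragma offload_transfer target(mic:0) \
        in(data : length(n) alloc_if(1) free_if(0))
#endif
    (void)data; (void)n;   /* nothing to do on a host-only build */
}

/* Reuse the copy that is already on the coprocessor; no data is re-sent. */
static long sum_on_device(int *data, int n)
{
    long sum = 0;
#ifdef __INTEL_OFFLOAD
    #pragma offload target(mic:0) \
        nocopy(data : length(n) alloc_if(0) free_if(0)) inout(sum)
#endif
    {
        for (int i = 0; i < n; ++i)
            sum += data[i];
    }
    return sum;
}

/* Final step: release the coprocessor copy without transferring anything. */
static void release_on_device(int *data, int n)
{
#ifdef __INTEL_OFFLOAD
    #pragma offload_transfer target(mic:0) \
        nocopy(data : length(n) alloc_if(0) free_if(1))
#endif
    (void)data; (void)n;
}
```

Calling upload_once, then sum_on_device any number of times, then release_on_device mirrors the three stages in the outline above.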

0 Kudos
jimdempseyatthecove
Honored Contributor III
1,416 Views

>>By default, any variable the compiler can see within the lexical scope of the offload construct is transferred implicitly as INOUT
...+

 __attribute__((target(mic))) char offload_pattern_string[30 * PATTERN_ELEMENT];
...

If these variables are locally scoped (they appear to be on the stack), then the stack size on the MIC side of the application must be large enough to accommodate them. I do not know how big PATTERN_ELEMENT and CONTENT_ELEMENT are. If they are large, this may be a stack issue. If so, consider making these arrays global or allocating them dynamically.

Jim Dempsey 
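
The two alternatives Jim mentions could look roughly like this. The PATTERN_ELEMENT value, the MIC_DEVICE macro, and the helper names are assumptions for illustration; the macro expands to nothing when the compiler does not define __INTEL_OFFLOAD.

```c
#include <stdlib.h>

#ifdef __INTEL_OFFLOAD
#define MIC_DEVICE __attribute__((target(mic)))
#else
#define MIC_DEVICE
#endif

#define PATTERN_ELEMENT 100000   /* hypothetical size, not from the thread */

/* Option 1: file scope. The arrays live in static storage on both sides,
   so their size no longer counts against any thread's stack. */
MIC_DEVICE char pattern_string[30 * PATTERN_ELEMENT];
MIC_DEVICE int  pattern_index[PATTERN_ELEMENT];

/* Option 2: heap. Only the pointer itself occupies stack space. A pointer
   needs an explicit in(ptr : length(...)) clause when it is offloaded. */
static char *alloc_pattern_buffer(size_t elements)
{
    return calloc(30 * elements, sizeof(char));   /* zero-filled, so it starts as "" */
}

/* Size in bytes of the file-scope buffer, handy for length() clauses. */
static size_t pattern_bytes(void)
{
    return sizeof pattern_string;
}
```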
 

 

0 Kudos
Rishab_G_
Beginner
1,416 Views

Hello,

This is Rishab here. I am working with the Intel Xeon Phi offload model with OpenMP. I need to offload the for loop around getOrientation(), together with an image data array named img->imageData in the code (actually in) and an array of structures named arr_ipts (actually in and out). Please tell me whether this piece of code needs any correction or whether additional information is required. I have enabled OFFLOAD_REPORT=2, but I can't see any data transfer messages. The code snippet and message log are below:

The message log :

[Offload] [MIC 0] [File]                    src/surf.cpp
[Offload] [MIC 0] [Line]                    81
[Offload] [MIC 0] [Tag]                     Tag 0

 

The piece of code I want to offload:

{

      Ipoint ** arr_ipts = (Ipoint **)malloc(ipts.size()*sizeof(Ipoint *));
      int b = sizeof(Ipoint);
      for (int i = 0; i < ipts.size(); i++)
      {
          arr_ipts[i] = (Ipoint *)malloc(sizeof(Ipoint));
          memcpy(arr_ipts[i], &ipts[i], sizeof(Ipoint));
      }
      float * tData  =  img->imageData;
      int k  = sizeof(Ipoint);
      int w = img->width;
      int h = img->height;
      int step  = img->widthStep;
      // Main SURF-64 loop assigns orientations and gets descriptors
      printf("offload start\n");
#pragma offload target(mic:0) in(tData:length(w*h)) in(*arr_ipts[0:(size)]:length(k)) out(*arr_ipts[0:(size)]:length(k)) 
      {
          for (int i = 0; i < ipts_size; ++i)
          {
              // Set the Ipoint to be described
              int idx = i;
              const int deviceNum = _Offload_get_device_number();
              //printf("Hello from coprocessor %d \n" , deviceNum);
              //fflush(stdout);
              // Assign Orientations and extract rotation invariant descriptors
              getOrientation(tData, w, h, step, arr_ipts[i]);
              //getDescriptor(img->imageData, img->width, img->height, img->widthStep,false);
          }
      }

Regards

Rishab Goel

0 Kudos
jimdempseyatthecove
Honored Contributor III
1,416 Views

In your code above, you

allocate array of Ipoint pointers (ipts.size()*sizeof(Ipoint *))
for each allocated pointer
  allocate space for Ipoint object
  copy Ipoint object from container to newly allocated space
end for
offload (your attempt at array of pointers)

Change this to

allocate blob of memory (ipts.size()*sizeof(Ipoint))
allocate array of Ipoint pointers (ipts.size()*sizeof(Ipoint *))
for each allocated pointer
  array of IpointPointers[index] = blob + index * sizeof(Ipoint)
  copy Ipoint object from container to newly pointed to space
end for
offload (the blob and count, etc., but not the pointers)
{ // in offload
allocate array of Ipoint pointers (ipts.size()*sizeof(Ipoint *))
for each allocated pointer
  array of IpointPointers[index] = blob + index * sizeof(Ipoint)
end for
now use the array of objects via the pointers
}

Note that the above assumes your code requires the array of pointers inside the offload. You could just as well have allocated an array of type Ipoint and passed and used that directly.

Jim Dempsey
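
The outline above can be sketched in plain C as follows. The three-field Ipoint here is a hypothetical stand-in for the real type, which has more members, and the sketch assumes the real type can be copied bytewise. The helper names are also assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the thread's Ipoint. */
typedef struct { float x, y, orientation; } Ipoint;

/* Host side: copy n objects into one contiguous block, which can then be
   transferred as a single array instead of n separate allocations. */
static Ipoint *pack_points(const Ipoint *src, int n)
{
    Ipoint *blob = malloc((size_t)n * sizeof(Ipoint));
    if (blob)
        memcpy(blob, src, (size_t)n * sizeof(Ipoint));
    return blob;
}

/* Coprocessor side: rebuild the pointer table from the blob's local address.
   Host pointer values are never reused. */
static Ipoint **rebuild_pointers(Ipoint *blob, int n)
{
    Ipoint **ptrs = malloc((size_t)n * sizeof(Ipoint *));
    if (ptrs)
        for (int i = 0; i < n; ++i)
            ptrs[i] = blob + i;   /* typed arithmetic already scales by sizeof(Ipoint) */
    return ptrs;
}
```

Note that with a typed Ipoint* blob the index is not multiplied by sizeof(Ipoint); that multiplication is only needed when the blob is addressed as bytes, as in the pseudocode.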

0 Kudos
Rishab_G_
Beginner
1,416 Views

Hello Jim ,

Thanks for the suggestion .

I tried your suggestion but got the error shown below. Could you please tell me whether I did anything wrong? I have also added the code snippet below:

Error:

[Offload] [MIC 0] [File]                    src/surf.cpp
[Offload] [MIC 0] [Line]                    82
[Offload] [MIC 0] [Tag]                     Tag 0
offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)

Code Snippet:

********************************

        Ipoint * t_ipts = &ipts[0];
        int size = ipts.size();
        //Ipoint ** arr_ipts = (Ipoint **)malloc(ipts.size()*sizeof(Ipoint *));
        int b = sizeof(Ipoint);
        uint8_t * mem_chunk = (uint8_t*) malloc(sizeof(Ipoint)*ipts.size());

        for (int i = 0; i < ipts.size(); i++)
        {
            //arr_ipts[i] = (Ipoint*) ;
            memcpy((mem_chunk + i*sizeof(Ipoint)), &ipts[i], sizeof(Ipoint));
        }
        float * tData = img->imageData;
        int w = img->width;
        int h = img->height;
        int step = img->widthStep; 
        printf ( "offload started \n" );
        // Main SURF-64 loop assigns orientations and gets descriptors
#pragma offload target(mic:0) in(tData:length(img->width*img->height)) inout(mem_chunk:length(size*b)) optional
        {
            Ipoint ** arr_ipts = (Ipoint **)malloc(size*sizeof(Ipoint *));
            for (int i = 0; i < ipts_size; ++i)
            {
                // Set the Ipoint to be described
                int idx = i;
                printf ( "offload ipoint \n" );
                arr_ipts[i] = (Ipoint*)(mem_chunk + i*sizeof(Ipoint));
                // Assign Orientations and extract rotation invariant descriptors
                getOrientation(tData, img->width, img->height, img->widthStep, arr_ipts[i]);
                //getDescriptor(img->imageData, img->width, img->height, img->widthStep,false);
            }
            free(arr_ipts);
        }
        for (int i = 0; i < ipts.size(); i++)
        {
            memcpy(&ipts[i], (mem_chunk + i*sizeof(Ipoint)), sizeof(Ipoint));
            //free(arr_ipts);
        }
        free(mem_chunk);
        

********************************

Regards

Rishab

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
1,416 Views

I think the issue is that you need to pass the values of img->width, img->height, and img->widthStep into the offload (and use those values inside it).

Also, as written, you do not need Ipoint** arr_ipts (though your actual code may require the array of pointers later). Note that you can construct the individual pointer with a cast and an index:

Ipoint* arr_ipts = (Ipoint*)mem_chunk; // pointer to base of array of Ipoints
...
...arr_ipts[i]...;

Then, if you continue to get SIGSEGV, you may have to check whether you have alignment expectations that are not met. If so, you may need to insert pad bytes between the Ipoint objects within the blob and then define an Ipoint_padded object with the Ipoint type as its base and the pad as the pad amount. Then use that type when copying into the blob and within the offload.

Jim Dempsey

0 Kudos
Rishab_G_
Beginner
1,416 Views

Thanks, Jim, for the suggestion. I turned img->width and img->height into scalars and that worked!
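
A minimal sketch of that fix, assuming the crash came from dereferencing the host pointer img inside the offload region. The Image type and sum_pixels function are hypothetical stand-ins. Everything the region needs is copied into local scalars and a plain data pointer first, and the pragma is guarded by __INTEL_OFFLOAD so the code also builds without offload support.

```c
/* Hypothetical stand-in for the thread's image type. */
typedef struct {
    int    width, height, widthStep;
    float *imageData;
} Image;

static float sum_pixels(const Image *img)
{
    /* Copy everything the offload needs into locals before the pragma. */
    float *tData = img->imageData;
    int w = img->width;
    int h = img->height;
    float sum = 0.0f;

#ifdef __INTEL_OFFLOAD
    #pragma offload target(mic:0) in(tData : length(w * h)) inout(sum)
#endif
    {
        /* Only tData, w, h and sum are referenced here, never img. */
        for (int i = 0; i < w * h; ++i)
            sum += tData[i];
    }
    return sum;
}
```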

0 Kudos