
Data persistence on MIC and "nocopy"

Dorian_C_1
Novice
652 Views

Hi all!

I am trying to keep transfers between host and target as small as possible by using data persistence with alloc_if/free_if tags in the offload model.

When using "inout", "in" or "out" (that is, when the data is transferred in either direction), everything is fine. With the "nocopy" clause, however, the program crashes in coi_host.dll (0xC0000005: Access violation reading location 0x0000000000000000).
I guess this is because the data is not found even though it was declared with ALLOC previously...

The pointer to the data I reuse is not defined at file scope. As I understand the following article, however, data managed by alloc_if/free_if is heap-allocated and therefore persistent, so there should be no need to declare it "static" at file scope:
https://software.intel.com/en-us/articles/effective-use-of-the-intel-compilers-offload-features

My code is something like this:
 

void alloc(float* data, int32_t size){
    #pragma offload_transfer target(mic) in(data:length(size) ALLOC)
}

void compute(float* data, int32_t size){
    #pragma offload target(mic) nocopy(data:length(size) REUSE) //When using "in"/"out"/"inout" here, no crash
    {
        //some computation to fill "data"
    }
}

int main(){
     int32_t size = 10;
     float* data = new float[size];
     alloc(data, size);
     compute(data, size);
}

I guess using the Cilk API instead of the offload API could solve my problem, because data is duplicated in the background at the same virtual addresses. If possible, though, I would prefer to keep this model and not recode everything...

Any insights?

9 Replies
Dorian_C_1
Novice

I may have found a solution:

The behavior is a little odd to me, but it seems that we cannot use "nocopy" with REUSE.
Instead, we need to use "in" to make the pointer available in the Xeon Phi, and set the "length" to 0 to prevent any data from being copied, like this:

#pragma offload target(mic) in(data:length(0) REUSE)

"nocopy" would have been more logical but at least it seems to be working...
(Note that it works also with "out" and "inout", which is also why the logic eludes me)

Gregg_S_Intel
Employee

There is no guarantee that the pointer data in compute() is the same as the pointer data in alloc(), so these are two separate offloads.

Dorian_C_1
Novice

If I declare the pointer data for the MIC target like this in the main function, will that guarantee the pointer is the same?

#pragma offload_attribute(push, target(mic))
float* data;
#pragma offload_attribute(pop)

 

Gregg_S_Intel
Employee

Never mind, I was thinking of an older restriction which no longer applies. The solution you found is correct.

As written, the offloads could go to different MIC cards, which would fail. I suggest specifying target(mic:0).

 

 

 

jimdempseyatthecove
Honored Contributor III

The pointer data would have to be in global scope and NOT passed as an argument across an offload.

If (when) multiple data arrays need to be managed and passed as arguments by the host, then you need to perform the allocations on the MIC and return the pointer to the host (for use in subsequent calls). Subsequent offloads then pass that pointer into the MIC. There are examples of this in Reinders' book.

 Jim Dempsey

Rajiv_D_Intel
Employee

The general rule when dealing with persistent dynamic data is that when a pointer is used for the first time on MIC, it needs to receive the MIC data address from the host; otherwise it will be uninitialized. The address may be received through a new memory allocation or by using IN with a data length of 0.

Statically allocated pointers (i.e. file-scope pointers) persist for the life of the program. Once assigned on MIC, they contain valid pointer values until program termination.

When using a pointer function parameter, the first use of the parameter must use "IN" to get the MIC data address into it.

Similarly, when data is allocated on MIC using pointer p, and p is assigned to another pointer variable q on the CPU, the first use of q on MIC should use "IN"; otherwise q will be uninitialized on MIC. This applies whether q is a static variable or an automatic variable.

The example below illustrates the two situations. The commented-out pragmas will cause a program error. The uncommented pragmas lead to successful execution.

#include <stdio.h>
#include <stdlib.h>

void foo(int *p)
{
        // First use of p, must use "IN"
        //#pragma offload target(mic) nocopy(p:alloc_if(0) free_if(0))
        #pragma offload target(mic) in(p[0:0]:alloc_if(0) free_if(0))
        {
                printf("p=%p, p[0]=%d\n", p, p[0]);
                fflush(0);
        }

        // Second use of p, OK to use nocopy
        #pragma offload target(mic) nocopy(p:alloc_if(0) free_if(0))
        {
                printf("p=%p, p[0]=%d\n", p, p[0]);
                fflush(0);
        }
}

int main()
{
        int *p;
        int *q;

        p = malloc(sizeof(int));
        p[0] = 55;

        // First use of p, must use "IN"
        #pragma offload target(mic) in(p[0:1]: alloc_if(1) free_if(0))
        {
                printf("p=%p, p[0]=%d\n", p, p[0]);
                fflush(0);
        }

        q = p;

        // First use of q, must use "IN"
        //#pragma offload target(mic) nocopy(q: alloc_if(0) free_if(0))
        #pragma offload target(mic) in(q[0:0]: alloc_if(0) free_if(0))
        {
                printf("q=%p, q[0]=%d\n", q, q[0]);
                fflush(0);
        }

        // Second use of q, OK to use nocopy
        #pragma offload target(mic) nocopy(q: alloc_if(0) free_if(0))
        {
                printf("q=%p, q[0]=%d\n", q, q[0]);
                fflush(0);
        }
        foo(p);
        return 0;
}

 

Dorian_C_1
Novice

Thank you all for your help!

Gregg, thank you for the clarification about mic:0. I have only one MIC, so I had no problem, but you are right: this code is not scalable!
Thanks!

Jim, your solution seems to be the most manual and the safest to use. If we allocate the memory on the MIC with "malloc" or "new" and also release it on the MIC with "free" or "delete", it makes sense that the data will persist on the MIC for as long as we need it. This code would then work, I guess:
 

void alloc(float* data, int32_t size){
     //Allocate the memory on MIC (syntax from Intel's tutorial)
     #pragma offload target(mic:0) nocopy(data)
     {
          data = (float *)malloc(size*sizeof(float));
     }
     //If needed, initialize the memory using a transfer
     #pragma offload_transfer target(mic:0) in(data:length(size) alloc_if(0) free_if(0))
 }
void compute(float* data, int32_t size){
     #pragma offload target(mic:0) nocopy(data)
     {
         for(int i=0;i<size;i++) data=???
     }
}

Rajiv, in my case file-scope pointers are not a valid solution, because this code is part of a larger C++ code base that maintains the state of the data using objects. I guess I could rewrite it to make the parts shared with the MIC static, but that would increase the risk of bugs (any object could access the memory of other objects, for example).
About your code proposal, do you mean that in each function using the data pointer, we have to treat the first call and the following calls differently? Can't I just handle the first call and the following calls in different functions, since the pointer value is the same?

 

Rajiv_D_Intel
Employee

The code you've written above will not work. There are several problems with it:

1. The variables named "data" in the two functions "alloc" and "compute" are unrelated. They are in effect separate local variables of the two functions.

2. In the function "alloc", after you allocate "data" on MIC with malloc, you will not be able to transfer data into it using the pragma. For MIC-local allocations done using malloc you must use the "targetptr" feature of the #pragma offload syntax. However, I don't think you should go there.

Your best bet is to continue allocating on the CPU, and doing in(p[0:0] REUSE) at first use within any function. Later calls may also use in(p[0:0] REUSE) safely, but only the first pragma in any new context (such as a new function) must use IN.
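As a rough sketch of that recommendation, the original alloc/compute pair could be reshaped as below. The ALLOC, REUSE and FREE macros are assumed to expand to the usual alloc_if/free_if pairs, and the function names, the target(mic:0) device number and the computation itself are hypothetical. A compiler without Intel offload support will typically ignore the pragmas and run everything on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed expansions of the ALLOC/REUSE/FREE shorthand used in this thread. */
#define ALLOC alloc_if(1) free_if(0)
#define REUSE alloc_if(0) free_if(0)
#define FREE  alloc_if(0) free_if(1)

/* Allocate the buffer on the coprocessor and copy the host contents once. */
void alloc_on_mic(float *data, int32_t size)
{
    assert(data != NULL && size > 0);
    #pragma offload_transfer target(mic:0) in(data : length(size) ALLOC)
}

/* "data" is a function parameter, so its first use in this function goes
 * through "in" with length(0): the address is passed, no elements are copied. */
void compute_on_mic(float *data, int32_t size)
{
    #pragma offload target(mic:0) in(data : length(0) REUSE)
    {
        for (int32_t i = 0; i < size; ++i)
            data[i] = 2.0f * (float)i;   /* hypothetical computation */
    }
}

/* Copy the results back to the host and release the coprocessor buffer. */
void fetch_and_free(float *data, int32_t size)
{
    #pragma offload_transfer target(mic:0) out(data : length(size) FREE)
}
```

A caller would invoke alloc_on_mic, compute_on_mic and fetch_and_free in that order on the same host pointer. Pinning every pragma to mic:0 keeps all three offloads on the same card.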

0 Kudos
Dorian_C_1
Novice
652 Views

Thank you Rajiv.

Right now, I do not seem to experience any problems using "nocopy" on first use when I allocate the memory like this:

#pragma offload_transfer target(mic) nocopy(data:length(size) alloc_if(1) free_if(0))

But just to be on the safe side, I will use "in" before using "nocopy" in each function.
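A minimal sketch of that plan, assuming the buffer is reserved with nocopy as above and then filled on the coprocessor. The function names, the mic:0 device number and the arithmetic are hypothetical and only illustrate the order of the clauses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reserve coprocessor memory for "data" without transferring any elements. */
void reserve_on_mic(float *data, int32_t size)
{
    assert(data != NULL && size > 0);
    #pragma offload_transfer target(mic:0) nocopy(data : length(size) alloc_if(1) free_if(0))
}

void fill_then_scale(float *data, int32_t size)
{
    /* First use of the parameter in this function: "in" with length(0). */
    #pragma offload target(mic:0) in(data : length(0) alloc_if(0) free_if(0))
    {
        for (int32_t i = 0; i < size; ++i)
            data[i] = 1.0f;
    }

    /* Later use in the same function: "nocopy" is enough. */
    #pragma offload target(mic:0) nocopy(data : alloc_if(0) free_if(0))
    {
        for (int32_t i = 0; i < size; ++i)
            data[i] *= 3.0f;
    }
}

/* Copy the results back and free the coprocessor buffer. */
void retrieve_and_release(float *data, int32_t size)
{
    #pragma offload_transfer target(mic:0) out(data : length(size) alloc_if(0) free_if(1))
}
```

The extra in(... length(0) ...) transfers no elements, so the safeguard should add little overhead compared with nocopy alone.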
