
Problem Splitting nocopy Offload Pragmas Across Different Functions/Files

Jiawen_L_
Beginner

Hi everyone,

I ran into a problem after moving the kernel out of the original function void axpy() into a separate function void f(). To profile each phase (in/out/inout) separately, I use nocopy clauses to split the offload into stages. When I run the code below, I get this error: offload error: process on the device 0 was terminated by signal 11 (SIGSEGV). If I put the kernel directly inside axpy() instead (the commented-out code), the error disappears. Does this mean the lifetime of the hash table entries on the Xeon Phi ends at the function boundary, or is there another reason? Is there another way to use nocopy (allocate -> copy to MIC -> run kernel -> copy result back to host -> deallocate) when the stages live in separate functions and separate files?

I am still looking forward to any solutions or suggestions online.

Best wishes,
Jiawen 

Error:

[liu@fornax Test_offomp]$ ./axpy
offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)

Code:

#include <stdio.h>
#include <stdlib.h>   /* malloc, free, drand48 */
#include <offload.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define REAL float


///////////////////axpy

REAL *x;
REAL *y;
//
static void init(REAL *A, long n);

void f(REAL *x, REAL *y, int nn, int s, int a){

int n,i;
#pragma offload target(mic:0) nocopy (x: length(s) alloc_if(0) free_if(0)) \
                                nocopy (y: length(s) alloc_if(0) free_if(0)) 
#pragma omp parallel for
    for(n=0; n<nn; n++)
    {
    for (i=0; i<s; i++)
        {
            y[i] = x[i] * a + y[i];
        }
    }
}

void axpy()
{
    #define NTIMES 2000
    #define SIZE 500000
    #define FACTOR 1
    int i,n;
    int array_sent = 0;
    init(x, SIZE);
    init(y, SIZE);
    REAL c = y[0];
    double start_timer = omp_get_wtime();

    double alloc_time = omp_get_wtime();
    #pragma offload target(mic:0) nocopy (x: length(SIZE) alloc_if(1) free_if(0)) \
                               nocopy (y: length(SIZE) alloc_if(1) free_if(0)) 
    {
    }
    alloc_time = omp_get_wtime() - alloc_time;

    double copy_time = omp_get_wtime();
    #pragma offload target(mic:0) in (x: length(SIZE) alloc_if(0) free_if(0)) \
                                in (y: length(SIZE) alloc_if(0) free_if(0)) 
    {
    }
    copy_time = omp_get_wtime() - copy_time;

    double kernel_time = omp_get_wtime();
    /*#pragma offload target(mic:0) nocopy (x: length(SIZE) alloc_if(0) free_if(0)) \
                                nocopy (y: length(SIZE) alloc_if(0) free_if(0)) 
    //#pragma omp parallel for
    for(n=0; n<NTIMES; n++)
    {
    for (i=0; i<SIZE; i++)
        {
            y[i] = x[i] * FACTOR + y[i];
        }
    }
        */

    f(x,y,NTIMES,SIZE,FACTOR);

    kernel_time = omp_get_wtime() - kernel_time;
    double free_time = omp_get_wtime();
    #pragma offload target(mic:0) nocopy (x:  alloc_if(0) free_if(1)) \
                                out (y: length(SIZE) alloc_if(0) free_if(1)) 
    {
    }
    free_time = omp_get_wtime() - free_time;

 

    double walltime = omp_get_wtime() - start_timer;
    if(c != y[0]) printf("Copy back to host successfully!\n\n");
    printf("PASS axpy\n\n");
    printf("Alloc time = %.8f sec\n\n", alloc_time);
    printf("Copy time = %.8f sec\n\n", copy_time);
    printf("Kernel time = %.8f sec\n\n", kernel_time);
    printf("Free time = %.8f sec\n\n", free_time);
    printf("Total time = %.8f sec\n\n", walltime);
}

static void init(REAL *A, long n) {
    long i;
    for (i = 0; i < n; i++) {
        A[i] = ((REAL) (drand48()) + 13);
    }
}

int main(void)

{

    check_devices();
    x = (REAL*)malloc(sizeof(REAL)*SIZE);
    y = (REAL*)malloc(sizeof(REAL)*SIZE);
    axpy();
    free(x);
    free(y);
}

Ravi_N_Intel
Employee

For the offload in f(), you need to send the pointer to the previously allocated memory. So instead of using nocopy, use in with length(0).

Change:
#pragma offload target(mic:0) nocopy (x: length(s) alloc_if(0) free_if(0)) \
                                nocopy (y: length(s) alloc_if(0) free_if(0)) 

To:
#pragma offload target(mic:0) in(x: length(0) alloc_if(0) free_if(0)) \
                              in(y: length(0) alloc_if(0) free_if(0))
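
A minimal sketch of how this change could look when the stages live in separate functions. The helper names (device_alloc_and_copy, axpy_kernel, device_fetch_and_free) and the merged allocate-plus-copy step are assumptions for illustration, not part of the original code. The likely reason the original fails is that x and y inside f() are local pointer parameters, and nocopy does not set their values on the coprocessor; in with length(0) transfers no data but points them at the buffers allocated earlier. The sketch also moves the OpenMP pragma to the inner loop so that iterations do not write the same y[i] concurrently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: allocate the device buffers and copy the inputs once.
   alloc_if(1) free_if(0) keeps the buffers alive after this offload ends. */
void device_alloc_and_copy(float *x, float *y, int len)
{
    #pragma offload target(mic:0) in(x: length(len) alloc_if(1) free_if(0)) \
                                  in(y: length(len) alloc_if(1) free_if(0))
    { }
}

/* Kernel in its own function. x and y are local pointers here, so they are
   passed with in(... length(0)): no data moves, only the pointer values are
   re-associated with the existing device buffers. */
void axpy_kernel(float *x, float *y, int ntimes, int len, float a)
{
    assert(len > 0);
    #pragma offload target(mic:0) in(x: length(0) alloc_if(0) free_if(0)) \
                                  in(y: length(0) alloc_if(0) free_if(0))
    {
        for (int n = 0; n < ntimes; n++) {
            /* Parallelize over i only; each thread owns distinct y[i]. */
            #pragma omp parallel for
            for (int i = 0; i < len; i++)
                y[i] = x[i] * a + y[i];
        }
    }
}

/* Hypothetical helper: copy y back to the host and release both buffers. */
void device_fetch_and_free(float *x, float *y, int len)
{
    #pragma offload target(mic:0) nocopy(x: length(len) alloc_if(0) free_if(1)) \
                                  out(y: length(len) alloc_if(0) free_if(1))
    { }
}
```

Each helper can be wrapped in omp_get_wtime() calls, as in the original axpy(), to time the phases separately. A compiler without offload support ignores the offload pragmas and runs everything on the host, which is handy for checking the arithmetic.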

 

Paulius_V_1
Beginner

Ravi, I posted a very similar question earlier, but for OpenMP offload. Right now I allocate only if the current iteration is 0. It works fine for the first iteration, but after that it segfaults. It is basically the same issue as here. Could you tell me the equivalent for OpenMP? Thanks.

Ravi_N_Intel
Employee

I need to see your code before I can tell you what you need to do.  OpenMP offload syntax and semantics are different.
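
For orientation only, here is a rough sketch of how the same staged pattern might be written with OpenMP target directives, assuming a compiler that supports OpenMP 4.5 (target enter data / target exit data). It is not based on Paulius's code, which was never posted, and the function names axpy_target and run_staged are hypothetical. The idea is that data mapped by enter data stays present on the device, so the map clauses on the kernel find the existing buffers and do not copy again.

```c
#include <assert.h>

/* Hypothetical kernel in its own function. Because x[0:len] and y[0:len]
   are already present on the device when this is called from run_staged,
   the map clauses here reuse the existing buffers without a transfer. */
void axpy_target(float *x, float *y, int ntimes, int len, float a)
{
    assert(len > 0);
    for (int n = 0; n < ntimes; n++) {
        #pragma omp target teams distribute parallel for \
                map(to: x[0:len]) map(tofrom: y[0:len])
        for (int i = 0; i < len; i++)
            y[i] = x[i] * a + y[i];
    }
}

/* Hypothetical driver: allocate -> copy in -> run -> copy out -> free. */
void run_staged(float *x, float *y, int ntimes, int len, float a)
{
    /* Allocate device storage only; no data is copied yet. */
    #pragma omp target enter data map(alloc: x[0:len], y[0:len])

    /* Copy the host values into the existing device buffers. */
    #pragma omp target update to(x[0:len], y[0:len])

    axpy_target(x, y, ntimes, len, a);

    /* Copy the result back, then release the device buffers. */
    #pragma omp target update from(y[0:len])
    #pragma omp target exit data map(delete: x[0:len], y[0:len])
}
```

Each directive in run_staged can be timed on its own, which mirrors the alloc/copy/kernel/free split in the original post. Without an offload device the directives fall back to the host and the result is the same.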
