Cilk offloading

Abhishek_S_7
Beginner

Hi,

I am new to Cilk and am trying to write test code that offloads a Cilk section. The code is as follows:

[cpp]

#include <stdio.h>
#include <stdlib.h>
#include <cilk/cilk.h>
#define ARRAY_SIZE 10

typedef struct test{
    int x;
    float *v;
} TEST;

void _Cilk_shared InitTest(TEST *t, int n);
TEST *_Cilk_shared t;

void _Cilk_shared InitTest(TEST *t, int n)
{
    int i,j;

    for(j=0; j<n; j++)
        t->v[j] = n*j;
}

int main(int argc, char* argv[])
{
    int i, j, k;

    i = ARRAY_SIZE;
    if(argc == 2)
        sscanf(argv[1],"%d",&i);

    t = (TEST *_Cilk_shared)_Offload_shared_malloc(i*sizeof(TEST));

    for(j=0; j<i; j++)
        t[j].v =(float*)calloc(i,sizeof(float));

    _Cilk_offload _Cilk_for (j=0; j<i; j++)
    {
        InitTest(&t[j], i);
    }

    for(j=0; j<i; j++)
    {
        for(k=0; k<i; k++)
            printf("\t %f",t[j].v[k]);
    }

    return 0;
}

[/cpp]

But compiling it gives the following error:

cilkShared.c(37): error: pointer reference within _Cilk_offload _Cilk_for loop is not pointer-to-shared
InitTest(&t[j], i);
^

compilation aborted for cilkShared_ask.c (code 2)

Kindly help me to resolve the issue.

-Abhishek

Ravi_N_Intel
Employee

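Here are some changes to your code. The pointer t is now declared as a shared pointer to shared data, and each v array is allocated with _Offload_shared_malloc instead of calloc: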

#include <stdio.h>
#include <stdlib.h>
#include <cilk/cilk.h>
#define ARRAY_SIZE 10
typedef struct test{
    int x;
    float *v;
} TEST;
void _Cilk_shared InitTest(TEST *t, int n);
_Cilk_shared TEST *_Cilk_shared t;   /* shared pointer to shared data */
void _Cilk_shared InitTest(TEST *t, int n)
{
    int i,j;
    for(j=0; j<n; j++)
        t->v[j] = n*j;
}
int main(int argc, char* argv[])
{
    int i, j, k;
    i = ARRAY_SIZE;
    if(argc == 2)
        sscanf(argv[1],"%d",&i);
    t = (_Cilk_shared TEST *_Cilk_shared)_Offload_shared_malloc(i*sizeof(TEST));
    for(j=0; j<i; j++)
        t[j].v = (float *)_Offload_shared_malloc(i*sizeof(float));   /* shared allocation instead of calloc */
    _Cilk_offload _Cilk_for (j=0; j<i; j++)
    {
        InitTest(&t[j], i);
    }
    for(j=0; j<i; j++)
    {
        for(k=0; k<i; k++)
            printf("t %f",t[j].v[k]);
    }
    return 0;
}
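
The data layout used above (an array of TEST structs, each owning its own array of floats) can be exercised on the host alone to see which values the final loop should print. The following is only a serial sketch in plain C: malloc stands in for _Offload_shared_malloc, an ordinary for loop stands in for _Cilk_offload _Cilk_for, and the names init_test_host, build_table_host and free_table_host are made up for this illustration. It says nothing about how the offload itself behaves.

```c
#include <stdlib.h>

/* Same layout as the TEST struct in the thread. */
typedef struct test {
    int x;
    float *v;
} TEST;

/* Serial stand-in for InitTest: fills v[0..n-1] with n*j. */
static void init_test_host(TEST *t, int n)
{
    int j;
    for (j = 0; j < n; j++)
        t->v[j] = (float)(n * j);
}

/* Hypothetical helper: allocates n structs, each with its own array of
   n floats, then initializes them. Returns NULL on failure. */
static TEST *build_table_host(int n)
{
    int j;
    TEST *t;

    if (n <= 0)
        return NULL;
    t = malloc((size_t)n * sizeof(TEST));
    if (t == NULL)
        return NULL;
    for (j = 0; j < n; j++) {
        t[j].x = 0;
        t[j].v = malloc((size_t)n * sizeof(float));
        if (t[j].v == NULL) {
            /* Undo the allocations made so far. */
            while (j-- > 0)
                free(t[j].v);
            free(t);
            return NULL;
        }
    }
    /* Plain loop where the thread uses _Cilk_offload _Cilk_for. */
    for (j = 0; j < n; j++)
        init_test_host(&t[j], n);
    return t;
}

/* Releases everything allocated by build_table_host. */
static void free_table_host(TEST *t, int n)
{
    int j;
    if (t == NULL)
        return;
    for (j = 0; j < n; j++)
        free(t[j].v);
    free(t);
}
```

With n = 10, element t[j].v[k] holds 10*k, so every row runs from 0 to 90 in steps of 10.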

Abhishek_S_7
Beginner

Thanks Ravi

-Abhishek

mohamad_a_
Beginner

Dear All,

I am new to Cilk.

I ran the last code on my node and got the following error:

main.cpp(10): error: variable t may not be marked _Cilk_shared and have "target" attribute
  _Cilk_shared TEST *_Cilk_shared t;
                                  ^

main.cpp(11): error: routine InitTest may not be marked _Cilk_shared and have "target" attribute
  void _Cilk_shared InitTest(TEST *t, int n)

Would you please tell me what's wrong?

Best regards.

TaylorIoTKidd
New Contributor I

Please post the parts of your code that are relevant so that we can see the context of the errors instead of just giving you guesses.

Regards
--
Taylor
 

mohamad_a_
Beginner

Dear Taylor,

I used the same code as Ravi.

I compiled the code with icpc version 14.0.0.

All the best

Kevin_D_Intel
Employee

The code Ravi posted compiles successfully for me with all 14.0 compilers released to date and our latest 15.0 compiler.

Please double check that your code matches Ravi's and if so then please show the actual compiler command line you used along with the output of the command: icpc -V

mohamad_a_
Beginner

Thanks Kevin,

I checked the code again. It was the same.

The icpc -V:
Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 14.0.0.080 Build 20130728
Copyright (C) 1985-2013 Intel Corporation.  All rights reserved.

The command line:

icpc -c -g  -I/opt/intel/composer_xe_2013_sp1.0.080/mkl/include -offload-attribute-target=mic  -O3 -vec-report3 -restrict -fno-alias -fargument-noalias -ansi-alias -opt-report-phase=offload  -openmp  main.cpp

All the best

Kevin_D_Intel
Employee

OK. The error occurs because the command-line option -offload-attribute-target=mic applies the target attribute to everything in the source file (all variables and routines), including those already marked _Cilk_shared.

That option is not appropriate for the code Ravi posted.

mohamad_a_
Beginner

Thanks Kevin,

It works without the option.

I have a question: is there a report for Cilk offload, something like the offload report?

Kevin_D_Intel
Employee

The OFFLOAD_REPORT environment variable provides run-time details for _Cilk_offload/_Cilk_offload_to. The option -opt-report-phase=offload provides limited information on _Cilk_offload/_Cilk_shared directives.

What specific information in terms of a report are you interested in?

mohamad_a_
Beginner

Thanks Kevin,

When running the above code, I also have "export OFFLOAD_REPORT=2" in my Makefile.

Shouldn't I expect an offload report because of the following part of the code?

_Cilk_offload _Cilk_for (j=0; j<i; j++)
 {
 InitTest(&t[j], i);
 }

But I don't have any offload report here!

Kevin_D_Intel
Employee

Sorry, only OFFLOAD_REPORT=3 returns details for the virtual shared model. It appears we overlooked that detail in our User's Guide so we will correct that.
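
OFFLOAD_REPORT is an environment variable read at run time, so the value that matters is the one visible to the process that actually runs the program. As a quick sanity check, a program could read the variable itself before offloading. The sketch below uses only standard C; offload_report_level is a hypothetical helper written for this illustration and is not part of the offload run-time.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns the numeric value of OFFLOAD_REPORT as seen
   by this process, or 0 when the variable is unset, empty, negative,
   too large for an int, or not a plain decimal number. */
static int offload_report_level(void)
{
    const char *s = getenv("OFFLOAD_REPORT");
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 0;
    v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > INT_MAX)
        return 0;
    return (int)v;
}
```

A result below 3 would mean, per the reply above, that no virtual shared model details should be expected in the report.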

mohamad_a_
Beginner

Great,

Thanks a lot Kevin.

Now I have the following from OFFLOAD_REPORT=3:

[Offload] [HOST]  [State]   Initialize logical card 0 = physical card 0
[Offload] [MIC 0] [State]   MIC MYO shared table register t_41db1a6c422b8b58b0bd6ddde933a47f_myo_ptr
[Offload] [MIC 0] [State]   MIC MYO fptr table register op_fncall_b4e068a4da6dbef9373d8285e48e052a
[Offload] [MIC 0] [State]   MIC MYO fptr table register InitTest_1fd31af90aca81305f8009f418bb567d
[Offload] [HOST]  [State]   Initialize MYO
[Offload] [HOST]  [State]   MYO shared aligned malloc 8 8
[Offload] [HOST]  [State]   Register MYO tables
[Offload] [HOST]  [State]   MYO shared malloc 160
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 40
[Offload] [HOST]  [State]   MYO shared malloc 24
[Offload] [HOST]  [State]   MYO release
[Offload] [HOST]  [State]   MYO acquire
[Offload] [HOST]  [State]   MYO shared free
[Offload] [HOST]  [State]   Finalize MYO
[Offload] [MIC 0] [State]   Unregister data tables
[Offload] [HOST]  [State]   Unregister data tables

Sorry, but can we get even more detail, e.g. synchronization timing?

mohamad_a_
Beginner

Is there any other source of information for the virtual shared model besides the option -opt-report-phase=offload and setting OFFLOAD_REPORT=3?

Kevin_D_Intel
Employee

No.
 

mohamad_a_
Beginner

Thanks

All the best

mohamad_a_
Beginner

Dear All,

I am a little confused and do not completely understand the above report from OFFLOAD_REPORT=3, especially these two lines:

[Offload] [HOST]  [State]   MYO shared aligned malloc 8 8
[Offload] [HOST]  [State]   MYO shared malloc 24

Would you please help me out?

Kevin_D_Intel
Employee

Here's an annotation of the report from the 15.0 compiler for the source shown below.

#include <stdio.h>
#include <stdlib.h>
#include <cilk/cilk.h>
#define ARRAY_SIZE 10
typedef struct test{
    int x;
    float *v;
} TEST;
void _Cilk_shared InitTest(TEST *t, int n);
_Cilk_shared TEST *_Cilk_shared t;
void _Cilk_shared InitTest(TEST *t, int n)
{
    int i,j;
    for(j=0; j<n; j++)
        t->v[j] = n*j;
}
int main(int argc, char* argv[])
{
    int i, j, k;
    i = ARRAY_SIZE;
    if(argc == 2)
        sscanf(argv[1],"%d",&i);
    t = (_Cilk_shared TEST *_Cilk_shared)_Offload_shared_malloc(i*sizeof(TEST));
    for(j=0; j<i; j++)
        t[j].v = (float *)_Offload_shared_malloc(i*sizeof(float));
    _Cilk_offload _Cilk_for (j=0; j<i; j++)
    {
        InitTest(&t[j], i);
    }
    fflush(0);
    for(j=0; j<i; j++)
    {
        for(k=0; k<i; k++)
            printf("t %f",t[j].v[k]);
    }
    return 0;
}

Some data table entries displayed by the 14.0 compiler/run-time that were not generally useful were removed in the current release.

$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.0.090 Build 20140723
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

$ icpc u401262.cpp
$ export OFFLOAD_REPORT=3
$ ./a.out

// Allocation associated with source line 10
// 10:    _Cilk_shared TEST *_Cilk_shared t;

[Offload] [HOST]          [State]           MYO shared aligned malloc 8 8

// Internal MYO tables that support virtual shared memory

[Offload] [HOST]          [State]           Register MYO tables

// Corresponds to malloc on line 23
//   23:  t = (_Cilk_shared TEST *_Cilk_shared)_Offload_shared_malloc(i*sizeof(TEST));

[Offload] [HOST]          [State]           MYO shared malloc 160

// The next malloc series corresponds to the for loop at lines 24-25 including the shared_malloc 

[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40
[Offload] [HOST]          [State]           MYO shared malloc 40

// Acquire the coprocessor and initialize for use

[Offload] [HOST]          [State]           Initialize logical card 0 = physical card 0

// Initialize Virtual shared run-time - shared memory allocation and other
// bookkeeping for shared memory management

[Offload] [HOST]          [State]           Initialize MYO

// Internal allocation for passing information for Cilk_shared Cilk_for execution on the coprocessor

[Offload] [HOST]          [State]           MYO shared malloc 24

// At every offload, the host notifies the coprocessor that shared memory can be synced up TO the coprocessor (MYO release),
// and after the offload the host memory is synced up FROM the coprocessor (MYO acquire)

[Offload] [HOST]          [State]           MYO release
[Offload] [HOST]          [State]           MYO acquire

// Free previous internal allocation (malloc 24)

[Offload] [HOST]          [State]           MYO shared free

// User program output per for loop at lines 32-34

t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000
t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000
t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000
t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000
t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000
t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000
t 20.000000t 30.000000t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000
t 40.000000t 50.000000t 60.000000t 70.000000t 80.000000t 90.000000t 0.000000t 10.000000t 20.000000t 30.000000t 40.000000t 50.000000
t 60.000000t 70.000000t 80.000000t 90.000000

// Finalize MYO execution with host program termination

[Offload] [HOST]          [State]           Finalize MYO
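
The byte counts in the report can be cross-checked against the source. On a typical 64-bit Linux target, TEST occupies 16 bytes (a 4-byte int, 4 bytes of padding, an 8-byte pointer), so the array of 10 structs needs 160 bytes and each array of 10 floats needs 40 bytes. Reading "8 8" as the size and alignment of the pointer t is an assumption, and the internal 24-byte allocation cannot be derived this way. The helper names in this sketch are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Same layout as the TEST struct in the thread. */
typedef struct test {
    int x;
    float *v;
} TEST;

/* Expected size of the _Offload_shared_malloc(i*sizeof(TEST)) request. */
static size_t struct_array_bytes(size_t n)
{
    return n * sizeof(TEST);
}

/* Expected size of each _Offload_shared_malloc(i*sizeof(float)) request. */
static size_t float_array_bytes(size_t n)
{
    return n * sizeof(float);
}

/* Prints the expected sizes so they can be compared with the report. */
static void print_expected_sizes(size_t n)
{
    printf("pointer t:        %zu bytes\n", sizeof(TEST *));
    printf("array of TEST:    %zu bytes\n", struct_array_bytes(n));
    printf("each float array: %zu bytes, allocated %zu times\n",
           float_array_bytes(n), n);
}
```

Calling print_expected_sizes(10) on the build machine shows whether the 8, 160 and 40 in the report match the local type sizes.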

 
