Hello! I have been experimenting more with offloading to the GFX target, but something odd is going on. Let's say I have this code:
__declspec(target(gfx)) void function(double *vector) {
#pragma parallel_loop
//do something with the vector in a parallel loop
}
Now when I try to compile the code using: icl test.c /Qoffload /Qstd=c99 /Qopenmp I get the following error:
a statement with parallel loop pragma must also have an offload target(gfx) pragma before the for loop in the function
What is going on? Some time ago I managed to compile this code without any issues.
Thank you
As the error says, "an offload target(gfx) pragma before the for loop", I suppose you should add that explicit pragma too.
Hi! Thanks for the answer, but no, that is not correct. You cannot put a #pragma offload in a function that is already declared as __declspec(target(gfx)). I have tried that and it does not work.
I did find the solution, though. The Intel compiler had updated itself with service pack 1, and that update was the problem. I reinstalled the whole compiler and after that it worked.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I responded too quickly. The compiler is indeed correct. I'm told by the language designer that #pragma offload target(gfx) is indeed required to mark the parallel loop as an "offload region". The declspec does not do that. It merely states that the function should be compiled to run on the GFX target.
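To illustrate the distinction, here is a minimal sketch of a function whose own loop is the offload region, so the pragma pair sits directly before the for statement and the function carries no declspec. The name scale_vector, the SIZE constant, and the loop body are hypothetical, and the pin clause simply mirrors the one used elsewhere in this thread; I have not verified it against a particular compiler release. The pragmas are guarded so that other compilers build a plain host loop.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 8  /* hypothetical vector length, not from the original post */

/* The loop in this function is itself the offload region, so the
   offload/parallel_loop pragma pair goes right before the for statement.
   No __declspec(target(gfx)) is used on the function here. */
void scale_vector(double *vector)
{
    assert(vector != NULL);  /* host-side sanity check */

#if defined(__INTEL_COMPILER)
#pragma offload target(gfx) pin(vector : length(SIZE))
#pragma parallel_loop
#endif
    for (int i = 0; i < SIZE; i++) {
        vector[i] *= 2.0;  /* placeholder work: double each element */
    }
}
```

With a compiler that does not define __INTEL_COMPILER the pragmas are skipped and the loop simply runs on the host, which is handy for checking the logic before offloading.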
Sorry, you've hit a bug in the current compiler. It will be fixed in the next update.
I agree with you that #pragma offload is needed, but what I want is slightly different.
Let's say I have a main function:
int main(void) {
    double *vector;
    ....
    #pragma offload target(gfx) if(do_offload) \
        pin(vector : length(SIZE))
    #pragma parallel_loop
    for(int i = 0; i < 10; i++) function(vector);
    ...
}
Then, in the function, I do not need to put #pragma offload, because main already has it. I just need the function to be declared as compilable for GFX, that's all. I apologize for not being more specific.
I have a main function that does the offloading and calls a function named "function". That function must therefore have __declspec(target(gfx)), which makes it compilable for the GPU, but inside it I do not need another #pragma offload target(gfx). I have tried adding one and the compiler complains, because I am already doing the offloading in main. I just need to be able to call functions that are GPU compatible. As I said earlier, I reinstalled the compiler and now it works without adding #pragma offload inside the function.
Thanks,
Thom
To say a bit more....
If you want to call the function from an offload region, then it needs to be declared with the target(gfx) attribute but must not have the #pragma parallel_loop in its body, e.g.
__declspec(target(gfx)) void function(double *vector) {
    // do something with the vector in a parallel loop
}
...
#pragma offload ..
#pragma parallel_loop
for (...) {
    ...
    function(vec);
}
If you want the body of the function to be the offload region then it is
void function(double *vector) {
    #pragma offload
    #pragma parallel_loop
    // do something with the vector in a parallel loop
}
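For completeness, the first pattern (a function called from inside an offload region) might look like the sketch below when written out in full. The names square_element, square_all, GFX_CALLABLE, and SIZE are hypothetical, and the if/pin clauses are copied from the main function posted earlier in this thread rather than checked against the compiler documentation. On compilers other than Intel's the macro and the pragmas drop out and everything runs on the host.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 8  /* hypothetical vector length */

#if defined(__INTEL_COMPILER)
#define GFX_CALLABLE __declspec(target(gfx))  /* also compile for the GFX target */
#else
#define GFX_CALLABLE                           /* host-only fallback */
#endif

/* Called from inside the offload region: it carries the target attribute
   but has no offload or parallel_loop pragma of its own. */
GFX_CALLABLE void square_element(double *vector, int i)
{
    vector[i] = vector[i] * vector[i];
}

/* Host-side driver: the pragma pair before the for loop marks the
   offload region, and the loop body calls the GFX-callable function. */
void square_all(double *vector, int do_offload)
{
    assert(vector != NULL);
    (void)do_offload;  /* only referenced by the pragma below */

#if defined(__INTEL_COMPILER)
#pragma offload target(gfx) if (do_offload) pin(vector : length(SIZE))
#pragma parallel_loop
#endif
    for (int i = 0; i < SIZE; i++) {
        square_element(vector, i);
    }
}
```

Keeping the per-element work in the attributed function and the pragmas in the caller matches the split described above: the attribute says where the code may run, the pragmas say which loop is actually offloaded.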
OK! Now this is weird. When I tried it without #pragma parallel_loop, it did not work. Too bad I reinstalled the compiler; I should have taken some screenshots and sent them to you. Oh well, now it works both with and without the pragma. Before I reinstalled the compiler, though, it gave me errors related to environment settings and MIC incompatibilities. I promise that the next time it happens I will send you the screenshots.
I have also attached the code snippets that show you can add #pragma parallel_loop inside the code and compile it.