Hello all,
I am optimizing an OpenCL kernel that performs a very simple task (the core of the kernel is about 10 lines long and contains no complex branching). When I use the "num_simd_work_items" attribute, aoc prints the following warning:
--- Quote Start ---
Compiler Warning: Vectorized kernel contains loads/stores that cannot be vectorized. This might reduce performance.
--- Quote End ---
The result is higher resource usage and lower throughput, limited by global memory. I'm trying to understand what prevents the compiler from vectorizing the loads/stores.
- My code looks like this:
#ifdef ALTERA_CL
__attribute__((num_simd_work_items(4)))
__attribute__((reqd_work_group_size(256,1,1)))
#endif
__kernel void kernel( __global const uchar * restrict input, __global uchar * restrict output) {
unsigned int gx = get_global_id(0);
// Load to private mem
__private float tempIn = (float) input[gx];
__private uchar tempOut;
// Do stuff
tempOut = f(tempIn);
// Store to global mem
output[gx] = tempOut;
}
I also tried to vectorize by hand (see code below). The application worked on the CPU, but aoc produced walls of errors from llvm.
- Manually vectorized code, works on CPU:
#ifdef ALTERA_CL
__attribute__((reqd_work_group_size(256,1,1)))
#endif
__kernel void melate( __global const uchar * restrict input, __global uchar * restrict output) {
unsigned int gx = get_global_id(0)*4;
// Private mem
__private float4 tempIn = convert_float4( vload4 (0, input+gx));
__private uchar4 tempOut;
// Do stuff
tempOut = f(tempIn);
// Store to global mem
vstore4(tempOut, 0, output+gx);
}
- The part I suspect is causing errors with llvm :
__private float4 tempVar;
__private float4 ot = (255, 255, 255, 255);
__private float4 ut = (0, 0, 0, 0);
tempVar = (tempIn > TRS) ? ot : ut;
Any help with these two problems will be greatly appreciated. :)
koper
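One thing worth ruling out in the suspected snippet, offered as a guess and not a confirmed diagnosis: in OpenCL C a vector literal needs its type in front, so (255, 255, 255, 255) is parsed as a C comma expression whose value is its last operand. The sketch below writes the binarization with typed literals and the built-in select. The helper names binarize4 and binarize_lane and the trs parameter (standing in for TRS) are illustrative assumptions, and the host-side branch exists only so the per-lane behaviour can be checked with an ordinary C compiler.

```c
#ifdef __OPENCL_VERSION__
/* Device code: only seen by an OpenCL compiler. */
uchar4 binarize4(float4 tempIn, float trs)   /* hypothetical helper */
{
    /* Typed vector literals: one scalar is replicated to all four lanes. */
    const float4 ot = (float4)(255.0f);
    const float4 ut = (float4)(0.0f);

    /* On float4 operands, > yields an int4 with -1 in the lanes where it holds. */
    int4 mask = tempIn > (float4)(trs);

    /* select(a, b, c) takes b in lanes where the MSB of c is set, else a. */
    float4 tempVar = select(ut, ot, mask);
    return convert_uchar4(tempVar);
}
#else
/* Host build: scalar reference for what a single lane should produce. */
typedef unsigned char uchar;
uchar binarize_lane(float in, float trs)     /* hypothetical helper */
{
    return in > trs ? 255 : 0;
}
#endif
```

Nothing in the thread confirms that this is what triggered the llvm errors, so treat it as one candidate to eliminate.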
It's hard to tell without seeing what's in the auxiliary function 'f'. With a fairly trivial implementation of 'f' I don't see the warning using 13.1.3 (13.1 with update 3). You might be using older tools, and this may be a bug that has since been fixed, so I would double-check the version number of the compiler by running aoc --version. 13.1.3 is build 178, and there is now an update 4 available on the Altera download site (see the forum sticky for more info).
The 'f' function performs a simple binarization, with a few extra conditions around it.
It turns out the warning was due to a test on the "gx" variable I use for the work-item IDs, which is not included in the code I posted. I had this at the top of the kernel:
if(gx < startID)
return;
After I got rid of that test, the warning was gone too. However, you are right that I am not using the latest build (aoc --version gave me build 162)! Time to update :) and check whether the llvm error still occurs. Thanks for your reply.
I would try the upgrade first: with update 3 I had a hard time creating a case that triggers that warning. The early return leaves some of the vector lanes with nothing to do, so that is most likely why you saw the warning. The compiler may since have been modified to let those work-items flow through to the end of the kernel by inserting conditionals behind the scenes, which would keep the vectorization intact and efficient. After you upgrade you will be able to check whether that is the case. If it is not, you would need to refactor the code a bit so that all the work-items exit the kernel in the same spot (at the end).
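A refactoring where every work-item reaches the end of the kernel might look like the sketch below. It assumes startID is passed as a kernel argument and uses a made-up threshold for f, since neither is shown in the thread. The name melate_guarded and the host-only shim at the top are also assumptions; the shim is there just so the kernel can be exercised one work-item at a time with a plain C compiler.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption for testing, not part of any SDK):
   lets the kernel below compile as C99 and run one work-item at a time. */
#include <stddef.h>
typedef unsigned char uchar;
#define __kernel
#define __global
static size_t host_gid;   /* ID of the emulated work-item */
static size_t get_global_id(unsigned int dim) { (void)dim; return host_gid; }
#endif

/* Hypothetical stand-in for the thread's f(): binarize at 128. */
uchar f(float v) { return v > 128.0f ? 255 : 0; }

#ifdef ALTERA_CL
__attribute__((num_simd_work_items(4)))
__attribute__((reqd_work_group_size(256,1,1)))
#endif
__kernel void melate_guarded(__global const uchar * restrict input,
                             __global uchar * restrict output,
                             unsigned int startID)
{
    unsigned int gx = get_global_id(0);

    /* No early return: every work-item loads and computes. */
    uchar result = f((float) input[gx]);

    /* Only the store is conditional, so all work-items
       leave the kernel at the closing brace. */
    if (gx >= startID) {
        output[gx] = result;
    }
}
```

Whether a particular aoc build vectorizes the conditional store is not established in this thread, so the compiler report is still the thing to check after such a change.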
