Hi,
==========================================================
void tempA( ...) {...};
void tempB( ...) {...};

void processing(global int *a) {
    if (a == 0)
        tempA(a);
    else
        tempB(a);
}

__attribute__((num_simd_work_items(2)))
__attribute__((reqd_work_group_size(256,1,1)))
kernel void test(__global int *a)  // NDR, globalsize = a / 2, initial a[0~N] = 1
{
    int gid = get_global_id(0);
    for (int i = 0; i < 2; i++) {
        while (a[gid + i] == 0)
            processing(&a[gid + i]);
    }
}
===========================================================
The code above is what I was trying. The compiler reported: "Compiler Warning: Kernel Vectorization: branching is thread ID dependent ... cannot vectorize." How can I solve or explain this? Also, a while loop with an unpredictable exit condition is not friendly to vectorization and is very inefficient, right?
Thanks.
3 Replies
It means that one of your branches is thread-ID dependent. The following section
    while (a[gid + i] == 0)
        processing(&a[gid + i]);
is thread-ID dependent. The Best Practices Guide says to avoid work-item-dependent backward branching.
Thanks okebz,
So, if I write it as follows, is it the same thing?
===========================================
void tempA( ...) {...};
void tempB( ...) {...};

void processing(global int *a, int *b) {
    if (a == 0)
        tempA(a, b);
    else
        tempB(a, b);
}

__attribute__((num_simd_work_items(2)))
__attribute__((reqd_work_group_size(256,1,1)))
kernel void test(__global int *a)  // NDR
{
    int gid = get_global_id(0);
    int b;
    while (b == 0)
        processing(&a[gid], &b);
}
=================================
But if my program flow is as described earlier, how can I optimize this code? Each work-item stays in the while loop until the condition is met. Is it better to use a task instead of NDR?
Regards.
As long as b is not dependent on the work-item ID. Yes, depending on what you're trying to do, it seems like a single task would be better. If your problem data set cannot be divided into independent sections and depends on other work items, then a single work-item kernel might be a good choice.
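For what it's worth, here is a rough sketch of what the single work-item version could look like. The bodies of tempA/tempB are elided in the thread, so the ones below are purely hypothetical stand-ins (a halve-or-3x+1 update, chosen only because its trip count differs per element), and the exit condition, the extra `steps` buffer, the `n` argument and the name `test_single` are assumptions as well. The small shim at the top is only there so the same text also compiles as plain C for a quick check on the host.

```c
/* Shim so the sketch also compiles as plain C for host-side checking;
   an OpenCL compiler defines __OPENCL_VERSION__ and skips it. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* Hypothetical stand-ins for the elided tempA/tempB bodies. */
void tempA(__global int *x) { *x = *x / 2; }       /* even value: halve */
void tempB(__global int *x) { *x = 3 * (*x) + 1; } /* odd value: 3x + 1 */

/* Same shape as processing() in the thread: pick a branch from the data. */
void processing(__global int *x)
{
    if ((*x & 1) == 0)
        tempA(x);
    else
        tempB(x);
}

/* Single work-item version: one work-item walks all n elements, so the
   data-dependent while loop is an ordinary loop rather than divergence
   between work-items. Assumes every a[i] >= 1 on entry. */
__kernel void test_single(__global int *a, __global int *steps, int n)
{
    for (int i = 0; i < n; i++) {
        int count = 0;
        while (a[i] != 1) {   /* hypothetical exit condition */
            processing(&a[i]);
            count++;
        }
        steps[i] = count;     /* iterations this element needed */
    }
}
```

On the host side this would be enqueued with a global size of 1, so there is no get_global_id() call and no work-item ID for the loop condition to depend on. Whether it ends up faster than the NDRange version depends on the real tempA/tempB, so treat it as a starting point only.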
