I've been working with an implementation of the Needleman-Wunsch algorithm for global sequence alignment. It is very similar to Smith-Waterman in that it has a very natural representation as a systolic array. I'm trying to write the kernel so that the compiler recognizes this structure; however, I'm running into a strange issue where the inner loop appears to decrement instead of increment. I've tested the same code on a few GPUs and CPUs, and the results are correct on all of them.
Here is the kernel, with the problem/block size hard-coded to 16x16 for debugging purposes:
#define BLOCK_SIZE 16
__kernel
__attribute__((task))
void nw_kernel1(
__global int * restrict input_itemsets_d,
__global int * restrict output_itemsets_d,
__global int * restrict seq_1,
__global int * restrict seq_2,
int penalty
)
{
__private int Sd_private;
__private int Sh_private;
__private int Sv_private;
//prime first row with values
Sh_private = 0;
//INIT ROW VERTICAL INPUTS
# pragma unroll
for(int i = 0; i < BLOCK_SIZE; i++){
Sv_private = SCORE_GLOBAL(0, (i + 1));
}
Sd_private = 0;
//for each row
for( int row = 0 ; row < BLOCK_SIZE; row++){
printf("ROW = %d\n", row);
//INITIALIZE INPUT
// prime Sh location for row
Sh_private = SCORE_GLOBAL((row + 1), 0);
int score_y = seq_2;
//for each column in row
# pragma unroll BLOCK_SIZE
for(int col = 0; col < BLOCK_SIZE ; col++){
printf("COL = %d\n", col);
int score_x = seq_1;
int ref = reference_l;
int Sd = Sd_private;
int Sh = Sh_private;
int Sv = Sv_private;
//COMPUTE
// Assign score based on other values
int tmp = maximum((Sd + ref),
Sh - penalty,
Sv - penalty);
//store to global memory
SCORE_GLOBAL_O((row + 1), (col + 1)) = tmp;
if(col != 15) {
//SHIFT
// Store to private arrays for next iteration
// 1) store Sd values for next row
Sd_private = tmp;
// 2) shift register to pass Sv values to Sd of next column
Sd_private = Sv_private;
// pass Sh values to next column
Sh_private = tmp;
//pass Sv values to current column, next row
Sv_private = tmp;
}
}
//SHIFT
// for starting column, last Sh value can be new Sd value
Sd_private = Sh_private;
//save all tmp Sd values for next loop iteration
# pragma unroll
for(int i = 1; i < BLOCK_SIZE; i++){
Sd_private = Sd_private;
}
}
return;
}
What gets printed out (which also explains the incorrect output) is the following:
ROW = 0
COL = 0
COL = 15
COL = 14
etc...
Is there anything obvious I'm doing wrong? Or is this most likely a compiler bug? PM me if you would like access to the source code.
-Jack
Are you sure that console output goes with that kernel? For example, you have this, which shouldn't have compiled: printf("****COL = %d\n");
Yeah, sorry, that was just a sloppy copy-paste error. I've adjusted the kernel code above to reflect the actual code that prints the column order as 0, 15, 14, etc...
There are two issues here:
1) A functional error in the results, which was caused by a bug in the compiler; it will be fixed in the next release.
2) Printf calls printing out of order. Currently, this is expected: there are no ordering guarantees between different printf calls. In this case, because the loop is unrolled, the loop body contains multiple printfs (one for each iteration), and these can print in a different order than iteration order.
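Since printf order cannot be relied on here, one way to judge correctness is to compare the kernel's output buffer against a plain sequential version of the same recurrence on the host. The following is only a minimal sketch in C, not the original poster's code: the names nw_reference, max3, and first_mismatch are hypothetical, and it assumes a row-major (n+1) x (n+1) score matrix whose first row and column are already filled in, with a match-score table in the same layout.

```c
/* Hypothetical host-side reference; all names here are illustrative. */

/* Largest of three ints, playing the role of the kernel's maximum(). */
static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/*
 * Fill an (n+1) x (n+1) row-major score matrix in place.
 * Assumption: row 0 and column 0 already hold the boundary scores,
 * and ref[r * (n+1) + c] is the match score for cell (r, c).
 */
static void nw_reference(int *score, const int *ref, int n, int penalty)
{
    int w = n + 1;
    for (int r = 1; r <= n; r++) {
        for (int c = 1; c <= n; c++) {
            int sd = score[(r - 1) * w + (c - 1)]; /* diagonal neighbour */
            int sh = score[r * w + (c - 1)];       /* left neighbour */
            int sv = score[(r - 1) * w + c];       /* upper neighbour */
            score[r * w + c] = max3(sd + ref[r * w + c],
                                    sh - penalty,
                                    sv - penalty);
        }
    }
}

/* Index of the first differing element, or -1 if the buffers agree. */
static int first_mismatch(const int *expected, const int *actual, int count)
{
    for (int i = 0; i < count; i++) {
        if (expected[i] != actual[i]) {
            return i;
        }
    }
    return -1;
}
```

Running nw_reference on a copy of the input and then calling first_mismatch against the buffer read back from the device gives the first cell where the two disagree, which does not depend on the order in which any printf output appears.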
Outku,
Thanks for the reply. This all makes sense based on the output I'm seeing.
- Jack