Intel® Quartus® Prime Software

Loop induction variable decrementing instead of incrementing?

Altera_Forum

I've been working on an implementation of the Needleman-Wunsch algorithm for global sequence alignment. It is very similar to Smith-Waterman in that it maps naturally onto a systolic array. I'm trying to write the kernel so that the compiler recognizes this structure. However, I'm running into a strange issue where the inner loop decrements instead of incrementing. I've tested my code on a few GPUs and CPUs, and the results are correct on all of them.
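For reference, the recurrence in question can be sketched on the host in plain C. This is a minimal sketch, not the kernel itself: the names `nw_fill_reference` and `max3`, the flat row-major layout, and the `ref` table of match/mismatch scores are assumptions made for illustration. Each cell takes the maximum of the diagonal neighbor plus a reference score, the left neighbor minus the penalty, and the upper neighbor minus the penalty.

```c
#include <assert.h>
#include <stddef.h>

/* Largest of three ints; stands in for the kernel's maximum() helper. */
static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/*
 * Fill an (n+1) x (n+1) row-major score matrix in place.
 * Row 0 and column 0 must already hold the boundary scores.
 * ref is an n x n row-major table of match/mismatch scores
 * (assumed layout: ref[(r-1)*n + (c-1)] pairs row r with column c).
 */
void nw_fill_reference(int *score, const int *ref, int n, int penalty) {
    assert(score != NULL && ref != NULL && n > 0);
    int w = n + 1; /* row stride of the score matrix */
    for (int r = 1; r <= n; r++) {
        for (int c = 1; c <= n; c++) {
            int sd = score[(r - 1) * w + (c - 1)]; /* diagonal neighbor */
            int sh = score[r * w + (c - 1)];       /* left neighbor */
            int sv = score[(r - 1) * w + c];       /* upper neighbor */
            score[r * w + c] = max3(sd + ref[(r - 1) * n + (c - 1)],
                                    sh - penalty, sv - penalty);
        }
    }
}
```

With n = 16, a host version like this gives a known-good matrix to compare against the 16x16 block the kernel writes.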

 

Here is the kernel where I've hard-coded the problem/block size to be 16x16 for debugging purposes: 

 

#define BLOCK_SIZE 16

__kernel __attribute__((task))
void nw_kernel1(
    __global int * restrict input_itemsets_d,
    __global int * restrict output_itemsets_d,
    __global int * restrict seq_1,
    __global int * restrict seq_2,
    int penalty )
{
    __private int Sd_private;
    __private int Sh_private;
    __private int Sv_private;

    //prime first row with values
    Sh_private = 0;

    //INIT ROW VERTICAL INPUTS
    #pragma unroll
    for(int i = 0; i < BLOCK_SIZE; i++){
        Sv_private = SCORE_GLOBAL(0, (i + 1));
    }
    Sd_private = 0;

    //for each row
    for( int row = 0 ; row < BLOCK_SIZE; row++){
        printf("ROW = %d\n", row);

        //INITIALIZE INPUT
        // prime Sh location for row
        Sh_private = SCORE_GLOBAL((row + 1), 0);
        int score_y = seq_2;

        //for each column in row
        #pragma unroll BLOCK_SIZE
        for(int col = 0; col < BLOCK_SIZE ; col++){
            printf("COL = %d\n", col);
            int score_x = seq_1;
            int ref = reference_l;
            int Sd = Sd_private;
            int Sh = Sh_private;
            int Sv = Sv_private;

            //COMPUTE
            // Assign score based on other values
            int tmp = maximum((Sd + ref), Sh - penalty, Sv - penalty);

            //store to global memory
            SCORE_GLOBAL_O((row + 1), (col + 1)) = tmp;

            if(col != 15) {
                //SHIFT
                // Store to private arrays for next iteration
                // 1) store Sd values for next row
                Sd_private = tmp;
                // 2) shift register to pass Sv values to Sd of next column
                Sd_private = Sv_private;
                // pass Sh values to next column
                Sh_private = tmp;
                //pass Sv values to current column, next row
                Sv_private = tmp;
            }
        }

        //SHIFT
        // for starting column, last Sh value can be new Sd value
        Sd_private = Sh_private;

        //save all tmp Sd values for next loop iteration
        #pragma unroll
        for(int i = 1; i < BLOCK_SIZE; i++){
            Sd_private = Sd_private;
        }
    }
    return;
}

 

What gets printed out (which also explains the incorrect output) is the following:

ROW = 0
COL = 0
COL = 15
COL = 14
etc...

Is there anything obvious I'm doing wrong? Or is this most likely a compiler bug? 

 

PM me if you would like access to the source code. 

 

-Jack
4 Replies
Altera_Forum

Are you sure that console output goes with that kernel? For example, you have this line, which shouldn't have compiled: printf("****COL = %d\n");

Altera_Forum

Yeah, sorry, that was just a sloppy copy-paste error. I've adjusted the kernel code above to reflect the actual code that prints the column order as 0, 15, 14, etc.

Altera_Forum

There are two issues here:

1) A functional error in the results, which was caused by a bug in the compiler; it will be fixed in the next release.

2) printf output appearing out of order. Currently, this is expected: there are no ordering guarantees between different printf calls. In this case, because the loop is unrolled, the loop body contains multiple printf calls (one for each iteration), and these can print in a different order from the iteration order.
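Since printf order cannot be relied on here, one more dependable check is to read the output buffer back to the host and compare it with a host-computed reference. The following is a minimal C sketch, assuming both buffers are available on the host as flat int arrays; `first_mismatch` is a hypothetical helper name, not part of any OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first element where the two buffers differ,
 * or -1 if all count elements match.
 */
long first_mismatch(const int *expected, const int *actual, size_t count) {
    assert(expected != NULL && actual != NULL);
    for (size_t i = 0; i < count; i++) {
        if (expected[i] != actual[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

For a 16x16 block with a one-cell border, count would be 17 * 17, and a returned index i corresponds to row i / 17 and column i % 17.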

Altera_Forum

Outku, 

 

Thanks for the reply. This all makes sense based on the output I'm seeing. 

 

- Jack