Intel® Quartus® Prime Software

Fully pipelined sparse matrix-vector multiplication (SMVM)

KAkyo

Dear all,

 

I am trying to implement an application as efficiently as possible as a single work-item kernel, and I found that it is essentially the same as SMVM. The application has a doubly nested for loop: the outer loop iterates rowCount times, and the inner loop iterates #ofNonzeroElementsInRow times. With this structure, however, the compiler cannot pipeline the outer loop, which I took to be an "Out-of-Order Loop Iterations" problem. The report is below:

The kernel is compiled for single work-item execution.

Loop Report:

+ Loop "Block1" (file compute_pagerank_single.cl line 34)
| NOT pipelined due to:
|   Loop exit condition unresolvable at iteration initiation.
|   Simplify loop exit condition to fix this problem.
|   See "Unable to Resolve Loop Exit Condition at Iteration Initiation" section of the Best Practices Guide for more information.
|   Not pipelining this loop will most likely lead to poor performance.
|
|-+ Loop "Block2" (file compute_pagerank_single.cl line 43)
    Pipelined well. Successive iterations are launched every cycle.
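For readers who want the loop nest spelled out, here is a minimal plain C sketch of the structure described above. It assumes the matrix is stored in CSR form; the names `spmv_csr`, `row_ptr`, `col_idx`, and `val` are assumptions for illustration and are not taken from compute_pagerank_single.cl. In an OpenCL kernel the same loops would sit inside a `__kernel` function with `__global` pointers.

```c
/* Reference model of the SMVM loop nest in plain C (not the actual kernel).
   Assumed CSR layout: row_ptr has rows + 1 entries, and row r owns the
   nonzeros k in [row_ptr[r], row_ptr[r + 1]). */
void spmv_csr(int rows, const int *row_ptr, const int *col_idx,
              const float *val, const float *x, float *y)
{
    for (int r = 0; r < rows; r++) {          /* outer loop: rowCount iterations */
        float sum = 0.0f;
        /* inner loop: the trip count differs per row and its bounds are
           read from memory, so it is not known when the outer iteration starts */
        for (int k = row_ptr[r]; k < row_ptr[r + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[r] = sum;
    }
}
```

The inner loop is a simple accumulation, while the outer loop's progress depends on how long each inner loop runs, which is the part that varies from row to row.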

I searched the forum for similar problems and found this question. The suggestion there is to make the number of iterations constant: inside each outer-loop iteration, iterate over all the elements and guard the actual work with a condition. However, this causes a huge performance loss because of the empty cycles.
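A minimal plain C sketch of that constant-trip-count workaround, assuming CSR storage and a hypothetical host-supplied bound `max_nnz_per_row` (the function names and parameters are illustrative, not from the linked thread). The helper `wasted_slots` only counts the guarded-off iterations to make the "empty cycles" cost visible.

```c
/* Workaround sketch: every row runs the inner loop max_nnz_per_row times,
   and a guard skips the slots that hold no real nonzero.
   max_nnz_per_row is a hypothetical bound chosen by the host; it must be
   at least the longest row for the result to be correct. */
void spmv_padded(int rows, int max_nnz_per_row, const int *row_ptr,
                 const int *col_idx, const float *val,
                 const float *x, float *y)
{
    for (int r = 0; r < rows; r++) {
        int begin = row_ptr[r];
        int len = row_ptr[r + 1] - begin;     /* real nonzeros in this row */
        float sum = 0.0f;
        for (int j = 0; j < max_nnz_per_row; j++) {  /* same trip count for every row */
            if (j < len)                      /* guard: only real nonzeros contribute */
                sum += val[begin + j] * x[col_idx[begin + j]];
        }
        y[r] = sum;
    }
}

/* Number of inner-loop slots that do no useful work. */
long wasted_slots(int rows, int max_nnz_per_row, const int *row_ptr)
{
    return (long)rows * max_nnz_per_row - row_ptr[rows];
}
```

With a few long rows and many short ones, `wasted_slots` grows with `rows * max_nnz_per_row`, which is where the performance loss comes from.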

 

I would have thought that, even if some applications are hard to pipeline, a well-known one like SMVM would already have a highly efficient implementation.

 

I couldn't find any pointers to this problem, or any implementation of SMVM, on the internet. My question is: is there a "most efficient" implementation of this application? Or can a loop structure with a variable number of iterations be completely pipelined with some trick?
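To make the question concrete, one shape such a trick could take is collapsing both loops into a single loop over all nonzeros, so that the only trip count is the total nonzero count. The plain C sketch below assumes a hypothetical per-nonzero `row_idx` array (COO-style, prepared on the host); the names are illustrative. It has not been compiled with the offline compiler, and the read-modify-write on `y` is a loop-carried dependency, so it is not claimed here that it pipelines at one iteration per cycle.

```c
/* Flattened sketch: one loop over all nonzeros, no inner loop.
   row_idx[k] is a hypothetical array giving the row of nonzero k. */
void spmv_flat(int rows, int nnz, const int *row_idx, const int *col_idx,
               const float *val, const float *x, float *y)
{
    for (int r = 0; r < rows; r++)            /* clear the output vector */
        y[r] = 0.0f;
    for (int k = 0; k < nnz; k++)             /* trip count known at loop entry */
        y[row_idx[k]] += val[k] * x[col_idx[k]];
}
```

The trade-off in this sketch is one extra index per nonzero in exchange for a loop whose exit condition does not depend on per-row data.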

 

Thank you in advance.

Regards,

Kaan Akyol
