Many recent works prefer FPGAs over GPUs for implementing irregular parallel applications such as sparse matrix multiplication and sparse convolution.
Is there a popular FPGA library for sparse arithmetic? Are there any well-known benchmarks?
Is it feasible to implement such irregular applications using OpenCL? In particular, is a pipelined sparse arithmetic architecture possible, given that nested loops in OpenCL do not seem to be pipelined when the inner loop has a variable trip count, and that compute logic depending on values loaded from memory does not seem to execute in parallel?
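
To make the concern concrete, here is a minimal sketch of the kind of loop nest I mean, written in plain C rather than OpenCL. It assumes the matrix is stored in CSR format; the function name `spmv_csr` and its parameters are only illustrative and are not taken from any library. The inner loop's trip count depends on `row_ptr`, and the read of `x` depends on `col_idx`, which are the two properties I am unsure an OpenCL-to-FPGA compiler can pipeline well.

```c
#include <stddef.h>

/* Illustrative sketch: y = A * x, with A stored in CSR format.
 * row_ptr has n_rows + 1 entries; col_idx and vals have one entry per nonzero. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *vals, const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        double acc = 0.0;
        /* Trip count is row_ptr[i + 1] - row_ptr[i], known only at run time. */
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* Indirect read: the address into x depends on the loaded col_idx[k]. */
            acc += vals[k] * x[col_idx[k]];
        }
        y[i] = acc;
    }
}
```

For example, the 3x3 matrix with rows (1, 0, 2), (0, 0, 0), (0, 3, 4) is stored as `row_ptr = {0, 2, 2, 4}`, `col_idx = {0, 2, 1, 2}`, `vals = {1, 2, 3, 4}`, so the inner loop runs 2, 0, and 2 times for the three rows.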