cl-fast-relaxed-math and profiling tools


Hi,

I have two questions.

First:

The OpenCL standard provides the -cl-fast-relaxed-math build option, which can speed up a kernel at the possible cost of accuracy.

I tested my OpenCL code with this flag on Intel, NVIDIA, and AMD platforms.

It gave a speedup of about 1x.

However, when I add -cl-fast-relaxed-math while compiling the OpenCL kernel code with the AOCL compiler, I do not see any performance gain. Does AOCL not support this flag yet?
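To make the accuracy trade-off concrete, here is a minimal sketch in plain C (not OpenCL kernel code) of one kind of transformation a relaxed-math option may permit: regrouping a floating-point sum. The function names are hypothetical, and whether any given compiler actually performs this regrouping under -cl-fast-relaxed-math is an assumption, not something verified here.

```c
#include <assert.h>

/* Evaluate (a + b) + c, rounding each partial sum to float.
 * The volatile stores keep intermediates at float precision
 * even on targets that would otherwise use wider registers. */
float sum_left_to_right(float a, float b, float c)
{
    volatile float partial = a + b;
    volatile float total = partial + c;
    return total;
}

/* Evaluate a + (b + c): the same terms, regrouped.
 * Strict IEEE 754 rules do not allow a compiler to swap one
 * grouping for the other; relaxed math may (assumption). */
float sum_regrouped(float a, float b, float c)
{
    volatile float partial = b + c;
    volatile float total = a + partial;
    return total;
}

/* Absolute difference between the two groupings
 * (hypothetical helper for illustration). */
float regrouping_error(float a, float b, float c)
{
    float d = sum_left_to_right(a, b, c) - sum_regrouped(a, b, c);
    return d < 0.0f ? -d : d;
}
```

With a = 1e8f, b = -1e8f, c = 1.0f, the first grouping returns 1.0 and the second returns 0.0, because 1.0 is smaller than the spacing between adjacent float values near 1e8.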

Second:

I wrote an OpenCL program that may call the EnqueueNDRange API many times (a for loop enqueues the kernel repeatedly). The host only makes these API calls and reads/writes buffers. However, between the host issuing EnqueueNDRange or a buffer read/write and the FPGA receiving the signal and starting the kernel, there is 10~100 ms of overhead. There is no profiling tool that shows in detail where this time goes. Could anyone help with this problem?
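One way to narrow this down without an external tool might be the standard OpenCL event-profiling timestamps: create the queue with CL_QUEUE_PROFILING_ENABLE, pass an event to each enqueue call, and read CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END with clGetEventProfilingInfo (each is a nanosecond counter). The sketch below only does the arithmetic on those four values; the struct and function names are hypothetical, and whether the runtime in SDK 14.1 reports all four timestamps meaningfully is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Four timestamps for one enqueued command, in nanoseconds,
 * as they would be read via clGetEventProfilingInfo. */
typedef struct {
    uint64_t queued;    /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submitted; /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t started;   /* CL_PROFILING_COMMAND_START  */
    uint64_t ended;     /* CL_PROFILING_COMMAND_END    */
} launch_times;

/* Where the time of one launch went, in milliseconds. */
typedef struct {
    double host_queue_ms;  /* queued  -> submitted: waiting in the host queue */
    double launch_wait_ms; /* submitted -> started: handoff to the device     */
    double exec_ms;        /* started -> ended: command actually executing    */
} launch_breakdown;

static double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1.0e6;
}

/* Split one command's lifetime into its three phases. */
launch_breakdown split_launch_time(launch_times t)
{
    launch_breakdown b;

    /* The timestamps are expected to be monotonically ordered. */
    assert(t.queued <= t.submitted);
    assert(t.submitted <= t.started);
    assert(t.started <= t.ended);

    b.host_queue_ms  = ns_to_ms(t.submitted - t.queued);
    b.launch_wait_ms = ns_to_ms(t.started - t.submitted);
    b.exec_ms        = ns_to_ms(t.ended - t.started);
    return b;
}
```

If launch_wait_ms dominates for every iteration of the loop, the overhead sits between submission and kernel start rather than in the kernel itself; if host_queue_ms dominates, the commands are waiting on the host side before being submitted.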

SDK: 14.1

Platform: DE5

Thanks
