I compiled the same kernel code with both v18.1 Standard and Pro. With the Standard edition I get an fmax of around 220 MHz, but with the Pro edition the fmax is only 190 MHz.
I then compared the two report.html files: the Pro report shows a loop dependency, while the Standard report shows no problems.
This did not happen with v18.0 and older versions. What has changed in v18.1?
Is there a specific example you are using for the comparison?
If it is one of the OpenCL examples provided by Intel, I can try it out on my end.
Sorry, I cannot post the whole kernel code here; it has too many lines. Here are some results that can be seen directly:
With 18.1 Standard, the following code is successfully pipelined with II=1 and no warnings:
But with 18.1 Pro, an fmax warning is shown in report.html, as can be seen here:
A dependency is found on the variable find_idle_ch_id here:
As a result, the fmax is 190 MHz for the A10 device, while for the S5 device it is 220 MHz.
As I understand it, the A10 is a more advanced device than the S5 (Stratix V) and should run at a higher frequency.
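For readers following along, here is a minimal plain-C sketch of the kind of loop-carried dependency a report can flag on a variable such as find_idle_ch_id. It is not the poster's kernel: the function name find_first_idle, the busy array, and the NUM_CH constant are all hypothetical and only illustrate the pattern of one iteration reading a value that the previous iteration may have written.

```c
#include <stdint.h>

#define NUM_CH 8 /* hypothetical channel count, not from the original kernel */

/*
 * Hypothetical helper: returns the index of the first idle channel,
 * or NUM_CH if every channel is busy.
 */
uint16_t find_first_idle(const uint8_t busy[NUM_CH])
{
    uint16_t find_idle_ch_id = NUM_CH; /* NUM_CH means "nothing found yet" */

    for (uint16_t ch = 0; ch < NUM_CH; ch++) {
        /*
         * The condition reads find_idle_ch_id, which the previous
         * iteration may have written. That read-after-write across
         * iterations is a loop-carried dependency.
         */
        if (find_idle_ch_id == NUM_CH && !busy[ch]) {
            find_idle_ch_id = ch;
        }
    }
    return find_idle_ch_id;
}
```

Whether a loop like this can be scheduled at II=1 without limiting fmax depends on how a given compiler version builds the feedback path, so the sketch shows the pattern only, not the behaviour of either edition.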
Keep in mind that the S5 was the "top of the line" FPGA not so long ago. I think the reason the Pro version exists is the Gen10 devices, which have different metal routing and I/O columns in the middle of the die. Routing across I/O columns and around congested areas is typically the main reason for extra tPD and lower frequency on A10 vs. S5.
Another odd thing is that 18.1 Pro sometimes generates unreasonably wide registers for private variables, as follows:
The variable table_p2s_prefechtor is actually 16 bits wide (unsigned short), but the compiler makes it 512 bits wide, which makes the feedback logic inefficient.
With 18.1 Standard, there is no such problem: