Solved: max_concurrency: scheduled at run-time or compile-time?

xwuupb · ‎11-11-2022

Hi everyone,

I have a question concerning [[intel::max_concurrency(n)]] and cannot find an answer in the reference guide.

Is the loop with this directive scheduled at run-time on FPGAs or at compile-time by the compiler? Based on the optimization report, the resources are estimated or determined at compile-time, of course. But it's not very clear whether the iterations in the loop are dynamically scheduled at run-time, scheduled already at compile-time, or handled in some other way.

Many thanks in advance!

Xin

yuguen · ‎11-30-2022

Hey @xwuupb

Changing the concurrency of the loop changes the hardware that gets produced. So this is a compile-time change.

Reducing the concurrency of a loop reduces the resource usage for that loop (because it needs to be able to handle fewer concurrent iterations).

Now, it is called "max_concurrency" and not "concurrency" for a reason.

Your loop may not have a new iteration ready to execute on every cycle.

This can happen for different reasons, such as a blocking pipe read or any upstream component that stalls your loop.

So there is also a run-time aspect to how loop iterations are scheduled.
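To make the compile-time versus run-time split concrete, here is a toy cycle-level model in plain C++. It is only a sketch and not how the compiler or the generated hardware actually schedules anything: it assumes a new iteration can be launched every cycle, a fixed latency per iteration, and an upstream stall pattern that simply repeats. The names `simulate`, `PipelineStats` and `stall_pattern` are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Result of the toy model (hypothetical type, for illustration only).
struct PipelineStats {
  std::uint64_t last_finish_cycle;  // cycle at which the last iteration completes
  std::size_t peak_in_flight;       // highest number of overlapping iterations seen
};

// Toy model of a pipelined loop.
// max_concurrency stands in for the fixed, compile-time hardware limit.
// stall_pattern stands in for run-time behaviour: entry [cycle % size] == true
// means upstream has no data that cycle, so no new iteration can be launched.
// Preconditions: latency >= 1, max_concurrency >= 1, and a non-empty
// stall_pattern must contain at least one `false`, otherwise the loop never ends.
PipelineStats simulate(std::size_t iterations, std::uint64_t latency,
                       std::size_t max_concurrency,
                       const std::vector<bool>& stall_pattern = {}) {
  std::deque<std::uint64_t> in_flight;  // finish cycles of running iterations
  PipelineStats stats{0, 0};
  std::size_t launched = 0;

  for (std::uint64_t cycle = 0; launched < iterations; ++cycle) {
    // Retire every iteration that has completed by this cycle.
    while (!in_flight.empty() && in_flight.front() <= cycle) {
      in_flight.pop_front();
    }

    const bool stalled = !stall_pattern.empty() &&
                         stall_pattern[cycle % stall_pattern.size()];

    // Launch only if data is available AND the hardware limit allows it.
    if (!stalled && in_flight.size() < max_concurrency) {
      in_flight.push_back(cycle + latency);
      ++launched;
      stats.last_finish_cycle = cycle + latency;
    }

    stats.peak_in_flight = std::max(stats.peak_in_flight, in_flight.size());
  }
  return stats;
}
```

In this model, 8 iterations with a latency of 4 finish at cycle 11 with a cap of 4 and at cycle 17 with a cap of 2. With a cap of 4 and a stall on every other cycle (`simulate(8, 4, 4, {false, true})`), the loop finishes at cycle 18 and never has more than 2 iterations in flight, even though the "hardware" would allow 4.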

The reason you can't find the exact answer you are looking for is that it would not make sense to artificially limit the concurrency of a loop at run-time.

If you have the hardware that allows you to interleave as many iterations as possible, there is no reason to limit it at run time.

So the entire point of this attribute is to tell the compiler that it can limit the maximum concurrency, because you know that the consumer of the result computed in this loop does not need that result so fast.

Therefore, you can make hardware savings by limiting the max concurrency (compile time).

The page linked by @hareesh in this thread says "The max_concurrency attribute enables you to control the on-chip memory resources required to pipeline your loop.", which implicitly tells you this is done at compile-time.
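As a rough illustration of where the attribute goes, the sketch below puts it on a loop that has a private scratch array, which is the kind of on-chip memory the quoted sentence is about. Several things here are assumptions: `FPGA_BUILD` is a hypothetical macro you would define yourself when building with the oneAPI FPGA flow, so that the same file still compiles with an ordinary host compiler; `LOOP_MAX_CONCURRENCY`, `sum_doubled`, `kRows` and `kCols` are made-up names; and in a real design the loop would sit inside a SYCL single_task kernel rather than a plain function.

```cpp
#include <cstddef>

// FPGA_BUILD is a hypothetical, user-defined switch (an assumption of this
// sketch). Without it the macro expands to nothing and the loop is a plain loop.
#ifdef FPGA_BUILD
#define LOOP_MAX_CONCURRENCY(n) [[intel::max_concurrency(n)]]
#else
#define LOOP_MAX_CONCURRENCY(n)
#endif

constexpr std::size_t kRows = 4;
constexpr std::size_t kCols = 8;

// Doubles every element of each row into a scratch buffer, then sums it.
int sum_doubled(const int (&in)[kRows][kCols]) {
  int total = 0;

  // The limit is a compile-time constant: it shapes the hardware that gets
  // built, it is not a value that can be changed while the kernel runs.
  LOOP_MAX_CONCURRENCY(1)
  for (std::size_t r = 0; r < kRows; ++r) {
    // Private to each outer iteration. Overlapping outer iterations would
    // each need their own copy, which is where the memory cost comes from.
    int scratch[kCols];
    for (std::size_t c = 0; c < kCols; ++c) {
      scratch[c] = in[r][c] * 2;
    }
    for (std::size_t c = 0; c < kCols; ++c) {
      total += scratch[c];
    }
  }
  return total;
}
```

The result is the same whatever value is passed to the attribute. Only the amount of hardware, and possibly the throughput, would be expected to change.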


SyafieqS · ‎11-14-2022

Hi Xin,

This seems to be related to oneAPI. I am transferring it to the right owner, and you can expect a reply soon.

hareesh · ‎11-17-2022

Hi,

Please go through section 4.6.5 of the document below.

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top.html?wapkw=max_concurrency(n)

xwuupb · ‎11-17-2022

Hi, your answer is not helpful at all, because neither run-time nor compile-time is mentioned in this document regarding the scheduling of loop iterations using [[intel::max_concurrency(n)]].

hareesh · ‎11-21-2022

Hi,

If you don't mind, can you please tell me exactly what type of issue you are facing and what information you need?

xwuupb · ‎11-21-2022

Hi, the information I need (but have not found in the relevant oneAPI manuals) is clearly stated in the questions in the first post. To repeat the two questions:

Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?
But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?

hareesh · ‎11-29-2022

Hi,

1) Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?

Answer: The max_concurrency attribute is implemented to pipeline the loop in a single-task kernel, hence my guess is that it would be programmed to execute in the hardware at run-time to provide an estimation of the area used.

2) But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?

Answer: As mentioned above.

More details explaining the functionality of the attributes can be found in the link below:

- https://www.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top/flags-attr-prag-ext/loop-directives/max-concurrency-attribute.html

Hope that clarifies.

Thanks,

xwuupb · ‎11-29-2022

Hi, thanks for your answers.

> my guess is that it would be programmed to execute in the hardware at run-time to provide an estimation of the area used.

Do you have additional references or manuals (besides the link you give below) for it? "my guess ..." does not sound solid.

> More details explaining the functionality of the attributes can be found in the link below:

I have read the link given in your answer. However, it does not mention whether the loop with max_concurrency is scheduled at run-time or compile-time.
