Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

max_concurrency: scheduled at run-time or compile-time?

xwuupb
Novice
509 Views

Hi everyone,

 

I have a question concerning [[intel::max_concurrency(n)]] and cannot find an answer in the reference guide.

Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler? Based on the optimization report, the resources are estimated or determined at compile-time, of course. But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?

 

Many thanks in advance!

 

Xin


13 Replies
SyafieqS
Moderator
460 Views

Hi Xin,


This seems to be related to oneAPI. I am transferring it to the right owner, and you can expect a reply soon.


hareesh
Employee
440 Views
xwuupb
Novice
429 Views

Hi, your answer is not helpful at all, because this document mentions neither run-time nor compile-time scheduling of loop iterations for [[intel::max_concurrency(n)]].

hareesh
Employee
405 Views

Hi,

If you don't mind, can you please tell me exactly what type of issue you are facing and what information you need?


xwuupb
Novice
391 Views

Hi, the information I need (but could not find in the relevant oneAPI manuals) is clearly stated in the questions in the first post. To repeat the two questions:

 

  1. Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?
  2. But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?
hareesh
Employee
319 Views

Hi,

1) Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?

Answer: The max_concurrency attribute is implemented to pipeline the loop for a single-task kernel, hence my guess is that it would be programmed to execute in the hardware at run time to provide an estimation of the area used.

2) But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?

Answer: as mentioned above.


More details explaining the functionality of the attributes can be found in the link below:

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top/fla...

Hope that clarifies.


Thanks,


xwuupb
Novice
313 Views

Hi, thanks for your answers.

> my guess is that it would be programmed to execute in the hardware at run time to provide an estimation of the area used.

Do you have additional references or manuals (besides the link you give below) for it? "my guess ..." does not sound solid.

> More details explaining the functionality of the attributes can be found in the link below:

I have read the link given in your answer. However, it does not mention whether the loop with max_concurrency is scheduled at run-time or compile-time.

 

yuguen
Employee
301 Views

Hey @xwuupb 

 

Changing the concurrency of the loop changes the hardware that gets produced. So this is a compile time change.

Reducing the concurrency of a loop reduces the resource usage for that loop (because it needs to be able to handle fewer concurrent iterations).
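
For readers who have not seen the attribute in source form, here is a minimal sketch of where it goes. It is plain host C++ so that it compiles anywhere; in a real design the loop would sit inside a SYCL single_task kernel. The `MAX_CONCURRENCY` macro, the `sum_of_squares` function, and the array sizes are hypothetical names and values chosen for illustration, and the sketch assumes that `__INTEL_LLVM_COMPILER` identifies the Intel oneAPI compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper macro: emit the attribute only under the Intel oneAPI
// compiler (assumed to define __INTEL_LLVM_COMPILER), so that other
// compilers do not warn about an attribute they do not know.
#if defined(__INTEL_LLVM_COMPILER)
#define MAX_CONCURRENCY(n) [[intel::max_concurrency(n)]]
#else
#define MAX_CONCURRENCY(n)
#endif

constexpr std::size_t kCols = 8;  // illustrative row width

// Sums the squares of `rows * kCols` values, one row per outer iteration.
long sum_of_squares(const int* data, std::size_t rows) {
  assert(data != nullptr);
  long total = 0;
  // The limit (2) is a constant in the source, so it is known when the
  // hardware is generated and cannot be changed while the design runs.
  MAX_CONCURRENCY(2)
  for (std::size_t r = 0; r < rows; ++r) {
    // Loop-local scratch array: every iteration that is in flight needs its
    // own private copy, which is the memory cost the limit keeps small.
    int scratch[kCols];
    for (std::size_t c = 0; c < kCols; ++c) {
      const int v = data[r * kCols + c];
      scratch[c] = v * v;
    }
    for (std::size_t c = 0; c < kCols; ++c) total += scratch[c];
  }
  return total;
}
```

On a CPU the attribute has no effect on the result; it only matters when the compiler turns the loop into a hardware pipeline.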

 

Note that it is called "max_concurrency" and not "concurrency" for a reason.

Your loop may not have a new iteration ready to execute at every cycle.

This may be due to different reasons such as a blocking pipe read or any upstream component that stalls your loop.

So there is also a run-time aspect to how loop iterations are scheduled.
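
To make the two aspects concrete, here is a toy cycle-by-cycle model. It is not how the compiler or the generated hardware works internally, only a sketch under stated assumptions: every iteration takes a fixed `latency` in cycles, at most one iteration can start per cycle, and an upstream producer delivers one input every `input_interval` cycles. The names `LoopStats` and `simulate_loop` are hypothetical. The cap is a fixed parameter (the compile-time part), while the number of iterations actually in flight depends on when inputs arrive (the run-time part).

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

struct LoopStats {
  long cycles;         // cycle at which the last iteration completes
  int peak_in_flight;  // highest number of overlapping iterations observed
};

// Toy model of a pipelined loop with a fixed cap on in-flight iterations.
LoopStats simulate_loop(int iterations, int latency, int max_concurrency,
                        int input_interval) {
  assert(iterations > 0 && latency > 0 && max_concurrency > 0 &&
         input_interval > 0);
  std::deque<long> in_flight;  // completion cycle of each running iteration
  int arrived = 0;             // inputs delivered by the upstream producer
  int launched = 0;            // iterations started so far
  int peak = 0;
  long cycle = 0;
  while (launched < iterations) {
    // Retire iterations that have completed by the start of this cycle.
    while (!in_flight.empty() && in_flight.front() <= cycle) {
      in_flight.pop_front();
    }
    // Run-time effect: a new input only shows up every input_interval cycles.
    if (cycle % input_interval == 0 && arrived < iterations) ++arrived;
    const bool input_ready = launched < arrived;
    // Compile-time effect: the hardware has only max_concurrency slots.
    const bool slot_free = static_cast<int>(in_flight.size()) < max_concurrency;
    if (input_ready && slot_free) {
      in_flight.push_back(cycle + latency);
      ++launched;
      peak = std::max(peak, static_cast<int>(in_flight.size()));
    }
    ++cycle;
  }
  return {in_flight.back(), peak};
}
```

Under this model, `simulate_loop(8, 4, 4, 1)` finishes in 11 cycles with 4 iterations in flight at the peak. Lowering the cap to 2 stretches that to 17 cycles. Keeping the cap at 4 but delivering an input only every second cycle gives 18 cycles with a peak of only 2, so the observed concurrency can stay below the maximum.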

 

The reason you can't find the exact answer you are looking for is that it would not make sense to artificially limit the concurrency of a loop at run time.

If you have hardware that allows you to interleave as many iterations as possible, there is no reason to limit it at run time.

So the entire point of this attribute is to tell the compiler that it can limit the maximum concurrency, because you know that the consumer of the result being computed in this loop does not need it to be computed that fast.

Therefore, you can make hardware savings by limiting the max concurrency (compile time).
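
As a rough way to reason about how far the limit can be lowered, here is a back-of-the-envelope sketch. It assumes a simplified model in which each iteration takes `latency` cycles and the consumer needs one result every `consumer_interval` cycles; with C iterations in flight the loop can then complete at most C results per `latency` cycles. The function name is hypothetical, and the real latency of a loop should be taken from the optimization report rather than guessed.

```cpp
#include <cassert>

// Smallest number of in-flight iterations that still delivers one result
// every `consumer_interval` cycles when one iteration takes `latency`
// cycles. Simplified model: ceil(latency / consumer_interval).
int min_concurrency_for_consumer(int latency, int consumer_interval) {
  assert(latency > 0 && consumer_interval > 0);
  return (latency + consumer_interval - 1) / consumer_interval;
}
```

For example, with a 10-cycle iteration and a consumer that takes one result every 4 cycles, this model suggests that 3 concurrent iterations are enough, so any hardware for more than that would go unused.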

 

The page linked above by @hareesh says "The max_concurrency attribute enables you to control the on-chip memory resources required to pipeline your loop." which implicitly tells you this is done at compile time.

xwuupb
Novice
292 Views

Thanks for your detailed explanation. You solved my question.

hareesh
Employee
285 Views

Hi,

I think you have your solution. If you don't have any further queries about this issue, I'll close this case. Please confirm.


Thanks,


hareesh
Employee
249 Views

Hi,

Are you still facing the problem?


xwuupb
Novice
243 Views

The problem was solved. Thanks.

hareesh
Employee
242 Views

If any answer from Intel Support was helpful, please feel free to provide a rating of 9/10 in the survey.

