Hi everyone,
I have a question concerning [[intel::max_concurrency(n)]] and cannot find an answer in the reference guide.
Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler? Based on the optimization report, the resources are estimated or determined at compile-time, of course. But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?
Many thanks in advance!
Xin
Hi Xin,
This seems to be related to oneAPI, so I am transferring it to the right expert. You can expect a reply soon.
Hi,
Please go through section 4.6.5 of the document below.
Hi, your answer is not helpful at all, because this document mentions neither run time nor compile time with regard to how loop iterations are scheduled under [[intel::max_concurrency(n)]].
Hi,
If you don't mind, can you please tell me exactly what type of issue you are facing and what information you need?
Hi, the information I need (but did not find in the relevant oneAPI manuals) is clearly stated in the questions in the first post. To repeat the two questions:
- Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?
- But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?
Hi,
1) Is the loop with this directive scheduled at run-time on FPGAs or compile-time by the compiler?
Answer: The max_concurrency attribute is applied when pipelining a loop in a single-task kernel, so my guess is that it is programmed to execute in the hardware at run time to provide an estimate of the area used.
2) But it's not very clear, whether the iterations in the loop are dynamically scheduled at run-time, or scheduled already at compile-time, or other possibilities?
Answer: As mentioned above.
More details explaining the functionality of the attributes can be found in the link below:
Hope that clarifies.
Thanks,
Hi, thanks for your answers.
> my guess is that it is programmed to execute in the hardware at run time to provide an estimate of the area used.
Do you have additional references or manuals (besides the link you gave) for this? "My guess ..." does not sound solid.
> More details explaining the functionality of the attributes can be found in the link below:
I have read the link given in your answer. However, it does not say whether a loop with max_concurrency is scheduled at run time or at compile time.
Hey @xwuupb
Changing the concurrency of a loop changes the hardware that gets generated, so this is a compile-time change.
Reducing the concurrency of a loop reduces the resource usage for that loop (because the hardware needs to handle fewer concurrent iterations).
Now, it is called "max_concurrency" and not "concurrency" for a reason.
Your loop may not have a new iteration ready to execute on every cycle.
This can happen for various reasons, such as a blocking pipe read or any upstream component that stalls your loop.
So there is also a run-time aspect to how loop iterations are scheduled.
The reason you can't find the exact answer you are looking for is that it would not make sense to artificially limit the concurrency of a loop at run time.
If you have hardware that allows you to interleave as many iterations as possible, there is no reason to limit it at run time.
So the entire point of this attribute is to tell the compiler that it can limit the maximum concurrency, because you know that the consumer of the result computed in this loop does not need that result so quickly.
Therefore, you can save hardware by limiting the max concurrency (at compile time).
The page linked above by @hareesh says "The max_concurrency attribute enables you to control the on-chip memory resources required to pipeline your loop.", which implicitly tells you this is done at compile time.
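To make the compile-time versus run-time split concrete, here is a toy cycle-count model in plain C++. It is only a sketch under stated assumptions, not how the FPGA compiler or the generated hardware actually works: the function name total_cycles, the fixed per-iteration latency, the one-launch-per-cycle rule, and the stalled vector are all hypothetical. The cap is a template parameter to mirror that it is fixed when the hardware is generated, while the stall pattern is ordinary run-time data.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, for illustration only): a pipelined loop that can
// launch at most one iteration per cycle, where each iteration occupies the
// pipeline for `latency` cycles.
// MaxConcurrency is a template parameter to mirror that the cap is baked in
// at compile time. `stalled[c] == true` means no new iteration can launch in
// cycle c (for example, a blocking pipe read has no data); this is run-time data.
template <int MaxConcurrency>
long total_cycles(int iterations, int latency, const std::vector<bool>& stalled) {
    static_assert(MaxConcurrency > 0, "the cap must be positive");
    if (iterations <= 0) return 0;

    std::vector<long> start(static_cast<std::size_t>(iterations));
    long cycle = 0;
    for (int i = 0; i < iterations; ++i) {
        // Compile-time limit: iteration i needs a free slot, so it waits until
        // iteration i - MaxConcurrency has left the pipeline.
        if (i >= MaxConcurrency) {
            cycle = std::max(cycle, start[static_cast<std::size_t>(i - MaxConcurrency)] + latency);
        }
        // Run-time effect: skip cycles in which the upstream side is stalled.
        while (static_cast<std::size_t>(cycle) < stalled.size() &&
               stalled[static_cast<std::size_t>(cycle)]) {
            ++cycle;
        }
        start[static_cast<std::size_t>(i)] = cycle;
        ++cycle;  // at most one launch per cycle
    }
    // The loop is done when the last launched iteration finishes.
    return start[static_cast<std::size_t>(iterations - 1)] + latency;
}
```

In this model, 8 iterations with a latency of 4 cycles and no stalls take 11 cycles with a cap of 4, 17 cycles with a cap of 2, and 32 cycles with a cap of 1. With a cap of 4 and stalls in cycles 2 and 3, the same loop takes 13 cycles: the cap never changed, but the run-time stalls still shifted when iterations started.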
Thanks for your detailed explanation. You solved my question.
Hi,
I think you have your solution. If you don't have any further queries about this issue, I'll close this case. Please confirm.
Thanks,
Hi,
Are you still facing the problem?
If any answer from Intel Support was helpful, please feel free to give a rating of 9/10 in the survey.
