
VTune μPipe says 100% for all of Front-End Bound, Retiring, and Back-End Bound

aodaki
Beginner
2,605 Views

Hi,

 

I ran Microarchitecture Exploration with VTune Profiler to find out how an increase in code size puts pressure on the front-end. However, it reports that Front-End Bound, Retiring, and Back-End Bound each consume 100.0% of pipeline slots, which doesn't make sense to me.

 

My understanding is that "Front-End Bound" represents the ratio of pipeline slots left unused because the front-end undersupplies operations to the back-end. "Back-End Bound" represents the ratio of pipeline slots left unused because the back-end fails to process the operations the front-end supplies in time. "Retiring" is the ratio of pipeline slots actually used to process operations. Therefore, "Front-End Bound" + "Retiring" + "Back-End Bound" should be 100%. In reality, the number will not be exactly 100% because the profiler uses sampling, but it should still be approximately 100%. However, I'm now seeing that Front-End Bound, Retiring, and Back-End Bound are each 100%, so the sum is 300%, which contradicts this understanding.
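To make the expectation concrete, here is a minimal sketch of the sanity check described above. It is not part of VTune: the function name `check_topdown` and the 5% tolerance are assumptions chosen for illustration, and Bad Speculation (the fourth top-level category mentioned in the reply below) is included with a default of 0 because its value was not reported here.

```python
# Hypothetical helper for illustration only; not a VTune API.
def check_topdown(frontend_bound, bad_speculation, backend_bound, retiring,
                  tolerance=0.05):
    """Sum the top-level pipeline-slot fractions (each given as 0.0-1.0).

    Returns (total, ok), where ok is True when the total is within
    `tolerance` of 1.0. The tolerance is an assumed allowance for
    sampling error, not a documented threshold.
    """
    parts = {
        "Front-End Bound": frontend_bound,
        "Bad Speculation": bad_speculation,
        "Back-End Bound": backend_bound,
        "Retiring": retiring,
    }
    total = sum(parts.values())
    ok = abs(total - 1.0) <= tolerance
    return total, ok


# A made-up but plausible breakdown: the fractions roughly partition the slots.
print(check_topdown(0.20, 0.05, 0.45, 0.30))

# The breakdown reported in this thread: three categories at 100% each.
# Bad Speculation is assumed to be 0 because it was not reported.
print(check_topdown(1.0, 0.0, 1.0, 1.0))
```

The second call yields a total far above 1.0, which is why this result looks like a collection or computation problem rather than ordinary sampling noise.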

 

So now I wonder:

- Is my understanding correct?
- Does anyone have an idea how this can happen?

 

Please tell me if you have questions about the configuration or need the actual output.

 

Thanks in advance.

JaideepK_Intel
Moderator
2,564 Views

Hi,

 

Thank you for posting in Intel Communities.

 

Retiring + Front-End Bound + Bad Speculation + Back-End Bound may not add up to exactly 100%. This is a consequence of the sampling methodology VTune uses, which cannot provide 100% accurate data. Implementation complexities can cause individual metrics to be underestimated or overestimated, and using the multiple runs option may yield more accurate data. In general, the relative proportions of the pipeline categories still serve as meaningful data for categorizing performance bottlenecks.

 

If this resolves your issue, make sure to accept this as a solution. This would help others with similar issues. Thank you!

 

Regards,

Jaideep

 

 

aodaki
Beginner
2,550 Views

Hi,

 

I understand it is not perfectly accurate and has inherent errors. However, it still looks like something went seriously wrong as all of the values are 100% and the sum is 300%.

 

The multiple runs option is not available to me because I use driverless perf, but I can still extend the runtime by changing the command line arguments. With my current command line, the workload takes 300 seconds, and it still says 100% for each of Retiring, Front-End Bound, and Back-End Bound.

JaideepK_Intel
Moderator
2,500 Views

Hi,


Good day to you.


To understand your issue better, could you please provide the following details?


  1. Intel VTune version, OS, and processor details.
  2. Could you please share a sample reproducer (an exact replica of your application) so that we can reproduce your issue on our end?
  3. Please share the result directory.
  4. Could you please run the same analysis using the matrix sample (path: /opt/intel/oneapi/vtune/latest/samples/en/C++/matrix/) and let us know whether you face the same issue?


Thanks,

Jaideep



aodaki
Beginner
2,492 Views

1. Intel VTune Profiler 2023.0.0. Fedora 37. Intel(R) Xeon(R) W-1370P.

2. I'm running 505.mcf_r (https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html). I don't think I can share the binary due to the license.

3. I attached the result file (.vtune). I couldn't attach the entire result directory because it is over 100MB even after being compressed as a ZIP, and the maximum allowed size is 71MB. I can still share it through a different channel if you need it.

4. It works fine for the matrix sample.

JaideepK_Intel
Moderator
2,484 Views

Hi,

 

The attached ZIP file is only 1 KB in size. The file may have been corrupted; could you please share it again? You can share files through a different channel.

 

Thanks,

Jaideep

 

JaideepK_Intel
Moderator
2,291 Views

Hi,

Good day to you.

We can see you are using Fedora 37, which is not in VTune's list of supported operating systems. This may be causing the issue. Could you try on a supported OS and let us know if the issue still persists?

Link: Intel® VTune™ Profiler System Requirements


 

 

Thanks,

Jaideep

 

aodaki
Beginner
2,267 Views

Hi,

I tried on Fedora 36, but the issue still remains.

JaideepK_Intel
Moderator
2,234 Views

Hi,


Good day to you.


To understand your issue better, can you confirm that other analysis types, apart from uarch-exploration, are working fine?


Thanks,

Jaideep


aodaki
Beginner
2,229 Views

Hi,

 

I haven't seen a problem with other analysis types so far.

JaideepK_Intel
Moderator
2,166 Views

Hi,


I hope you are doing well.


Sorry for the delay. We are still working on your issue internally and will get back to you with an update.


Thanks,

Jaideep




JaideepK_Intel
Moderator
1,822 Views

Hi,


Sorry for the delay. Could you please check the same with the latest version of VTune (2023.2.0) and let us know if the issue still persists?


If you still face the same issue on the latest version of VTune, please share the result directory (via drive) so we can debug further.


Thanks,

Jaideep



aodaki
Beginner
1,728 Views

It is reproducible with the latest Intel oneAPI Base Toolkit (2023.2.0). I attached the archive of the result directory.
