Re: VTune μPipe says 100% for all of Front-End Bound, Retiring, and Back-End Bound

aodaki · ‎02-25-2023

Hi,

I ran Microarchitecture Exploration with VTune Profiler to find out how a code size increase puts pressure on the front-end. However, it says that Front-End Bound, Retiring, and Back-End Bound each consume 100.0% of pipeline slots, which doesn't make sense to me.

My understanding is that "Front-End Bound" represents the ratio of pipeline slots left unused because the front-end undersupplies operations to the back-end. "Back-End Bound" represents the ratio of pipeline slots left unused because the back-end fails to process the operations the front-end supplies in time. "Retiring" is the ratio of pipeline slots actually utilized to process operations. Therefore, "Front-End Bound" + "Retiring" + "Back-End Bound" should be 100%. In reality, the number will not be exactly 100% because the profiler uses sampling, but it should still be approximately 100%. However, I'm now seeing that Front-End Bound, Retiring, and Back-End Bound are all 100%, so the sum is 300%, which contradicts this understanding.
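To make the accounting above concrete, here is a minimal Python sketch of the consistency check being described. It is only an illustration, not how VTune computes its metrics: the `TopDownSlots` class, the function names, the slot counts, and the 5-point tolerance are all hypothetical. It also includes "Bad Speculation" as a fourth category, since the top-down method splits pipeline slots four ways rather than three.

```python
from dataclasses import dataclass


@dataclass
class TopDownSlots:
    """Hypothetical container for per-category pipeline slot counts."""
    frontend_bound: float
    bad_speculation: float
    retiring: float
    backend_bound: float


def level1_percentages(slots: TopDownSlots) -> dict:
    """Convert slot counts into percentages of all pipeline slots."""
    counts = {
        "Front-End Bound": slots.frontend_bound,
        "Bad Speculation": slots.bad_speculation,
        "Retiring": slots.retiring,
        "Back-End Bound": slots.backend_bound,
    }
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total slot count must be positive")
    return {name: 100.0 * value / total for name, value in counts.items()}


def looks_consistent(percentages: dict, tolerance: float = 5.0) -> bool:
    """Return True if the categories sum to roughly 100%.

    The tolerance is an arbitrary assumption that allows for sampling error.
    """
    return abs(sum(percentages.values()) - 100.0) <= tolerance


# Made-up slot counts: these normalize to a breakdown that sums to 100%.
example = level1_percentages(TopDownSlots(20, 10, 40, 30))
print(example, looks_consistent(example))

# The values reported in this thread sum to 300% and fail the check.
reported = {"Front-End Bound": 100.0, "Retiring": 100.0, "Back-End Bound": 100.0}
print(sum(reported.values()), looks_consistent(reported))
```

A result that fails a check like this by such a wide margin points to a collection or calculation problem rather than ordinary sampling noise.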

So now I wonder:

- Is my understanding correct?
- Does anyone have an idea how this can happen?

Please tell me if you have a question about the configuration or need the actual output.

Thanks in advance.

JaideepK_Intel · ‎02-27-2023

Hi,

Thank you for posting in Intel Communities.

Retiring + Front-End Bound + Bad Speculation + Back-End Bound may not add up to exactly 100%. This is a consequence of the sampling methodology VTune uses: sampling cannot provide 100% accurate data, and implementation complexities can cause underestimates or overestimates. Using the multiple runs option could help obtain more accurate data. In general, the relative proportions of the pipeline categories still serve as meaningful data for classifying performance bottlenecks.

If this resolves your issue, make sure to accept this as a solution. This would help others with similar issues. Thank you!

Regards,

Jaideep

aodaki · ‎02-27-2023

Hi,

I understand it is not perfectly accurate and has inherent errors. However, it still looks like something went seriously wrong as all of the values are 100% and the sum is 300%.

The multiple runs option is not available to me because I use driverless perf, but I can still extend the runtime by changing the command line arguments. With my current command line, the workload takes 300 seconds, and it still says 100% for all of Retiring, Front-End Bound, and Back-End Bound.

JaideepK_Intel · ‎03-05-2023

Hi,

Good day to you.

To understand your issue better, could you please provide the details below?

1. Intel VTune version, OS, and processor details.
2. Could you please share a sample reproducer (an exact replica of your application) so that we can reproduce the issue on our end?
3. Please share the result directory.
4. Could you please run the same analysis using the matrix sample (path: /opt/intel/oneapi/vtune/latest/samples/en/C++/matrix/) and let us know whether you face the same issue?

Thanks,

Jaideep

aodaki · ‎03-06-2023

1. Intel VTune Profiler 2023.0.0. Fedora 37. Intel(R) Xeon(R) W-1370P.

2. I'm running: https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html I don't think I can share the binary due to the license.

3. I attached the result file (.vtune). I couldn't attach the entire result directory because it is over 100 MB even after being compressed as a ZIP archive, and the maximum allowed size is 71 MB. I can still share it through a different channel if you need it.

4. It works fine for the matrix sample.

JaideepK_Intel · ‎03-06-2023

Hi,

The attached ZIP file is only 1 KB in size, so it may have been corrupted. Could you please share the file again? You can share it through a different channel.

Thanks,

Jaideep

aodaki · ‎03-08-2023

Here is a link to an archive of the entire result directory:
https://univtokyo-my.sharepoint.com/:u:/g/personal/6391442621_utac_u-tokyo_ac_jp/ERjzAsxUnkpHjF4PZhFnimMB5mTpNCLbj4J4yRNG-vX6Nw?e=qGfmWF

JaideepK_Intel · ‎03-16-2023

Hi,

Good day to you.

We can see you are using Fedora 37, which is not in the list of operating systems supported by VTune. This may be causing the issue. Can you try a supported OS and let us know if the issue persists?

Link: Intel® VTune™ Profiler System Requirements

Thanks,

Jaideep

aodaki · ‎03-16-2023

Hi,

I tried on Fedora 36 but the issue still remains.

JaideepK_Intel · ‎03-20-2023

Hi,

Good day to you.

To understand your issue better, can you confirm that the other analysis types, apart from uarch-exploration, are working fine?

Thanks,

Jaideep

aodaki · ‎03-20-2023

Hi,

I haven't seen a problem with other analysis types so far.

JaideepK_Intel · ‎03-29-2023

Hi,

I hope you are doing well.

Sorry for the delay. We are still working on your issue internally and will get back to you with an update.

Thanks,

Jaideep

JaideepK_Intel · ‎08-27-2023

Hi,

Sorry for the delay. Could you please check the same with the latest version of VTune (2023.2.0) and let us know if the issue still persists?

If you still face the same issue on the latest version of VTune, please share the result directory (via a shared drive) so we can debug further.

Thanks,

Jaideep

aodaki · ‎09-16-2023

It is reproducible with the latest Intel oneAPI Base Toolkit (2023.2.0). I attached the archive of the result directory.
