amplxe-cl inclusive and exclusive HW event data

Prasanna_B_
Beginner

Dear Vtune experts,

In the VTune GUI documentation, I found that the Top-down Tree window displays inclusive and exclusive performance data from the perspective of the function call stacks during execution. I would like to obtain both inclusive and exclusive HW event data from the command line. Currently, I use the following commands, which generate a report with exclusive data, but I also want inclusive data for the HW counters.

HW_COUNTERS="CPU_CLK_UNHALTED,INSTRUCTIONS_EXECUTED,VPU_INSTRUCTIONS_EXECUTED,VPU_ELEMENTS_ACTIVE,EXEC_STAGE_CYCLES,DATA_READ_OR_WRITE,DATA_READ_MISS_OR_WRITE_MISS,L2_DATA_READ_MISS_CACHE_FILL,L2_DATA_WRITE_MISS_CACHE_FILL,L2_DATA_READ_MISS_MEM_FILL,L2_DATA_WRITE_MISS_MEM_FILL,L2_VICTIM_REQ_WITH_DATA,HWP_L2MISS,L1_DATA_HIT_INFLIGHT_PF1,SNP_HITM_L2,DATA_PAGE_WALK,LONG_DATA_PAGE_WALK"

export I_MPI_FABRICS=shm
export I_MPI_PIN_DOMAIN=omp
export I_MPI_MIC=1
export OMP_NUM_THREADS=120
export I_MPI_DEBUG=5

amplxe-cl -collect-with runsa-knc -knob event-config=$HW_COUNTERS --target-duration-type=veryshort --search-dir all:rp=./ mpirun -f mpi_hosts -np 1 bash -c "ulimit -s unlimited && ./app.exe" > res-01rpn-120tpr.log

amplxe-cl -report hotspots -r r000runsa_knc -format text -group-by function -csv-delimiter comma -report-output result.csv

Thanks a lot

 

6 Replies
Peter_W_Intel
Employee

If you want to see HW event data, please use the hw-events report:

amplxe-cl -report hw-events -r r000runsa_knc -format text -group-by function -csv-delimiter comma -report-output result.csv

Prasanna_B_
Beginner

Thanks Peter.

Sorry, I forgot to include the last line of my script:

amplxe-cl -report hw-events -group-by function -r r000runsa_knc -format csv -csv-delimiter comma -report-output hw_results.csv

However, with this command, I get only exclusive HW event data for each function. I would like to generate inclusive HW event data as well. Is there a flag that I have to add, enable, or disable during data collection or report generation?

Thanks for your help.

Peter_W_Intel
Employee

If you use the "hw-events" report instead of "hotspots", I don't know which inclusive event count cannot be displayed.

Which VTune version are you using? The latest product is U16. Can you please tell me which event data cannot be displayed? Thank you.

Regards, Peter 

Prasanna_B_
Beginner

Peter,

I have U15.

In the VTune GUI documentation I saw the following:

Top-down Tree window displays hotspot functions in the call tree, performance metrics for a function only (Self value) and for a function and its children together (Total value).
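
To make the Self/Total distinction in that quote concrete, here is a minimal sketch, assuming a hypothetical call-tree listing with one "function,parent,self" row per node. The input format and the name self_to_total are made up for illustration; this is not a VTune report format, and names are assumed to contain no spaces or commas.

```shell
# self_to_total: reads "function,parent,self" rows on stdin (hypothetical
# format, root has an empty parent) and prints "function,self,total".
self_to_total() {
  awk -F, '
    # Total(f) = Self(f) + sum of Total(child) over the direct children of f.
    function total(f,    i, n, kids, t) {
      t = self[f]
      n = split(children[f], kids, " ")
      for (i = 1; i <= n; i++) t += total(kids[i])
      return t
    }
    {
      self[$1] = $3                 # exclusive (Self) count of this function
      order[NR] = $1                # remember input order for printing
      if ($2 != "") children[$2] = children[$2] " " $1
    }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s,%d,%d\n", order[i], self[order[i]], total(order[i])
    }
  '
}

# Example:
#   printf 'main,,10\ncompute,main,70\nkernel,compute,20\n' | self_to_total
# prints:
#   main,10,100
#   compute,70,90
#   kernel,20,20
```

The Total column can only be computed when parent/child relationships are known, which is why it depends on call stack information being collected.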

However, in amplxe-cl, when I generate the report, I only get Self values for the HW events.

For example: VPU_ELEMENTS_ACTIVE:Self, Hardware Event Count:CPU_CLK_UNHALTED:Self, etc. There is no "Total".

Do you think it's a version issue?

Thanks

Peter_W_Intel
Employee (accepted solution)

"However in amplxe-cl, when I generate the report, I only get Self values for HW event"

The current product, U16, has no "Total by HW event type"; see the release notes:

VTune™ Amplifier XE data collection on Intel® Xeon Phi™ coprocessor (codename: Knights Corner) currently is limited to hardware event-based sampling data collected from target units (200179057)

  • No information on function call stacks is recorded during collection. However, you may mistake partial call chains appearing in result Groups for real call stack information. These partial chains are the result of inline function information in debug symbol tables and can be ignored.
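
Since no call stacks are recorded, per-function Total values cannot be reconstructed from the report. What can be derived from the Self columns is a run-wide sum per event. The following is a minimal sketch, assuming the CSV has a single header line, that no field contains an embedded comma, and that the column header matches the form quoted earlier in this thread (the exact header text in a real report may differ). The function name sum_event_self is made up for illustration.

```shell
# sum_event_self FILE COLUMN_NAME
# Sums one column of a comma-delimited report over all rows (all functions).
# Assumption: first line is the header; no quoted fields with commas inside.
sum_event_self() {
  awk -F, -v col="$2" '
    NR == 1 {
      # Locate the requested column by its exact header text.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    { sum += $idx }                       # accumulate Self counts
    END { if (idx) printf "%.0f\n", sum } # print only if the column existed
  ' "$1"
}

# Possible usage (file name taken from the report command in this thread):
#   sum_event_self hw_results.csv 'Hardware Event Count:CPU_CLK_UNHALTED:Self'
```

This gives the event count for the whole run, not inclusive data per function.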

Prasanna_B_
Beginner

Peter,

Thanks for the clarification.
