
Call stack mechanism implementation question

Richard_H_1
Beginner

I am running a Go program with DWARF information, and VTune does a good job figuring out line numbers and so forth, but it struggles with stack walks. I am guessing that this is because Go's stack conventions (how Go uses EBP, for example) differ from those VTune supports. Is there a document or some sort of clue sheet describing what VTune expects from the stack format? Also, can anyone think of a workaround that doesn't require Go to change its conventions?
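
For readers unfamiliar with the convention in question, here is a minimal sketch of a frame-pointer stack walk over a simulated stack. It is a toy model of the common x86-64 layout (saved frame pointer followed by return address), not a description of how VTune or the Go runtime actually unwinds. The function name walkFramePointers, the two-word frame layout, and the addresses are all hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// walkFramePointers follows a chain of saved frame pointers through a
// simulated stack (hypothetical model, not VTune's implementation).
// Each frame is two adjacent words: mem[fp] holds the caller's saved
// frame pointer and mem[fp+1] holds the return address into the caller.
// A saved frame pointer of 0 marks the outermost frame.
func walkFramePointers(mem []uint64, fp uint64, maxDepth int) []uint64 {
	var pcs []uint64
	for depth := 0; depth < maxDepth && fp != 0; depth++ {
		// Stop if the frame would fall outside the simulated stack.
		if fp+1 >= uint64(len(mem)) {
			break
		}
		pcs = append(pcs, mem[fp+1]) // return address of this frame
		next := mem[fp]               // caller's frame pointer
		// On a downward-growing stack, callers sit at higher addresses,
		// so a link that does not increase means the chain is broken.
		if next != 0 && next <= fp {
			break
		}
		fp = next
	}
	return pcs
}

func main() {
	// Three well-formed frames at indexes 2, 4 and 6.
	mem := []uint64{0, 0, 4, 0x401000, 6, 0x402000, 0, 0x403000}
	fmt.Printf("intact chain:  %#x\n", walkFramePointers(mem, 2, 16))

	// Simulate a function that uses the frame-pointer register as a
	// general-purpose register: the link at index 4 now holds garbage.
	mem[4] = 1
	fmt.Printf("broken chain:  %#x\n", walkFramePointers(mem, 2, 16))
}
```

The second call shows why a walker that trusts the frame-pointer chain returns truncated stacks as soon as one function on the path does not follow the convention: everything above the broken link is lost.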

Peter_W_Intel
Employee

Do you mean that VTune stack sampling failed on a Go program? VTune supports the DWARF format, as well as the calling conventions for C/C++ and Fortran.

Can you provide the Go program or another test case so I can verify this issue?

Richard_H_1
Beginner

Since December we have taught our Go compiler about base pointers, but we are still having problems with VTune and are not sure what to do next. I uploaded bench.tar, which holds a program we compiled using the latest Go compiler from golang.org, built with the environment variable GOEXPERIMENT=framepointer, which tells the compiler to generate base pointers. You can build such a system using the instructions at golang.org for building the latest development branch.

I uploaded a binary built with the system in the bench.tar file.

To run bench, simply extract it from bench.tar and execute ./bench -bench=garbage. When we run this in VTune, more than 90% of the stack frames are not recognized, but oddly enough some are, and those seem correct. Sometimes the call stacks are also correct, but not always. We have other internal performance tools, as well as gdb, working from our DWARF info, so somehow VTune wants more or different information than gdb does. Any clues would be appreciated.
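
One way to judge which reported stacks are correct is to have the program print its own call stack and compare it with what the profiler shows. The following is a minimal sketch, not part of the uploaded benchmark: callStack, leaf, and middle are hypothetical names, and runtime.CallersFrames requires Go 1.7 or later.

```go
package main

import (
	"fmt"
	"runtime"
)

// callStack returns the function names on the calling goroutine's stack,
// innermost first, as reported by the Go runtime itself.
func callStack() []string {
	pcs := make([]uintptr, 32)
	// Skip two frames: runtime.Callers and callStack.
	n := runtime.Callers(2, pcs)
	if n == 0 {
		return nil
	}
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

// leaf and middle are hypothetical functions that only build a known
// call chain; noinline keeps them as real frames on the stack.

//go:noinline
func leaf() []string { return callStack() }

//go:noinline
func middle() []string { return leaf() }

func main() {
	for _, name := range middle() {
		fmt.Println(name)
	}
}
```

The runtime resolves these names from its own tables rather than from the frame-pointer chain, so the output can serve as a reference when checking a stack reported by an external tool.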

Here is what to expect from the ./bench file in the tar file.

rlh@rlh0:~/work/code/src/golang.org/x/benchmarks/bench/xxx$ ./bench -bench=garbage -benchnum=1
consumption=1829KB npkg=14
2015/02/11 14:38:32 Benchmarking 1 iterations
2015/02/11 14:38:32 Benchmarking 100 iterations
2015/02/11 14:38:35 Benchmarking 500 iterations
2015/02/11 14:38:50 Result: {N:500 Duration:14.95397866s RunTime:29907957 Metrics:map[virtual-mem:324300800 rss:88551424 allocated:3130122 sys-total:98429240 sys-gc:5436275 gc-pause-total:1011241 sys-other:6354373 gc-pause-one:5378946 time:29907957 cputime:29936444 allocs:68124 sys-heap:86409216 sys-stack:229376] Files:map[cpuprof:/tmp/8.cpuprof memprof:/tmp/9.memprof memprof0:/tmp/7.memprof]}
GOPERF-METRIC:allocated=3130122
GOPERF-METRIC:allocs=68124
GOPERF-METRIC:cputime=29936444
GOPERF-METRIC:gc-pause-one=5378946
GOPERF-METRIC:gc-pause-total=1011241
GOPERF-METRIC:rss=88551424
GOPERF-METRIC:sys-gc=5436275
GOPERF-METRIC:sys-heap=86409216
GOPERF-METRIC:sys-other=6354373
GOPERF-METRIC:sys-stack=229376
GOPERF-METRIC:sys-total=98429240
GOPERF-METRIC:time=29907957
GOPERF-METRIC:virtual-mem=324300800
GOPERF-FILE:cpuprof=/tmp/10.prof.txt
GOPERF-FILE:memprof=/tmp/11.prof.txt
GC: #70 116534366ns @700793116853489 pause=21789741 maxpause=28947825 goroutines=1036 gomaxprocs=1
GC:     sweep term: 62842ns    max=88444    total=3790648    procs=1
GC:     scan:       318526ns    max=1840104    total=57351939    procs=1
GC:     install wb: 1573ns    max=3040    total=122680    procs=1
GC:     mark:       94426099ns    max=98579167    total=5672118299    procs=1
GC:     mark term:  21725326ns    max=28947825    total=1319757727    procs=1
rlh@rlh0:~/work/code/src/golang.org/x/benchmarks/bench/xxx$ 

 

 

Peter_W_Intel
Employee

Thank you for the test case. Does it depend on anything in the environment to run?

# ./bench -bench=garbage -benchnum=1
consumption=1829KB npkg=14
parse /usr/local/google/home/rlh/work/go/src/net/http open /usr/local/google/home/rlh/work/go/src/net/http: no such file or directory
panic: fail

goroutine 1 [running]:
golang.org/x/benchmarks/garbage.parsePackage(0x6c3160)
    /usr/local/google/home/rlh/work/code/src/golang.org/x/benchmarks/garbage/garbage.go:136 +0x3d7

......

I have no "google" subdirectory under /usr/local. What software should be installed, and where?

 

Richard_H_1
Beginner

Sorry, the garbage benchmark reads from a directory in the standard Go source tree. The splay benchmark should be self-contained.

rlh@rlh0:~/xbenchx$ ./bench -bench=splay
2015/02/12 10:00:46 Benchmarking 1 iterations
2015/02/12 10:00:47 Benchmarking 100 iterations
2015/02/12 10:00:48 Benchmarking 2000 iterations
2015/02/12 10:01:02 Result: {N:2000 Duration:13.763610345s RunTime:6881805 Metrics:map[virtual-mem:208637952 rss:94838784 cputime:6962112 allocated:541428 allocs:10404 sys-total:103431216 sys-gc:5745011 time:6881805 sys-other:6132413 sys-stack:196608 gc-pause-total:358926 gc-pause-one:3901374 sys-heap:91357184] Files:map[memprof0:/tmp/7.memprof cpuprof:/tmp/8.cpuprof memprof:/tmp/9.memprof]}
2015/02/12 10:01:02 Benchmarking 1 iterations
2015/02/12 10:01:02 Benchmarking 100 iterations
2015/02/12 10:01:03 Benchmarking 1000 iterations
2015/02/12 10:01:10 Result: {N:1000 Duration:6.669404447s RunTime:6669404 Metrics:map[virtual-mem:217206784 cputime:6933084 allocs:10409 sys-total:112000048 sys-heap:99418112 sys-stack:196608 time:6669404 rss:110415872 allocated:542357 gc-pause-total:356282 sys-gc:6244723 sys-other:6140605 gc-pause-one:3958692] Files:map[memprof0:/tmp/16.memprof cpuprof:/tmp/17.cpuprof memprof:/tmp/18.memprof]}
2015/02/12 10:01:10 Benchmarking 1 iterations
2015/02/12 10:01:11 Benchmarking 100 iterations
2015/02/12 10:01:12 Benchmarking 2000 iterations
2015/02/12 10:01:26 Result: {N:2000 Duration:13.481776002s RunTime:6740888 Metrics:map[allocs:10404 sys-total:120577072 sys-heap:107479040 gc-pause-total:349784 sys-gc:6752627 sys-other:6148797 gc-pause-one:3974827 time:6740888 virtual-mem:225783808 rss:119345152 cputime:6820730 allocated:541250 sys-stack:196608] Files:map[memprof0:/tmp/25.memprof cpuprof:/tmp/26.cpuprof memprof:/tmp/27.memprof]}
2015/02/12 10:01:26 Benchmarking 1 iterations
2015/02/12 10:01:26 Benchmarking 100 iterations
2015/02/12 10:01:27 Benchmarking 2000 iterations
2015/02/12 10:01:41 Result: {N:2000 Duration:13.651746167s RunTime:6825873 Metrics:map[sys-heap:107479040 sys-stack:196608 sys-gc:6752627 sys-other:6148797 time:6825873 rss:119603200 allocs:10404 sys-total:120577072 gc-pause-one:3977067 virtual-mem:225783808 cputime:6904047 allocated:543145 gc-pause-total:353959] Files:map[memprof:/tmp/36.memprof memprof0:/tmp/34.memprof cpuprof:/tmp/35.cpuprof]}
2015/02/12 10:01:41 Benchmarking 1 iterations
2015/02/12 10:01:41 Benchmarking 100 iterations
2015/02/12 10:01:43 Benchmarking 2000 iterations
2015/02/12 10:01:57 Result: {N:2000 Duration:13.743759372s RunTime:6871879 Metrics:map[allocs:10404 sys-heap:107479040 gc-pause-total:355820 gc-pause-one:3953562 time:6871879 rss:119603200 cputime:6953678 allocated:541561 sys-total:120577072 sys-stack:196608 sys-gc:6752627 sys-other:6148797 virtual-mem:225783808] Files:map[memprof0:/tmp/43.memprof cpuprof:/tmp/44.cpuprof memprof:/tmp/45.memprof]}
Count=760400 Total=6.3791752639e+10 Min=11674 Mean=83892.36275512994 RMS=2.7770491889907755e+06 Max=1.48924784e+08 StdDev=2.775783565470609e+06
GOPERF-METRIC:allocated=542357
GOPERF-METRIC:allocs=10409
GOPERF-METRIC:cputime=6933084
GOPERF-METRIC:gc-pause-one=3958692
GOPERF-METRIC:gc-pause-total=356282
GOPERF-METRIC:rss=119603200
GOPERF-METRIC:sys-gc=6752627
GOPERF-METRIC:sys-heap=107479040
GOPERF-METRIC:sys-other=6148797
GOPERF-METRIC:sys-stack=196608
GOPERF-METRIC:sys-total=120577072
GOPERF-METRIC:time=6669404
GOPERF-METRIC:virtual-mem=217206784
GOPERF-FILE:cpuprof=/tmp/46.prof.txt
GOPERF-FILE:memprof=/tmp/47.prof.txt
GC: #472 115296766ns @770579699027981 pause=17217886 maxpause=19372756 goroutines=12 gomaxprocs=1
GC:     sweep term: 93399ns    max=93399    total=23879494    procs=1
GC:     scan:       259286ns    max=512635    total=101556244    procs=1
GC:     install wb: 2033ns    max=5115    total=714122    procs=1
GC:     mark:       97819594ns    max=124489478    total=43965078834    procs=1
GC:     mark term:  17122454ns    max=19372756    total=7429728046    procs=1
rlh@rlh0:~/xbenchx$ 

 

 

Peter_W_Intel
Employee

Thank you for correcting the usage. I can now reproduce the issue:

# amplxe-cl -collect advanced-hotspots -knob collection-detail=stack-sampling -d 60 ./bench -bench=splay

There is no call stack info in the bottom-up report. I will report this to the developers and get back to you as soon as I can.

 

Richard_H_1
Beginner

Any progress here?

- Rick

Peter_W_Intel
Employee

Our developer said that adding support for the Go calling convention and call stack mechanism is a new feature request. I will post an update if and when it is ready, but it is hard to give a detailed time frame. Thanks for your understanding!
