Call Graph Failure - Bistro Crashed?

mikemartell
Beginner
1,083 Views
It appears that Bistro has crashed. I get this in the output window:

Thu Jul 08 09:56:45 2010 Static instrumentation done
Thu Jul 08 09:56:45 2010 Warning for module "c:\windows\system32\ntdll.dll" - Instrumented module name must be identical to the original module name.
Thu Jul 08 09:56:45 2010 Error - Program crashed.
Thu Jul 08 09:56:45 2010 Data collection halted...

The bistro.log file is attached. Please advise.
0 Kudos
12 Replies
Peter_W_Intel
Employee
1,083 Views
Hello,

Thanks for your log file!

Since there were many messages like "Instrumentation specified not applicable", I suspect that you may be working on a Windows* 7 system.

Currently, VTune Performance Analyzer v9.1 U8 doesn't support Call Graph on Windows* 7 and Windows* Server 2008. Please use sampling data collection instead, or use call graph on Windows* XP or Windows* Server 2003.

Thanks, Peter
0 Kudos
mikemartell
Beginner
1,083 Views
Yes, I'm using Windows 7. Thanks for letting me know about this. Is there a schedule for call graph support on Windows 7?
0 Kudos
Peter_W_Intel
Employee
1,083 Views
Currently you can use Intel® Parallel Amplifier instead. Hotspot results in this tool show the time spent in hot functions, as well as the corresponding call stack info. Please click Here

Regards, Peter
0 Kudos
aap
Beginner
1,083 Views
On Solaris I use Collector when I want to get a quick idea of what is taking time, and Quantify when I need call counts. For some issues Collector is enough while for others more specific information is needed. I had imagined that someday IPA and VTune will have a similar relationship on Windows, so when I first read that VTune call graph cannot work on Windows 7 I felt let down.

This is a big functional loss. IPA is interesting but it is young and does not yet match the usefulness of VTune call graph. To some extent it probably never will. Sometimes to fully understand the cause of a performance issue I really need to see the number of times each function was called, which as far as I understand just can't be done using sampling. What is the issue that prevents VTune instrumentation-based call graph from working on Windows 7? Is there any hope that it will be resolved some day?
0 Kudos
David_A_Intel1
Employee
1,083 Views
As you might imagine, we cannot speculate on future products. However, did you see the announcement at the top of the forum?
0 Kudos
aap
Beginner
1,083 Views
MrAnderson,

I did see the announcement. However it didn't do much to comfort me. Of course I am glad to see both products move forward but the announcement seems to focus on preserving event based sampling and does not mention call graph at all. Also the notion that I should download IPA to learn about the future direction of VTune is somewhat disturbing. In combination with the fact that Call Graph is not supported on Win7 it leads me to suspect that Intel does not intend to ever try to support instrumentation-based call graph profiling on Win7. If that's true it means I will have to keep some XP x64 boxes around for much longer than I otherwise would.

I can certainly understand not wanting to comment on enhancements in future products. Still, there is something to be said for reassuring current customers that you are not abandoning the functionality they rely on.

Regards,
aap
0 Kudos
David_A_Intel1
Employee
1,083 Views
Well, let me ask you this: can you explain why you need the functionality of call graph? What exactly, besides call counts, can you get from call graph that you can't get from stack sampling? And what exactly do you use call counts for?

We want to meet the needs of our customers. Sometimes we need our customers to educate us on their needs. ;)
0 Kudos
aap
Beginner
1,083 Views
The call counts are the main thing I get from call graph that I don't get from stack sampling.

We have had multiple on-site visits from Intel over the past dozen years or so, and each time I explain that we don't use any of VTune's sampling features. The sampling features seem to be designed for analyzing relatively small routines in which the processor spends a lot of time, perhaps looping, so that it can be worthwhile to tweak the code this way and that way to take better advantage of processor features.

The code I work with is a 3D mechanical CAD application. The main executable is over 200 MB. When it is slow, it tends to be not so much in one function as in the interaction of a few dozen functions. I am looking for opportunities for algorithmic improvements, mostly related to how and when the functions call each other -- not really so much about what happens within one particular function.

As I said, sometimes stack sampling without call counts is enough to see what is going on, and sometimes it is not. Much of the time when I am concerned with performance it is because our QA team has noticed that a certain operation is slower than it used to be. So I run the affected workflow under VTune on both builds and try to compare the results to help find what changed. In this context the call counts are often critical. Normally I am looking for what changed between the two builds in question and the call counts help me zero in on the changed code. They help me tell the difference between two cases: (a) the lower-level code got slower and (b) the higher-level code is calling the lower-level code more often than it was before. Once I know which kind of case I am looking at, I can focus on either the upper or lower level code and search for changes that got submitted to that area in the slower build.
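The distinction between cases (a) and (b) can be sketched in a few lines of C++. This is only an illustration of the reasoning, not anything VTune provides: `FunctionStat`, `classify`, and the 10% noise `tolerance` are hypothetical names and values chosen for the example, and the numbers would come from whatever call graph data you have for the two builds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-function record, e.g. one row of call graph output.
struct FunctionStat {
    std::uint64_t calls;   // number of times the function was called
    double total_seconds;  // total time spent in the function over all calls
};

enum class Regression { None, CalleeSlower, CalledMoreOften, Both };

// Compare the same function in an old and a new build.
// `tolerance` is an assumed noise threshold (10% by default).
Regression classify(const FunctionStat& before,
                    const FunctionStat& after,
                    double tolerance = 0.10) {
    assert(before.calls > 0 && after.calls > 0);

    const double per_call_before =
        before.total_seconds / static_cast<double>(before.calls);
    const double per_call_after =
        after.total_seconds / static_cast<double>(after.calls);

    // Case (a): each call costs more than it used to.
    const bool slower_per_call =
        per_call_after > per_call_before * (1.0 + tolerance);
    // Case (b): the callers invoke the function more often than before.
    const bool more_calls =
        static_cast<double>(after.calls) >
        static_cast<double>(before.calls) * (1.0 + tolerance);

    if (slower_per_call && more_calls) return Regression::Both;
    if (slower_per_call) return Regression::CalleeSlower;
    if (more_calls) return Regression::CalledMoreOften;
    return Regression::None;
}
```

Without the `calls` field, only `total_seconds` is available, and a function whose total time doubled looks the same in both cases.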

Regards,
aap
0 Kudos
dehvidc1
Beginner
1,083 Views

I'm also trying to use VTune on Windows 7. I get a similar error when trying to do call graphs:

Static instrumentation done
Fri Aug 06 18:34:38 2010 Warning for module "c:\windows\system32\ntdll.dll" - Maximal instrumentation is None.

Fri Aug 06 18:34:38 2010 Error - Program crashed.

Fri Aug 06 18:34:38 2010 Data collection halted...

I've tried using Parallel Amplifier in the MSVS2008 environment on my C++ application, following the Getting Started instructions. When I run the application, the output window says "Outside any known module" and "Unknown Frames". I can see code OK using the matrix sample application.

Any suggestions?

Thanks

David

0 Kudos
dehvidc1
Beginner
1,083 Views

I now have source showing with VTune sampling and monitoring after building with the Intel C++ compiler under MSVS2008. The change to Intel C++ was remarkably simple. (I had a few problems moving the code from Linux to MSVC++.)

Regards

David

0 Kudos
dehvidc1
Beginner
1,083 Views
A major benefit of call graph analysis is detecting when a function is being called too many times, unnecessarily, or from the wrong place. In code I've optimised (on large projects with a multinational developer cohort), it's not uncommon in my experience to find situations where the correct output is being generated but the developer has code that calls a method more times than necessary or is calling something unnecessarily.

On that note, an optimisation technique that can give excellent results is to step through the code with a source-level debugger. It gives you a terrific sense of what's going on and can help to identify redundancies, etc.

Regards

David
0 Kudos
noteme
Beginner
1,083 Views
...

I suspect that you may work on Windows* 7 system,

Now VTune Performance Analyzer v9.1 U8 doesn't support Call Graph on Windows* 7 and Windows* Server 2008. Please use sampling data collection instead, or use call graph on Windows* XP or Windows* Server 2003.

Sorry to resurrect this thread, but we have recently upgraded to Windows 7 as well, and it looks like we have the same problem. So I was just wondering if there is any news on this. Call Graph profiling has been our favorite way of profiling as well.

Main reasons:

- Much simpler to navigate up and down the call tree. It also helps to optimize the right function. Maybe the function showing the most time spent is not the biggest problem, and the function calling it the most times should cache the return value instead.

- When you see how many times a function is called, it's easier to find out what kind of optimization is needed. Caching, changing the algorithm, or maybe even forcing inlining would help. It's not obvious whether, for example, inlining would help at all when you only see the total time spent in a function.
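
As a rough C++ illustration of the caching point: the function names below are hypothetical, and the global counter only stands in for the call count an instrumenting profiler would report. Both callers produce the same result, and only the call count shows that the first one is doing redundant work.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the call count that call graph instrumentation would report.
std::uint64_t g_expensive_calls = 0;

// Hypothetical expensive function; the arithmetic is a placeholder for real work.
double expensive_lookup(int key) {
    ++g_expensive_calls;
    return key * 1.5;
}

// Caller without caching: one call per loop iteration.
double sum_uncached(int key, int iterations) {
    assert(iterations >= 0);
    double sum = 0.0;
    for (int i = 0; i < iterations; ++i) {
        sum += expensive_lookup(key);
    }
    return sum;
}

// Caller that caches the return value: a single call, same result.
double sum_cached(int key, int iterations) {
    assert(iterations >= 0);
    const double value = expensive_lookup(key);
    double sum = 0.0;
    for (int i = 0; i < iterations; ++i) {
        sum += value;
    }
    return sum;
}
```

With 1000 iterations, the uncached caller drives the counter to 1000 while the cached one leaves it at 1. If the function is cheap per call, a sampling profile may simply show time spread across the loop.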

Maybe I just don't know how to use sampling the right way, but I can't see how to find out these things when using sampling.

Thanks in advance,
Øyvind

0 Kudos