Thu Jul 08 09:56:45 2010 Static instrumentation done
Thu Jul 08 09:56:45 2010 Warning for module "c:\windows\system32\ntdll.dll" - Instrumented module name must be identical to the original module name.
Thu Jul 08 09:56:45 2010 Error - Program crashed.
Thu Jul 08 09:56:45 2010 Data collection halted...
The bistro.log file is attached. Please advise.
Thanks for your log file!
Since there were many messages like "Instrumentation specified not applicable", I suspect that you may be working on a Windows* 7 system.
VTune Performance Analyzer v9.1 U8 currently doesn't support Call Graph on Windows* 7 or Windows* Server 2008. Please use sampling data collection instead, or use call graph on Windows* XP or Windows* Server 2003.
This is a big functional loss. IPA is interesting but it is young and does not yet match the usefulness of VTune call graph. To some extent it probably never will. Sometimes to fully understand the cause of a performance issue I really need to see the number of times each function was called, which as far as I understand just can't be done using sampling. What is the issue that prevents VTune instrumentation-based call graph from working on Windows 7? Is there any hope that it will be resolved some day?
I did see the announcement. However it didn't do much to comfort me. Of course I am glad to see both products move forward but the announcement seems to focus on preserving event based sampling and does not mention call graph at all. Also the notion that I should download IPA to learn about the future direction of VTune is somewhat disturbing. In combination with the fact that Call Graph is not supported on Win7 it leads me to suspect that Intel does not intend to ever try to support instrumentation-based call graph profiling on Win7. If that's true it means I will have to keep some XP x64 boxes around for much longer than I otherwise would.
I can certainly understand not wanting to comment on enhancements in future products. Still, there is something to be said for reassuring current customers that you are not abandoning the functionality they rely on.
We want to meet the needs of our customers. Sometimes we need our customers to educate us on their needs. ;)
We have had multiple on-site visits from Intel over the past dozen years or so, and each time I explain that we don't use any of VTune's sampling features. The sampling features seem to be designed for analyzing relatively small routines in which the processor spends a lot of time, perhaps looping, so that it can be worthwhile to tweak the code this way and that way to take better advantage of processor features.
The code I work with is a 3D mechanical CAD application. The main executable is over 200 MB. When it is slow, it tends to be not so much in one function as in the interaction of a few dozen functions. I am looking for opportunities for algorithmic improvements, mostly related to how and when the functions call each other -- not really so much about what happens within one particular function.
As I said, sometimes stack sampling without call counts is enough to see what is going on, and sometimes it is not. Much of the time when I am concerned with performance it is because our QA team has noticed that a certain operation is slower than it used to be. So I run the affected workflow under VTune on both builds and try to compare the results to help find what changed. In this context the call counts are often critical. Normally I am looking for what changed between the two builds in question and the call counts help me zero in on the changed code. They help me tell the difference between two cases: (a) the lower-level code got slower and (b) the higher-level code is calling the lower-level code more often than it was before. Once I know which kind of case I am looking at, I can focus on either the upper or lower level code and search for changes that got submitted to that area in the slower build.
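To make the two cases concrete, here is a minimal sketch of that comparison. It assumes the per-function call count and self time have already been exported from each build's call graph results. The `FuncStats` type, the `classify_regression` function, and the 10% tolerance are all hypothetical names and values chosen for illustration; they are not part of VTune.

```python
from dataclasses import dataclass


@dataclass
class FuncStats:
    """Hypothetical per-function numbers taken from one profiling run."""
    calls: int            # how many times the function was called
    self_time_ms: float   # total self time summed over all calls


def per_call_ms(stats: FuncStats) -> float:
    """Average self time per call; 0.0 if the function was never called."""
    return stats.self_time_ms / stats.calls if stats.calls else 0.0


def classify_regression(old: FuncStats, new: FuncStats,
                        tolerance: float = 0.10) -> str:
    """Say whether a function got slower per call, is called more often, or both.

    The tolerance (an assumed 10% here) filters out run-to-run noise.
    """
    calls_up = new.calls > old.calls * (1 + tolerance)
    per_call_up = per_call_ms(new) > per_call_ms(old) * (1 + tolerance)
    if calls_up and per_call_up:
        return "both"
    if calls_up:
        return "called more often"   # case (b): look at the higher-level code
    if per_call_up:
        return "slower per call"     # case (a): look at the lower-level code
    return "unchanged"


# Total time quadrupled in both examples, but for different reasons.
old_build = FuncStats(calls=1000, self_time_ms=500.0)
more_calls = FuncStats(calls=4000, self_time_ms=2000.0)
slower_body = FuncStats(calls=1000, self_time_ms=2000.0)

print(classify_regression(old_build, more_calls))
print(classify_regression(old_build, slower_body))
```

Both example builds show the same total time for the function, which is all a time-only view would report. Only the call counts separate them.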
I'm also trying to use VTune on Windows 7. I get a similar error when trying to collect call graphs:
Static instrumentation done
Fri Aug 06 18:34:38 2010 Warning for module "c:\windows\system32\ntdll.dll" - Maximal instrumentation is None.
Fri Aug 06 18:34:38 2010 Error - Program crashed.
Fri Aug 06 18:34:38 2010 Data collection halted...
I've tried using Parallel Amplifier in the MSVS2008 environment on my C++ application. I followed the Getting Started instructions. When I run the application, the output window says "Outside any known module" and "Unknown Frames". I can see code OK using the matrix sample application.
I now have source showing with VTune sampling and monitoring after building with the Intel C++ compiler under MSVS2008. The change to Intel C++ was remarkably simple. (I had a few problems moving the code from Linux to MSVC++.)
On that note, an optimisation technique that can give excellent results is to step through the code with a source-level debugger. It gives you a terrific sense of what's going on and can help to identify redundancies, etc.
Sorry to resurrect this thread, but we have recently upgraded to Windows 7 as well, and it looks like we have the same problem. So I was just wondering if there is any news on this? Call Graph profiling has been our favorite way of profiling as well.
- It is much simpler to navigate up and down the call tree. It also helps to optimize the right function. Maybe the function showing the most time spent is not the biggest problem, and the one calling it the most times should cache the return value instead.
- When you see how many times a function is called, it's easier to find out what kind of optimization is needed. Caching, changing the algorithm, or maybe even forcing inlining could help. It's not obvious whether, for example, inlining would help at all when you only see the total time spent in a function.
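As a rough illustration of the two points above, here is a small sketch in which a call counter plays the role of instrumentation. All names (`counted`, `bounding_box`, `layout_uncached`, `layout_cached`) are hypothetical and only stand in for real application code; the counter is a toy, not how VTune instruments a binary.

```python
import functools
from collections import Counter

# Toy stand-in for instrumentation: counts every call by function name.
call_counts = Counter()


def counted(func):
    """Wrap a function so that each call is recorded in call_counts."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper


@counted
def bounding_box(part_id):
    """Hypothetical expensive low-level routine."""
    return (part_id, part_id + 1)


def layout_uncached(part_ids):
    """Caller that asks for the same result again and again."""
    return [bounding_box(p) for p in part_ids]


def layout_cached(part_ids):
    """Same caller, but it caches the return value per part."""
    cache = {}
    result = []
    for p in part_ids:
        if p not in cache:
            cache[p] = bounding_box(p)
        result.append(cache[p])
    return result


parts = [1, 2, 1, 2, 1, 2]

call_counts.clear()
layout_uncached(parts)
print("uncached calls:", call_counts["bounding_box"])

call_counts.clear()
layout_cached(parts)
print("cached calls:", call_counts["bounding_box"])
```

In this sketch the low-level function is unchanged in both runs. The call count is what points at the caller as the place to fix.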
Maybe I just don't know how to use sampling the right way, but I can't see how to find out these things when using sampling.
Thanks in advance,