The VTune and Advisor analyses run extremely fast compared to my non-parallelized raw C++ code in VS2013.
Do VTune and Advisor use their own compiler switch settings? Do the Intel analyses actually run the code, or do they simulate it?
The code is an electromagnetic simulation that only generates matrices and has nothing to do with graphics. I have an AMD 8-core CPU with two Radeon 6990 boards (4 GPUs total, because each board has 2 GPUs). I haven't parallelized anything in the code yet. I need to understand the fundamentals of what I'm seeing first.
I'm using VS2013 with Intel Composer XE SP1.
>>> Does the Intel analyses actually run the code or simulate it?>>>
Actually, this is a cooperation between kernel-mode modules (producers) and user-mode modules (consumers). Your code is probably scheduled to run from inside the VTune UI, while the kernel-mode modules gather CPU statistics by reading and writing MSRs (model-specific registers). One specific module, vtss.sys, is probably used to walk thread stacks (kernel and user mode) and resolve function calls.
Jimmy,
Could you please give more details? Are you using the "Basic Hotspots" analysis type?
Thanks & Regards, Dmitry
Thanks Dmitry,
I've used Basic Hotspots, Advanced Hotspots, and any of the Advisor tools. The code runs and cycles through expected screen outputs in any of these analyses in under a second. If I run the code by itself, it'll take several minutes.
There's something fundamental that I'm missing about its use. Iliy's post is insightful, though the program does appear to run as expected, just faster.
Jimmy
Yes, the program runs faster within VTune and it generates the expected outputs (data files and screen output).
I can run the program in VTune and get great performance. The standalone executable is about 15x slower than the same program run in VTune. Additionally, running it from the debugger (F5 from within VS2013) is about 100x slower.
I'd like to identify how I can match the performance of the VTune-run code with the standalone executable, and get the release version run from Visual Studio somewhere close.
Hi jimmyajones:
VTune Amplifier never "simulates" your code. However, depending on your analysis type, it may inject some code. Under what analysis type(s) are you seeing this behavior?
AFAIK some kind of instrumentation is injected into the profiled process's address space, but I do not know how it could contribute to increased code performance. Maybe it depends on the specific analysis type, as @MrAnderson hints.
>>>Additionally running it from the program debugger (F5 from within VS2013) is about 100x slower.>>>
In this case the VS debugger can perform additional checks on memory buffers and stack guard pages. Moreover, it can hit breakpoints and attempt to handle various exceptions.
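To tell a slow debugger-launched run apart from a standalone one, the program can report its own launch context at startup. The following is only a minimal sketch: the function names `debugger_attached` and `describe_launch` are hypothetical, the debugger check relies on the Win32 call `IsDebuggerPresent` and is stubbed out to return `false` on other platforms, and `_DEBUG` is the macro MSVC defines when a debug runtime library is selected.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>  // IsDebuggerPresent
#endif

// True if a user-mode debugger is attached to this process.
// Windows only; this sketch always reports false on other platforms.
bool debugger_attached() {
#ifdef _WIN32
    return IsDebuggerPresent() != 0;
#else
    return false;
#endif
}

// Builds a one-line summary suitable for printing at program start.
std::string describe_launch() {
    std::string s = debugger_attached() ? "debugger attached"
                                        : "no debugger detected";
#ifdef _DEBUG
    s += "; _DEBUG defined (debug runtime)";
#else
    s += "; _DEBUG not defined";
#endif
    return s;
}
```

Printing `describe_launch()` as the first line of output makes it easy to see, for each timing, whether the run came from F5, from the command prompt, or from a profiler launch.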
Thanks again Iliy. I did just that by putting a timer in the code and running it through different processes.
Human error and inexperience prevail in this case. The Intel VTune analyses and the Intel Survey Analysis all had virtually identical run times to the executable when run from the prompt. My failure was in two parts: #1, running an older release version from the prompt, which had a 15x longer run time (configuration control); and #2, not having my release configuration set up properly in VS to run at a speed representative of the executable.
Sorry to have wasted your time. I'm sure there are more satisfying problems to solve on the forum. You did give me an insight into the Intel tools, which was, in part, what I was seeking.
Thanks,
Jimmy
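For anyone repeating this comparison, the two checks described above (a timer in the code, and knowing which build is actually running) can be sketched roughly as follows. This is an illustration under assumptions, not the code from this thread: `build_stamp`, `time_seconds`, and `fill_matrix_checksum` are hypothetical names, and the matrix fill is only a stand-in for the real simulation. `NDEBUG` is used as a rough indicator of a Release-style build.

```cpp
#include <chrono>
#include <string>
#include <vector>

// Describes how this binary was built, so each run identifies itself
// and a stale executable is easy to spot.
std::string build_stamp() {
    std::string s = "built " __DATE__ " " __TIME__;
#ifdef NDEBUG
    s += ", NDEBUG defined (typical of Release)";
#else
    s += ", NDEBUG not defined (typical of Debug)";
#endif
    return s;
}

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename Work>
double time_seconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical stand-in workload: fills an n x n matrix with i + j
// and returns the sum of all entries.
double fill_matrix_checksum(int n) {
    std::vector<double> m(static_cast<std::size_t>(n) * n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            m[static_cast<std::size_t>(i) * n + j] = i + j;
    double sum = 0.0;
    for (double v : m) sum += v;
    return sum;
}
```

Printing `build_stamp()` together with the result of `time_seconds` around the main computation, then launching the same binary from the prompt, from Visual Studio, and from VTune, shows whether the tools are really changing the run time or whether different builds are being compared.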