Let me state up front that I'm quite familiar with RDTSC, QueryPerformanceCounter and all sorts of Win32 timing sources.
I'm desperately looking for a hi-res time source that fulfills the following:
1st) work reliably in general (QPC doesn't. I know it's touted as the greatest thing since sliced bread, but it is a fact that it does jump around like crazy on certain machines depending on load)
2nd) work reliably on multicore/multiproc machines (RDTSC doesn't, since the cycle count may differ between CPUs and the counters aren't always synced)
3rd) provide at least a resolution of 0.1 ms, preferably better. Equivalent clock cycles are fine too; it doesn't have to be exactly wall-clock time.
4th) work on 32-bit and 64-bit (x64, not IA64) Intel (>= P4) and AMD (AthlonXP/Opteron) processors. It doesn't have to be the exact same opcode/API call, but it has to be available somehow on all these platforms for W2k3 Server and WinXP.
5th) work more or less out of the box on a Win32/Win64 server without further software to install
6th) not require instrumentation or the like (we're talking about monitoring a live production system here)
I'm well aware that there's not much reason for Intel to provide problem solutions that work well on AMD hardware, but then again there's this stupid thing called reality.
Reality is also the one thing that kills using QPC or RDTSC. I can prove that both misbehave on one or more relevant multicore platforms.
Bonus points for not requiring a kernel transition, but I'd be really happy to have at least one way to measure cross-core timing for procedures that run in less than 1 ms.
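
For illustration only, here is a minimal C++ sketch of the kind of check that exposes the misbehaviour described above: take readings of a tick counter while hopping between cores, then count how often the counter appears to run backwards. It also shows the arithmetic behind the 0.1 ms requirement. The `Sample` struct and all function names are made up for this example. The Windows-only part assumes that QueryPerformanceCounter and SetThreadAffinityMask behave as documented and that the affinity change has taken effect by the time of the next reading. It is a sketch, not a complete test harness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One reading of some tick counter (hypothetical type, for illustration only).
struct Sample {
    std::int64_t ticks;  // raw counter value
    int cpu;             // core the reading was taken on
};

// Counts how often the counter appears to run backwards between
// consecutive readings. A sane monotonic source should yield 0.
std::size_t count_backward_jumps(const std::vector<Sample>& samples) {
    std::size_t jumps = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (samples[i].ticks < samples[i - 1].ticks) ++jumps;
    }
    return jumps;
}

// True if one tick is 0.1 ms or shorter, i.e. the counter runs at
// 10,000 ticks per second or faster.
bool meets_resolution(std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    return ticks_per_second >= 10000;
}

// Converts a tick delta to whole microseconds. Only meant for short
// intervals: the multiplication can overflow for very large deltas.
std::int64_t ticks_to_microseconds(std::int64_t delta,
                                   std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    return delta * 1000000 / ticks_per_second;
}

#ifdef _WIN32
#include <windows.h>

// Windows-only: takes n QPC readings, moving the calling thread to a
// different core before each one. 'cores' must not exceed the number
// of bits in DWORD_PTR. Assumes the thread is already running on the
// requested core when SetThreadAffinityMask returns.
std::vector<Sample> sample_qpc_across_cores(int n, int cores) {
    std::vector<Sample> out;
    for (int i = 0; i < n; ++i) {
        int cpu = i % cores;
        SetThreadAffinityMask(GetCurrentThread(),
                              static_cast<DWORD_PTR>(1) << cpu);
        LARGE_INTEGER now;
        QueryPerformanceCounter(&now);
        out.push_back({now.QuadPart, cpu});
    }
    return out;
}
#endif
```

On a box where the source is well behaved, `count_backward_jumps(sample_qpc_across_cores(100000, 2))` should come back as 0. Anything else means the source cannot be trusted for cross-core deltas on that machine.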