One important note about wait time in the VTune analyzer. Check out this online help definition of Self-wait Time:
The call graph calculation of the Self_Wait_Time is based on a heuristic estimation that tracks context switches caused by synchronization events and by other causes.
Note: Windows* 2000 multi-processors support the exact Wait time information collection.
You probably can't trust it to give you exact timing information. However, the timing information will be relatively correct. That is, the timing numbers for all functions will be representative of their impact, so the call graph will still identify which functions are costing the most time. This can point you to where you should focus your optimization efforts.
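To illustrate what "relatively correct" buys you, here is a minimal Python sketch. The function names and wait times are made up for illustration and are not VTune output. It also assumes the estimation error scales every function by roughly the same factor, which the real heuristic may not do. Under that assumption, the ranking and each function's share of the total stay the same even though the absolute numbers are off.

```python
def rank_by_share(wait_times):
    """Return (function, share) pairs sorted by descending share of total wait time."""
    total = sum(wait_times.values())
    if total == 0:
        # Nothing waited at all: every share is zero.
        return [(name, 0.0) for name in sorted(wait_times)]
    shares = {name: t / total for name, t in wait_times.items()}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical self-wait times in microseconds (not real profiler data).
estimated = {"read_input": 1200, "lock_queue": 5400, "write_log": 400}

# The same profile with every estimate off by a constant factor (an assumption).
scaled = {name: t * 1.5 for name, t in estimated.items()}

ranking = rank_by_share(estimated)
scaled_ranking = rank_by_share(scaled)

for name, share in ranking:
    print(f"{name}: {share:.1%}")
```

Both rankings put the same function first, so either set of numbers points you at the same place to start optimizing.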
Just my opinion, :-}