Do you understand why your program is so unstable? Whatever the level of noise, you can take multiple measurements and compute both an average and a measure of variability, such as the standard deviation. If the variability is inherent in the algorithm, you'll need to take multiple measurements before and after each change, then check whether the average shifts and whether that shift is significant given the variability.
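As a rough illustration of that before/after comparison, here is a minimal Python sketch. It assumes you have collected a handful of run times for each version; the numbers below are made up, and the choice of Welch's t-test is mine, not something your tools prescribe. A common rule of thumb is that |t| well above 2 suggests a real shift for moderate sample sizes, but for a proper p-value you would look up t against the returned degrees of freedom.

```python
import math
import statistics


def welch_t(before, after):
    """Return (t, dof) for Welch's unequal-variance t-test on two samples.

    Each sample needs at least two values, and at least one sample
    must have non-zero variance.
    """
    m1, m2 = statistics.mean(before), statistics.mean(after)
    v1, v2 = statistics.variance(before), statistics.variance(after)  # sample variances
    n1, n2 = len(before), len(after)
    se2 = v1 / n1 + v2 / n2  # squared standard error of the difference in means
    t = (m1 - m2) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, dof


# Hypothetical run times in seconds (illustrative values only)
before = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4]
after = [9.1, 9.4, 8.9, 9.3, 9.0, 9.2]

t, dof = welch_t(before, after)
print(f"mean before = {statistics.mean(before):.2f} s, "
      f"stdev = {statistics.stdev(before):.2f} s")
print(f"mean after  = {statistics.mean(after):.2f} s, "
      f"stdev = {statistics.stdev(after):.2f} s")
print(f"t = {t:.2f}, dof = {dof:.1f}")
```

If the two sets of runs overlap heavily, t will be small and you simply need more runs (or a bigger change) before drawing a conclusion.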
Does this variability actually shift the location of the hot spots that the VTune analyzer identifies from run to run? And are there any significant hot spots at all?