Since Intel PTU is distributed through the WhatIf forum, this question is probably more on-topic there.
I use CodeAnalyst and am quite happy with it (especially the price: $0).
You can only run timer-based sampling; you cannot use event-based sampling. Timer-based sampling is usually what you choose for your first assault on optimization. Event-based sampling (e.g. cache evictions) is generally performed later, when you are trying to eke the last few percentage points out of the code or trying to locate an adverse cache interaction.
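
If it helps to picture what timer-based sampling does, here is a rough sketch in Python. It is only an illustration of the idea, not how CodeAnalyst or PTU work internally: a background thread wakes at a fixed interval and records which function the main thread is currently executing. The names `sample_function_names`, `hot_loop` and `workload` are made up for this example, and `sys._current_frames()` is a CPython-specific call.

```python
import collections
import sys
import threading
import time


def sample_function_names(target, interval=0.001):
    """Run target() while a background thread periodically samples it.

    Returns a Counter mapping function name -> number of samples in which
    that function was the innermost Python frame of the calling thread.
    """
    counts = collections.Counter()
    caller_id = threading.get_ident()  # thread that will run target()
    stop = threading.Event()

    def sampler():
        # wait() returns False on timeout, so this loop ticks once per
        # interval until stop is set.
        while not stop.wait(interval):
            frame = sys._current_frames().get(caller_id)  # CPython-specific
            if frame is not None:
                counts[frame.f_code.co_name] += 1

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        target()
    finally:
        stop.set()
        thread.join()
    return counts


def hot_loop():
    # Hypothetical hotspot: spin for roughly 0.3 seconds.
    total = 0
    deadline = time.perf_counter() + 0.3
    while time.perf_counter() < deadline:
        total += 1
    return total


def workload():
    hot_loop()


if __name__ == "__main__":
    counts = sample_function_names(workload)
    # Functions with the most samples are where the time went.
    print(counts.most_common(3))
```

The functions with the most samples are your hotspots. Event-based sampling works on the same principle, except that a sample is triggered by a hardware counter overflowing (say, every N cache misses) instead of by a clock tick, which is why it needs support from the processor's performance-monitoring hardware.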