NOTE: many of these events are known to overcount (l1d_cache_ld, l1d_cache_lock), so they can only be used for qualitative analysis.
Thanks for explaining your requirements!
I agree that many of the events overlap, but the user has to select the appropriate ones.
In my view:
L1D.REPL relates to L1D cache line flushing driven by page faults, after which the TLB translates the address and the data is reloaded into L1D.
L1D_CACHE_LD.I_STATE counts all L1D misses, which is what you want.
An L1D cache miss does not imply a page fault.
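
If it helps, here is a rough sketch of how I would turn the miss count into a ratio. The function name, the `total_loads` argument (whatever event you pick as the count of all L1D loads) and the numbers are my own assumptions for illustration, not measured data. Since the events overcount, treat the result as a qualitative indicator only.

```python
def l1d_load_miss_ratio(i_state_loads: int, total_loads: int) -> float:
    """Rough fraction of L1D loads that missed (qualitative only).

    i_state_loads: count read from L1D_CACHE_LD.I_STATE
    total_loads:   count of all L1D loads (assumed to come from a
                   separate event chosen by the user)
    """
    if total_loads <= 0:
        raise ValueError("total_loads must be positive")
    # Overcounting can push the raw ratio above 1.0, so clamp it.
    return min(i_state_loads / total_loads, 1.0)


# Hypothetical counter readings, not real measurements.
ratio = l1d_load_miss_ratio(i_state_loads=1_200, total_loads=48_000)
print(f"approximate L1D load miss ratio: {ratio:.3f}")
```

Comparing this ratio between two runs of the same workload is more meaningful than reading a single absolute value.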