How to ensure WBINVD completes?


Hi, I have a question about the "wbinvd" instruction.

According to the Intel manual: "After executing this instruction, the processor does not wait for the external caches to complete their write-back and flushing operations before proceeding with instruction execution. It is the responsibility of hardware to respond to the cache write-back and flush signals. The amount of time or cycles for WBINVD to complete will vary due to size and other factors of different cache hierarchies."

Is there any way to force the program to wait until the external cache flush triggered by the WBINVD instruction has completed? That would let us measure the full overhead of WBINVD.

Any suggestions would be helpful. Thanks~

