I assumed the PMI would be delivered right after the STI instruction,
but it does not always interrupt at the same address.
For example, I set up the corresponding MSRs to monitor far branches at the Ring 0 privilege level.
The interrupt should arrive at the following instruction (0x0000000014006EA98) after Ring 3 issues a syscall,
but in fact it arrives after STI at no fixed instruction, for example at 0x0000000014006EAAD. What is the problem?
I assumed that once STI enables interrupts, the CPU would be interrupted by the PMI immediately. Isn't that so?
Is STI delayed, or is something else going on?
Please tell me if you know ;)
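For readers following along, here is a minimal sketch of how the IA32_PERFEVTSELx value for such a setup could be composed. The bit positions are the architectural ones from the SDM; the event/umask pair 0xC4/0x40 for BR_INST_RETIRED.FAR_BRANCH is an assumption that should be checked against the event list for the specific processor, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Architectural bit positions in IA32_PERFEVTSELx. */
#define EVTSEL_USR (1u << 16) /* count at CPL > 0 */
#define EVTSEL_OS  (1u << 17) /* count at CPL 0 */
#define EVTSEL_INT (1u << 20) /* raise a PMI on counter overflow */
#define EVTSEL_EN  (1u << 22) /* enable the counter */

/* ASSUMPTION: encoding of BR_INST_RETIRED.FAR_BRANCH; verify for your CPU. */
#define FAR_BRANCH_EVENT 0xC4u
#define FAR_BRANCH_UMASK 0x40u

/* Pack event select (bits 7:0), unit mask (bits 15:8) and flag bits. */
static uint64_t make_evtsel(uint8_t event, uint8_t umask, uint32_t flags)
{
    return (uint64_t)event | ((uint64_t)umask << 8) | flags;
}

/* Far branches, Ring 0 only, with PMI on overflow. */
static uint64_t far_branch_ring0_evtsel(void)
{
    return make_evtsel(FAR_BRANCH_EVENT, FAR_BRANCH_UMASK,
                       EVTSEL_OS | EVTSEL_INT | EVTSEL_EN);
}
```

The resulting value would be written to IA32_PERFEVTSEL0 with WRMSR from Ring 0; that step is left out here because it depends on the driver environment.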
In an out-of-order processor, there is almost always a delay between the event that causes an interrupt and the handling of the interrupt. This causes the interrupt handler to "see" a program counter that is after the program counter of the instruction that caused the interrupt.
The phenomenon is usually called "skid", and you can find several discussions of the topic in Chapter 18 of Volume 3 of the Intel Architectures Software Developer's Manual (document 325384). Some of the performance counter events have been enhanced to provide additional data and to reduce "skid". These events, the processors that introduced them, and limitations on their use are all discussed in Chapter 18.
Thank you for answering this question.
One more question: so the "skid" cannot be improved by software, is that right?
I also found another phenomenon: if I execute an INT 3 between each syscall, the skid is greatly reduced, and the interrupt arrives almost immediately after the STI instruction. What is the reason for this?
For example :
for( i = 0 ; i < 1000000 ; i++)
Do you mean the root cause of the "skid" is that the STI instruction is delayed?
As I understand it, STI enables interrupts "immediately".
(1) Is there any way to solve the skid problem for FAR BRANCH?
(2) Why does INT 3 make the interrupt after the next syscall so accurate?
(3) Does STI not enable interrupts immediately?
I really appreciate your answers, John.
Read the instruction description in Volume 2 of the SW Developer's manual.
The description says that interrupts will be enabled after the instruction following the STI instruction. That is exactly what you are seeing.
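As a rough illustration of that one-instruction delay, the sketch below computes the earliest RIP a handler could see for a pending maskable interrupt. It assumes interrupts were disabled before the STI and ignores skid entirely; the function name and the instruction listing passed to it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * addrs[] holds the addresses of consecutive instructions and sti_index is
 * the position of STI in that list. Interrupts are recognized only after the
 * instruction following STI has completed, so the earliest saved RIP is the
 * address of the second instruction after STI.
 * Returns 0 if the listing is too short to tell.
 */
static uint64_t earliest_rip_after_sti(const uint64_t *addrs, size_t count,
                                       size_t sti_index)
{
    if (sti_index >= count || count - sti_index <= 2)
        return 0;
    return addrs[sti_index + 2];
}
```

Any RIP later than this value is then attributable to skid rather than to the STI delay itself.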
Yes, I know STI is delayed by one instruction. But what I noticed is that the interrupt may land several instructions later, which must be the "skid" you mentioned. Is there no solution for skid?
There is no general solution for skid in out-of-order processors.
According to the discussions in Chapter 18 of Volume 3 of the Intel Architectures Software Developers Manual, recent Intel processors support an enhancement to Processor Event-Based Sampling (PEBS) called Precise Distribution of Instructions Retired (PDIR). This applies only to the "INST_RETIRED.ALL" performance counter event. It has several additional limitations as well, as discussed in Chapter 18.
Thanks a lot, John. Your answer is really helpful :)
I will have to cover as wide a range as possible of the RIPs at which the interrupt may land ;((
Maybe that is the only thing I can do to get around the "skid".
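One way to express that workaround is a simple window check in the PMI handler. This is only a sketch: the function name and the window size are hypothetical and would have to be tuned by observation, and it does not cover the case where execution has already branched away from the expected code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept an interrupted RIP if it lies at or after the expected address
 * and no more than max_skid_bytes beyond it.
 */
static bool rip_in_skid_window(uint64_t rip, uint64_t expected,
                               uint64_t max_skid_bytes)
{
    return rip >= expected && rip - expected <= max_skid_bytes;
}
```

With the two addresses from the first post, the observed RIP is 0x15 bytes past the expected one, so a 32-byte window would accept it and a 16-byte window would not.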
But John, there is another phenomenon that I cannot explain.
I have found that if I place a software breakpoint after every syscall, the "skid" is greatly reduced.
Do you have any idea why?
If something reduces skid, it probably does so by decreasing the ability of the processor to execute instructions out of order.
The single-byte form of the "INT 3" instruction (opcode 0xCC) is a special (simplified) case of the more general INT instruction, but even the simple case has fairly complex behavior -- see the discussion of the INT instruction in Volume 2 of the Intel Architectures Software Developers Manual. This complex behavior probably means that the instruction is microcoded and takes a number of cycles to complete. This seems likely to make it hard for the processor to do enough out-of-order processing to move the program counter very far, so the skid will be reduced.
These are just guesses -- I don't know a lot about how interrupts are implemented on Intel processors.
Thank you for your answer, John.
But why would out-of-order execution cause a PMI delay?
The retirement unit is supposed to guarantee consistency with the original instruction order.
I assumed that after the syscall PMC0 would increment by 1 and overflow (assuming it was set to -1),
and that the PMI would be issued once the first instruction after STI retired.
But the facts show this guess is wrong, and I'm not sure why.
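For reference, "set to -1" here means presetting the counter so that the next counted event wraps it. A minimal sketch of computing that preset follows; the function name is hypothetical, and the 48-bit width used in the usage note is an assumption (the actual width is reported by CPUID leaf 0AH).

```c
#include <stdint.h>

/*
 * Value to load into a counter of the given bit width so that it
 * overflows after `events` more counted events (two's-complement
 * negation, truncated to the counter width).
 */
static uint64_t pmc_preset(uint64_t events, unsigned width)
{
    uint64_t mask = (width >= 64) ? UINT64_MAX : ((1ULL << width) - 1);
    return (0 - events) & mask;
}
```

For a 48-bit counter, `pmc_preset(1, 48)` gives 0xFFFFFFFFFFFF, which is the "-1" case described above.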
It is less an issue of out-of-order processing than it is of propagation delays across the chip. The performance monitoring interrupt must come from the performance monitoring unit, which can't be "close to" all of the other functional units in the core.
A lot of work goes into making sure that "exceptions" are handled precisely. An exception is raised by the functional unit that is executing the instruction, while the instruction is still in the pipeline, so there is no ambiguity about which instruction to point to.
An "interrupt" is not raised by the unit executing the instruction. Interrupts are typically completely asynchronous, or in this case the interrupt is generated by a different functional unit than the unit that executed the instruction that generated the interrupt. The PMU only knows that it is generating an interrupt on the overflow of a counter -- it does not have any knowledge of which functional unit executed the instruction that caused the overflow to happen.
I'm continuing to research why INT 3 almost completely eliminates the skid, and I recently found this in the Intel SDM. Do you think it is related to this case?
My assumption (just a guess) is that if the instruction stream contains an INT instruction, execution is forced to be in order, but I can't explain why that would reduce the skid even if the instructions execute in order. Any ideas?
Pseudo Instruction Stream:
CALL System Func