I've started to implement an 8086/8088 with the goal of being cycle-exact. I can understand the reasoning behind the number of clock cycles for most instructions, but I must say I'm quite puzzled by the Effective Address (EA) calculation time.
More specifically, why does computing BP + DI or BX + SI take 7 cycles, but computing BP + SI or BX + DI take 8 cycles?
I could just wait for a given number of cycles, but I'm really interested in knowing why there's this 1-cycle difference, and more generally why any EA calculation takes so many cycles: EA uses the ALU for computing addresses, and an ADD between registers is just 3 cycles.
The designers of the chip are probably retired by now, but hopefully there is somebody at Intel who has the knowledge, or can point me to the people who have it :-)
I have. The instruction has the same number of bytes whether it uses, for example, BX (5 cycles), BX + SI (7 cycles), or BX + DI (8 cycles); all of these are encoded using the "mod" and "r/m" fields.
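For anyone else writing an emulator, here is a rough sketch of how the EA clock counts can be looked up from the "mod" and "r/m" fields. The numbers are the ones from the EA timing table in Intel's manual as I read it (please check them against your own copy); the function and table names are my own and purely hypothetical. The manual's table does not distinguish 8-bit from 16-bit displacements, so this sketch doesn't either.

```python
# EA clock counts for the 8086/8088, keyed on the ModR/M "mod" and "r/m" fields.
# Names (BASE_INDEX_CYCLES, ea_cycles) are hypothetical, for illustration only.

# r/m 000..011 select base + index: BX+SI, BX+DI, BP+SI, BP+DI
BASE_INDEX_CYCLES = {0b000: 7, 0b001: 8, 0b010: 8, 0b011: 7}


def ea_cycles(mod, rm, segment_override=False):
    """Return the EA calculation time in clocks for a memory operand."""
    if mod == 0b11:
        raise ValueError("mod=11 selects a register operand; no EA is computed")
    if mod == 0b00 and rm == 0b110:
        cycles = 6                      # direct address (displacement only)
    elif rm in BASE_INDEX_CYCLES:
        cycles = BASE_INDEX_CYCLES[rm]  # base + index: 7 or 8
        if mod != 0b00:
            cycles += 4                 # base + index + displacement: 11 or 12
    else:
        # single base or index register (SI, DI, BP, BX), with or without disp
        cycles = 5 if mod == 0b00 else 9
    if segment_override:
        cycles += 2                     # segment override prefix penalty
    return cycles
```

For example, `ea_cycles(0b00, 0b000)` is the BX + SI case and `ea_cycles(0b00, 0b001)` is the BX + DI case; both have identical instruction lengths, yet the table gives them different clock counts.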
Q1 + Q2: these are not measurements; the numbers come from Intel's own reference manual :-)
I have found the answer since then: the difference has to do with how effective addressing was implemented in the microcode. Effectively, some modes (such as BP + DI) were more optimized than others (like BX + DI).
