Since most x87 instructions use the top of the stack as an implied operand, will that affect out-of-order execution?
Will switching to scalar SSE code give a dramatic performance gain (like doing a MUL and ADD together, etc.)?
Thanks,
1 Solution
It affects out-of-order execution in the same way as any other operand dependency. Also note that the FXCH instruction is virtually free (latency 0), since it is handled by register renaming.
However, x87 by default computes results at 80-bit extended precision, which is very slow for operations like division and square root. You can explicitly lower that to double (64-bit) or single (32-bit) precision through the precision-control field of the FPU control word.
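As a rough illustration of lowering the precision through the control word, here is a minimal sketch. It assumes GCC or Clang inline assembly on an x86 target, and the helper names (`x87_get_cw`, `x87_set_cw`, `x87_with_precision`) are hypothetical, made up for this example. The precision-control field sits in bits 8-9 of the control word.

```c
#include <assert.h>
#include <stdint.h>

/* Precision-control (PC) field: bits 8-9 of the x87 control word. */
#define X87_PC_MASK     0x0300u
#define X87_PC_SINGLE   0x0000u  /* 24-bit significand */
#define X87_PC_DOUBLE   0x0200u  /* 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

/* Read the current x87 control word (hypothetical helper). */
static uint16_t x87_get_cw(void) {
    uint16_t cw;
    __asm__ volatile ("fnstcw %0" : "=m"(cw));
    return cw;
}

/* Load a new x87 control word (hypothetical helper). */
static void x87_set_cw(uint16_t cw) {
    __asm__ volatile ("fldcw %0" : : "m"(cw));
}

/* Return cw with only its precision-control field replaced by pc. */
static uint16_t x87_with_precision(uint16_t cw, uint16_t pc) {
    assert((pc & ~X87_PC_MASK) == 0);  /* pc must be one of the X87_PC_* values */
    return (uint16_t)((cw & ~X87_PC_MASK) | pc);
}
```

Typical use would be to save the old word with `x87_get_cw()`, call `x87_set_cw(x87_with_precision(saved, X87_PC_DOUBLE))` around the hot loop, and restore `saved` afterwards. Note that this field only narrows the significand, not the exponent range, and it has no effect on SSE instructions.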
Also, some compilers are not that good at generating efficient x87 code because of the complications of managing the register stack. They can generate notably faster scalar SSE code. As far as I'm aware, though, the difference with most reputable compilers is very minor.
Furthermore, on x64 SSE has access to 16 registers, which reduces spilling for some algorithms.
But if you really want to improve performance, you should probably look into vectorising (data-parallelising) your code to make full use of packed SSE (and later AVX) operations.
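To make the scalar versus packed distinction concrete, here is a small sketch of a multiply followed by an add, written once as a plain loop and once with SSE intrinsics from `<xmmintrin.h>`. The function names `madd_scalar` and `madd_sse` are hypothetical, and the code assumes an x86 compiler with SSE enabled. SSE itself has no fused multiply-add, so the packed version still issues a separate MULPS and ADDPS, just on four floats at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Scalar reference: out[i] = a[i] * b[i] + c[i]. */
static void madd_scalar(const float *a, const float *b, const float *c,
                        float *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}

/* Packed SSE version: four floats per iteration, scalar loop for the tail. */
static void madd_sse(const float *a, const float *b, const float *c,
                     float *out, size_t n) {
    assert(a && b && c && out);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads, so no */
        __m128 vb = _mm_loadu_ps(b + i);   /* alignment requirement  */
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
    for (; i < n; ++i)                     /* remaining 0-3 elements */
        out[i] = a[i] * b[i] + c[i];
}
```

Whether the gain is dramatic depends on the workload: loops like this are often limited by memory bandwidth rather than arithmetic, and a good compiler may already auto-vectorise the scalar version.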
2 Replies
Thanks for the detailed answer! It was useful.