Using lea instructions:
sal edx,3
lea edx, [ecx + ecx]
add edx, edx
add edx, edx
XOR EDX,EDX
SHRD EDX,ECX,29
has the same issues as the first fragment. Consider instead
LEA EDX,[ECX*8] (seven bytes)
or, if you have a spare register that already holds zero, say EBX, a denser encoding would be
LEA EDX,[EBX+ECX*8] (three bytes)
Would these not be faster, with the further advantage of changing no flags? /Roy Sykes
Message Edited by intel.software.network.support on 12-09-2005 10:48 AM
The reason the combination of lea and add is faster is that it avoids the shift instruction.
Let's look at the statement:
LEA EDX,[ECX*8] (seven bytes)
This operation involves a multiplication (performed as a shift), which is known to be slow.
The following statement is even worse:
LEA EDX,[EBX+ECX*8] (three bytes)
This operation also involves a multiplication, plus another instruction to set the register EBX to zero.
Hope this helps!
==
Regards,
Lexi S.
Intel Software Network Support
Message Edited by intel.software.network.support on 12-02-2005 08:50 PM