
Using LEA instructions

roysykes
Beginner
958 Views
In his useful paper "Assembly Language Tips & Tricks for the Intel Pentium 4 Processor", Khang Nguyen suggested the following:
  • Using lea Instructions:
mov edx,ecx
sal edx,3
Faster:

lea edx, [ecx + ecx]
add edx, edx
add edx, edx

While I suppose

XOR EDX,EDX

SHRD EDX,ECX,29

has the same issues as the first fragment, consider

LEA EDX,[ECX*8] (seven bytes)

or, if you have another spare zero register, say EBX, a denser encoding would be

LEA EDX,[EBX+ECX*8] (three bytes)

Would these not be faster, with the further advantage of changing no flags? /Roy Sykes
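For readers following along, the fragments in this post can be checked for equivalence with a small C model. This is only a sketch: the function names are hypothetical, each function mirrors one instruction sequence using 32-bit unsigned arithmetic, and it says nothing about timing or flags, only that every variant leaves ECX*8 (modulo 2^32) in EDX.

```c
#include <stdint.h>

/* mov edx,ecx ; sal edx,3 */
uint32_t times8_shift(uint32_t ecx)
{
    uint32_t edx = ecx;
    return edx << 3;
}

/* lea edx,[ecx+ecx] ; add edx,edx ; add edx,edx */
uint32_t times8_lea_add(uint32_t ecx)
{
    uint32_t edx = ecx + ecx;   /* lea: 2*ecx */
    edx += edx;                 /* add: 4*ecx */
    edx += edx;                 /* add: 8*ecx */
    return edx;
}

/* xor edx,edx ; shrd edx,ecx,29 */
uint32_t times8_shrd(uint32_t ecx)
{
    uint32_t edx = 0;
    /* SHRD shifts edx right by 29 and fills the vacated high bits
       with the low 29 bits of ecx. */
    edx = (edx >> 29) | (ecx << (32 - 29));
    return edx;
}

/* lea edx,[ecx*8], or lea edx,[ebx+ecx*8] with ebx == 0 */
uint32_t times8_lea_scaled(uint32_t ebx, uint32_t ecx)
{
    return ebx + ecx * 8u;
}

/* Returns 1 when all four models agree for the given input. */
int all_agree(uint32_t ecx)
{
    uint32_t expected = times8_shift(ecx);
    return times8_lea_add(ecx) == expected
        && times8_shrd(ecx) == expected
        && times8_lea_scaled(0, ecx) == expected;
}
```

Calling all_agree() over a range of inputs, including values with the top three bits set, is a quick way to confirm that the variants differ only in encoding size, flag effects, and speed, not in the result.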

Intel_Software_Netw1
We will forward your question to the author and let you know what response we receive.
For those following along, here is a link to the article.
Regards,

Lexi S.

Intel Software Network Support

http://www.intel.com/software


Message Edited by intel.software.network.support on 12-09-2005 10:48 AM

Intel_Software_Netw1
The author responded as follows:

The combination of lea and add is faster because it avoids the shift instruction.

Let's look at the statement:

LEA EDX,[ECX*8] (seven bytes)

This operation involves multiplication (shifting), which is known to be slow.

The following statement is even worse:

LEA EDX,[EBX+ECX*8] (three bytes)

This operation also involves multiplication, and it requires another operation to set the register EBX to zero.

Hope this helps!

==

Regards,

Lexi S.


Message Edited by intel.software.network.support on 12-02-2005 08:50 PM

jim_dempsey
Beginner
This is for Lexi.
Although the "LEA EDX,[ECX*8]" instruction appears, textually, to involve multiplication (*), it does not. Nor does it involve a shift operation. If either were true, your processor wizards would need to go back to school. Simple shift-like operations for *1, *2, *4, *8 are so common that they are hardwired into the architecture. All permutations are always present; a multiplexer selects the desired result.
Jim Dempsey
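
As an illustration of the point about *1, *2, *4, *8, here is a rough C sketch of the programmer-visible side of scaled addressing. In the 32-bit instruction encoding the scale is a 2-bit field in the SIB byte, so only those four factors can be expressed. The helper names below are hypothetical, and the model describes the encoding and the arithmetic result only; it makes no claim about how any particular processor implements address generation internally. It also reproduces the byte counts quoted earlier in the thread (seven and three).

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in ModRM/SIB fields (32-bit mode). */
enum { REG_ECX = 1, REG_EDX = 2, REG_EBX = 3 };
/* rm = 4 means "a SIB byte follows"; with mod = 0, SIB base = 5 means
   "no base register, a 32-bit displacement follows". */
enum { RM_USE_SIB = 4, SIB_NO_BASE = 5 };

/* Arithmetic model: the 2-bit scale field selects x1, x2, x4 or x8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale_field, uint32_t disp)
{
    return base + (index << (scale_field & 3u)) + disp;
}

uint8_t modrm_byte(unsigned mod, unsigned reg, unsigned rm)
{
    return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

uint8_t sib_byte(unsigned scale_field, unsigned index, unsigned base)
{
    return (uint8_t)(((scale_field & 3u) << 6) | ((index & 7u) << 3)
                     | (base & 7u));
}

/* lea edx,[ecx*8]: with no base register a 32-bit displacement is
   mandatory, which is where the seven bytes come from. */
size_t encode_lea_edx_ecx8(uint8_t out[7])
{
    out[0] = 0x8D;                                  /* LEA opcode */
    out[1] = modrm_byte(0, REG_EDX, RM_USE_SIB);
    out[2] = sib_byte(3, REG_ECX, SIB_NO_BASE);     /* scale field 3 = x8 */
    out[3] = out[4] = out[5] = out[6] = 0;          /* disp32 = 0 */
    return 7;
}

/* lea edx,[ebx+ecx*8]: a base register removes the displacement. */
size_t encode_lea_edx_ebx_ecx8(uint8_t out[3])
{
    out[0] = 0x8D;
    out[1] = modrm_byte(0, REG_EDX, RM_USE_SIB);
    out[2] = sib_byte(3, REG_ECX, REG_EBX);
    return 3;
}
```

With EBX holding zero, effective_address(0, ecx, 3, 0) gives the same value as the shift and lea/add sequences discussed above; which form is fastest on a given processor is a separate question that this sketch does not answer.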