lea behaviour

Vladimir_Dergachev · 05-24-2013

What are the side effects of the LEA instruction?

I am seeing a fairly large number of CPU cycles attributed to LEA, which is used to compute &ARRAY[base+i*stride+j]. Can the computed address cause a cache fault?

Thank you

Vladimir Dergachev

James_C_Intel2 · 05-28-2013

lea cannot cause a cache miss, because it does not perform a memory access; it only does arithmetic on registers. If you are seeing a lot of time attributed to lea operations, the likely cause is dependency stalls in the in-order pipe, that is, cases where the lea takes as input a register whose value was modified by the previous instruction.
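To illustrate the point about register arithmetic, here is a minimal C sketch of the address computation from the question. The helper names (element_address, element_address_manual) and the double element type are assumptions made for illustration, not code from the original poster. Taking the address of an element is pure arithmetic and does not read the element; a compiler may emit lea for it, though that is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the address of array[base + i*stride + j].
   Taking the address does not load the element, so no memory access
   (and therefore no cache miss) is implied by this expression itself. */
static const double *element_address(const double *array, size_t base,
                                     size_t i, size_t stride, size_t j)
{
    return &array[base + i * stride + j];
}

/* The same computation written as explicit integer arithmetic:
   start address plus element index scaled by the element size.
   Assumes a flat address space where pointers convert to uintptr_t
   without change of value. */
static uintptr_t element_address_manual(uintptr_t array_start, size_t base,
                                        size_t i, size_t stride, size_t j)
{
    return array_start + (base + i * stride + j) * sizeof(double);
}
```

For example, with base = 4, i = 3, stride = 8 and j = 2, both helpers resolve to element 30 of the array. A cache miss can only occur later, when the resulting pointer is actually dereferenced.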

Vladimir_Dergachev · 05-28-2013

Ahh, thank you! I was not sure how much of the mov functionality was retained. Is the latency information available anywhere? I'd like to understand where the stalls are coming from: in some cases lea had a lot of time attributed to it, even though neighbouring instructions with similar dependency levels had much lower time.

Could this be due to difficulty sharing the lea hardware between four threads?

TaylorIoTKidd · 08-26-2013

The best tool to identify such stalls is VTune (http://software.intel.com/en-us/intel-vtune-amplifier-xe) or a similar tool.

Regards
--
Taylor