The Intel documentation does not specify whether mov R8d, -1 will also zero the high dword of R8 or leave it intact.
Microsoft Visual C++ (2010) translates the C line a = myfunc(par1, par2, 3) ; into
mov RCX, par1 ; mov RDX, par2 ; mov R8b, 3 ; call myfunc ; mov qword ptr , RAX
IF the behaviour is implementation-dependent, some processors may crash....
IF the high dword is set to zero when moving to the low dword, why not say it clearly?
>>>IF the behaviour is implementation-dependent, some processors may crash>>>
I suppose that software which depends on the register state might crash. I think that when an int primitive type is passed, the high part of the 64-bit register will be filled with zeros.
I think that, when used with general-purpose registers, mov leaves the unused parts of the register intact. The confusion and implementation-specific behavior arise when mov is used with segment registers, which is typically done by the OS.
The code you cited may well be correct if the compiler knows that R8 is 0 before the call, so it can save some code size and only load 8 bit lower part of the register.
Thanks, Sergey.
I looked over the code: my definition was actually __int64 (double underscore).
The compiler switches are /O2 /Oi /D "_X64" /D "_MBCS" /FD /EHsc /MD /Gy /FAcs /Fa"x64\Release\\" /Fo"x64\Release\\" /Fd"x64\Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt
Anyway, my central point was the use of ' mov R8D, 3 ' instead of 'mov R8, 3 ' .
<<The code you cited may well be correct if the compiler knows that R8 is 0 before the call, so it can save some code size and only load 8 bit lower part of the register. >> (andysem)
To test that hypothesis, I called the function twice in succession. VecMulasm may change R8 (it is an assembler function compiled separately). The first call could very well have changed R8, and the compiler cannot know what an external function does. The included extract shows that the compiler assumes that mov R8D,3 is equivalent to mov R8,3 .
typedef double double16 ; // double on 16-bytes boundary
double VecMulasm( double16 *vec, double16 *AV, size_t Ns) ; // B= sum{vec.*AV}
VecMulasm( (double16 *)As, (double16 *)Bs, 3) ; // sum{As.*Bs}
000000013FB915FE lea rdx,[rsp+110h]
000000013FB91606 lea rcx,[rsp+40h]
000000013FB9160B mov r8d,3
000000013FB91611 call VecMulasm (13FB910A5h)
VecMulasm( (double16 *)As, (double16 *)Bs, 3) ; // B= sum{As.*Bs}
000000013FB91616 lea rdx,[rsp+110h]
000000013FB9161E lea rcx,[rsp+40h]
000000013FB91623 mov r8d,3
000000013FB91629 call VecMulasm (13FB910A5h)
R8 is a volatile register and is not preserved by the compiler, and any write to R8d zero-fills the upper half of the register. It is interesting that the compiler addressed the lower half of R8 (R8d) directly when the immediate value Ns = 3 could be represented as an 8-bit value.
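To make the rule concrete, here is a small C model of what a write to each sub-register does to the full 64-bit register in 64-bit mode. It is only a sketch of the architectural behavior as I understand it from the Intel manual's description of general-purpose registers in 64-bit mode (32-bit results are zero-extended, 8-bit and 16-bit writes leave the upper bits alone). The function names write32, write16, write8 and check_model are made up for illustration, and the code does not touch real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: each function returns the new 64-bit register value
   after a write of the given width. Nothing here executes real mov
   instructions; it only mirrors the documented rule. */

/* 32-bit destination (e.g. mov r8d, imm32): result is zero-extended,
   so the previous register contents are discarded entirely. */
static uint64_t write32(uint64_t old_reg, uint32_t value)
{
    (void)old_reg;
    return (uint64_t)value;
}

/* 16-bit destination (e.g. mov r8w, imm16): upper 48 bits are preserved. */
static uint64_t write16(uint64_t old_reg, uint16_t value)
{
    return (old_reg & ~(uint64_t)0xFFFF) | value;
}

/* 8-bit destination (e.g. mov r8b, imm8): upper 56 bits are preserved. */
static uint64_t write8(uint64_t old_reg, uint8_t value)
{
    return (old_reg & ~(uint64_t)0xFF) | value;
}

/* Usage example: start from a register full of leftover bits. */
static void check_model(void)
{
    const uint64_t garbage = 0xDEADBEEFCAFEF00DULL;

    /* mov r8d, 3 behaves like mov r8, 3 */
    assert(write32(garbage, 3) == 3);
    /* mov r8d, -1 gives 0x00000000FFFFFFFF, not all ones */
    assert(write32(garbage, (uint32_t)-1) == 0x00000000FFFFFFFFULL);
    /* mov r8w, 3 and mov r8b, 3 keep the old upper bits */
    assert(write16(garbage, 3) == 0xDEADBEEFCAFE0003ULL);
    assert(write8(garbage, 3) == 0xDEADBEEFCAFEF003ULL);
}
```

Under this model mov r8d,3 is safe for passing a 64-bit argument of value 3, while mov r8b,3 would only be safe if the upper 56 bits were already known to be zero.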
