64-bit bug in Visual C++? mov R8d,imm not completely defined

zalia64 · ‎05-07-2013

The Intel documentation does not specify whether mov R8d, -1 will also zero the high dword of R8 or leave it intact.

Microsoft Visual C++ (2010) translates the C line a = myfunc(par1, par2, 3) ; into

mov RCX, par1 ; mov RDX, par2 ; mov R8b, 3 ; call myfunc ; mov qword ptr , RAX

IF the behaviour is implementation-dependent, some processors may crash....

IF the high dword is set to zero when moving to the low dword, why not say it clearly?

zalia64 · ‎05-07-2013

Addendum: The function myfunc is defined as _int64 myfunc( int *par1 , int *par2 , size_t N ) ;

Bernard · ‎05-08-2013

>>>IF the behaviour is implementation-dependent, some processors may crash>>>

I suppose that software which depends on the register state might crash. I think that when an int primitive type is passed, the high part of the 64-bit register will be filled with zeros.

andysem · ‎05-08-2013

I think that, when used with general-purpose registers, mov leaves the unused parts of the registers intact. The confusion and implementation-specific behavior arise when mov is used with segment registers, which is typically done by the OS.

The code you cited may well be correct if the compiler knows that R8 is 0 before the call, so it can save some code size and only load 8 bit lower part of the register.

SergeyKostrov · ‎05-08-2013

>>...The function myfunc is defined as _int64 myfunc( int *par1 , int *par2 , size_t N ) ;

Please verify that the built-in declaration _int64 for a 64-bit integer type is supported in VS 2010. I know that __int64 has been fully supported for a long time, and I'm not sure that _int64 is a valid declaration. I'll verify it as well some time later.

zalia64 · ‎05-08-2013

R8 is a scratch register. Its contents may be changed by any function call, at will. So I would consider relying on its previous state a very dangerous approach.

Bernard · ‎05-08-2013

Isn't the upper half of R8 supposed to be filled with zeroes by the CPU when a 32-bit value is written to it?

SergeyKostrov · ‎05-08-2013

I completed a simple test.

>>...I know that __int64 has been fully supported for a long time, and I'm not sure that _int64 is a valid declaration...

This is simply to confirm that the Intel and Microsoft C++ compilers support the _int64 built-in type. What command line options of the Intel C++ compiler did you use?

zalia64 · ‎05-08-2013

Thanks, Sergey.

I looked over the code: my definition was actually __int64 (double underscore).

The compiler switches are /O2 /Oi /D "_X64" /D "_MBCS" /FD /EHsc /MD /Gy /FAcs /Fa"x64\Release\\" /Fo"x64\Release\\" /Fd"x64\Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt

Anyway, my central point was the use of 'mov R8D, 3' instead of 'mov R8, 3'.

<<The code you cited may well be correct if the compiler knows that R8 is 0 before the call, so it can save some code size and only load 8 bit lower part of the register. >> (andysem)

To test that hypothesis, I called the function twice in succession. VecMulasm may change R8 (it is an assembler function compiled separately). The first call could very well have changed R8, and the compiler cannot know what an external function does. The included extract shows that the compiler assumes that mov R8D,3 is equivalent to mov R8,3.

typedef double double16 ; // double on 16-bytes boundary

double VecMulasm( double16 *vec, double16 *AV, size_t Ns) ; // B= sum{vec.*AV}

        VecMulasm( (double16 *)As, (double16 *)Bs, 3) ;   // sum{As.*Bs}
000000013FB915FE lea         rdx,[rsp+110h]
000000013FB91606 lea         rcx,[rsp+40h]
000000013FB9160B mov         r8d,3
000000013FB91611 call        VecMulasm (13FB910A5h)
       VecMulasm( (double16 *)As, (double16 *)Bs, 3) ;   // B= sum{As.*Bs}
000000013FB91616 lea         rdx,[rsp+110h]
000000013FB9161E lea         rcx,[rsp+40h]
000000013FB91623 mov         r8d,3
000000013FB91629 call        VecMulasm (13FB910A5h)
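
For what it's worth, treating mov R8D,3 as equal to mov R8,3 only preserves the value for constants that fit in 32 bits. Here is a minimal C sketch of that check, assuming that a write to R8D clears the upper 32 bits of R8. The helper name same_after_zero_extend is made up for illustration and is not anything the compiler exposes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when zero-extending the low 32 bits of n
   reproduces n, i.e. when a 32-bit load would yield the same 64-bit value
   (ASSUMPTION: the 32-bit load clears the upper half of the register). */
static int same_after_zero_extend(uint64_t n)
{
    uint32_t low = (uint32_t)n;   /* keep only the low 32 bits */
    return (uint64_t)low == n;    /* zero-extend and compare */
}

/* A few sample constants, including the argument 3 used in the listing. */
static void check_examples(void)
{
    assert(same_after_zero_extend(3));                /* the Ns argument */
    assert(same_after_zero_extend(0xFFFFFFFFULL));    /* largest value that fits */
    assert(!same_after_zero_extend(0x100000000ULL));  /* needs a full 64-bit load */
    assert(!same_after_zero_extend((uint64_t)-1));    /* all 64 bits set */
}
```

So for Ns = 3 the two forms would give the same R8 under that assumption, and the shorter encoding costs nothing.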

Bernard · ‎05-09-2013

R8 is a volatile register and is not preserved by the compiler, and any write to R8d zero-fills the upper half of the register. It is interesting that the compiler addressed the lower half of R8 directly when the immediate value Ns = 3 can be represented as an 8-bit value.
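
As far as I know, the Intel 64 manuals cover this in the description of general-purpose registers in 64-bit mode: 32-bit results are zero-extended to 64 bits, while 8-bit and 16-bit writes leave the upper bits of the register unchanged. A small C model of those rules is sketched below. The write32, write16 and write8 helpers are hypothetical names that only imitate the behaviour in software; they do not execute a real mov.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of a 64-bit general-purpose register such as R8.
   ASSUMPTION: these helpers mirror the documented write rules; they are
   illustrative only and do not touch any real register. */

/* Write to the 32-bit sub-register (e.g. R8D): upper 32 bits become zero. */
static uint64_t write32(uint64_t reg, uint32_t value)
{
    (void)reg;               /* the old contents are discarded entirely */
    return (uint64_t)value;  /* zero-extended to 64 bits */
}

/* Write to the 16-bit sub-register (e.g. R8W): upper 48 bits are kept. */
static uint64_t write16(uint64_t reg, uint16_t value)
{
    return (reg & ~(uint64_t)0xFFFF) | value;
}

/* Write to the 8-bit sub-register (e.g. R8B): upper 56 bits are kept. */
static uint64_t write8(uint64_t reg, uint8_t value)
{
    return (reg & ~(uint64_t)0xFF) | value;
}

/* Start from a register full of leftover bits and apply each write. */
static void check_model(void)
{
    const uint64_t garbage = 0xDEADBEEFCAFEF00DULL;
    assert(write32(garbage, 3) == 3);
    assert(write16(garbage, 3) == 0xDEADBEEFCAFE0003ULL);
    assert(write8(garbage, 3) == 0xDEADBEEFCAFEF003ULL);
}
```

Under this model a byte-sized write into a register holding leftover data keeps that data in the upper 56 bits, so only the 32-bit and 64-bit forms would be safe ways to pass a size_t argument.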

Bernard · ‎05-09-2013

Sorry, I made a mistake: the argument size_t Ns is a 32-bit int, so the compiler addressed the lower part of R8.

SergeyKostrov · ‎05-09-2013

It gets confusing because the user didn't specify clearly whether a 64-bit platform is used. However, the title of the thread says: '64-bit bug...'

>>...I made a mistake: the argument size_t Ns is a 32-bit int, so the compiler addressed the lower part of R8...

size_t is 32 bits (4 bytes) on 32-bit platforms.

size_t is 64 bits (8 bytes) on 64-bit platforms.
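
A quick way to check which case applies to a given build is to print the width of size_t next to the pointer width. This is only a minimal sketch, and the helper names are made up for illustration. The two widths match on the common 32-bit and 64-bit Windows and Linux targets, although the C standard itself does not require that.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Width of size_t, in bits, for the target this file is compiled for. */
static unsigned bits_of_size_t(void)
{
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* Width of a data pointer, in bits, for the same target. */
static unsigned bits_of_pointer(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Print both widths so a 32-bit and a 64-bit build can be told apart. */
static void report_widths(void)
{
    /* The C standard only guarantees that size_t has at least 16 bits. */
    assert(bits_of_size_t() >= 16);
    printf("size_t is %u bits, void* is %u bits\n",
           bits_of_size_t(), bits_of_pointer());
}
```

Calling report_widths() from main in both an x86 and an x64 build of the same source shows the difference directly.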

Bernard · ‎05-09-2013

Yep, the platform is 64-bit, since 64-bit registers are being used.