problem with memcpy()

Armando_Lazaro_Alami · ‎03-13-2014

Hi.

I recently updated the Intel compiler to version 14.0.2.176 [IA-32] on Windows, and one of my projects started to generate runtime errors. After debugging, I isolated the problem to a memcpy "call". The relevant part of the code is:

float mat[3][3];
float mataux[3][3] ;

....

memcpy(mataux, mat,sizeof(mataux));

...

That routine has not changed for years. The problem vanishes if I compile with the previous version of the compiler, if I compile for debug, if I change the call to memmove(), or if I copy the matrix elements with a loop. It also vanishes if another compiler is used.
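For anyone trying to reproduce this, the three copy variants mentioned above can be put side by side. This is only a minimal sketch, not the original project code: the function names (copy_with_memcpy, copy_with_memmove, copy_with_loop, copies_agree) are hypothetical, and a small standalone test like this may well not trigger the same inline expansion as the real routine.

```c
#include <string.h>

enum { N = 3 };

/* 1. The form from the post; the compiler may expand this inline. */
void copy_with_memcpy(float dst[N][N], float src[N][N])
{
    memcpy(dst, src, sizeof(float[N][N]));
}

/* 2. Workaround reported to make the problem vanish. */
void copy_with_memmove(float dst[N][N], float src[N][N])
{
    memmove(dst, src, sizeof(float[N][N]));
}

/* 3. Element-by-element copy, also reported to work. */
void copy_with_loop(float dst[N][N], float src[N][N])
{
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            dst[i][j] = src[i][j];
}

/* Returns 1 when all three variants reproduce src byte for byte. */
int copies_agree(float src[N][N])
{
    float a[N][N], b[N][N], c[N][N];

    copy_with_memcpy(a, src);
    copy_with_memmove(b, src);
    copy_with_loop(c, src);

    return memcmp(a, src, sizeof a) == 0 &&
           memcmp(b, src, sizeof b) == 0 &&
           memcmp(c, src, sizeof c) == 0;
}
```

Calling copies_agree() on a filled 3x3 matrix from the same place as the failing routine, with the same compiler switches, would show whether the inlined copy crashes or produces different data from the other two.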

The compiler is not calling __intel_fast_memcpy here; it is generating an inlined version of memcpy(). See the ASM listing of the generated code:

;;;
;;; memcpy(mataux, mat,sizeof(mataux));

        movaps    xmm1, XMMWORD PTR [100+esp]                   ;1460.3
$LN4593:
        movaps    XMMWORD PTR [148+esp], xmm1                   ;1460.3
$LN4594:
        movaps    xmm5, XMMWORD PTR [116+esp]                   ;1460.3
$LN4595:
        movaps    XMMWORD PTR [164+esp], xmm5                   ;1460.3
$LN4596:
        mov       edi, DWORD PTR [132+esp]                      ;1460.3
$LN4597:
        mov       DWORD PTR [180+esp], edi                      ;1460.3
$LN4598:

In other parts of the code, where memcpy generates a call to __intel_fast_memcpy, there is no problem.
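The inlined sequence copies the first 32 bytes with movaps and the last 4 bytes with mov. movaps faults if its memory operand is not 16-byte aligned, so one thing that could be checked is the alignment of both matrices at run time. Nothing in this thread establishes that misalignment is the cause; this is just a sketch of a cheap check, and is_aligned16 and report_alignment are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: prints the address and alignment of one object. */
void report_alignment(const char *name, const void *p)
{
    printf("%s at %p: %s\n", name, (void *)p,
           is_aligned16(p) ? "16-byte aligned" : "NOT 16-byte aligned");
}
```

In the routine from the post this would be used as report_alignment("mat", mat) and report_alignment("mataux", mataux) just before the memcpy. Keep in mind that taking the addresses may itself change the code the optimizer generates.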

Additional information:

Compiler switches :

/c /O3 /Ob2 /Oi /Ot /Oy /Qip /GA /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "MYSDL_MEVIS" /D "___MNPS_PROJECT___" /D "_MBCS" /MT /Zp1 /GS- /fp:fast /J /GR- /Fo"Release/" /Fd"Release/vc90.pdb" /W3 /nologo /Zi /TC /Qopenmp /QxSSE3 /Qopt-matmul- /Quse-intel-optimized-headers /fp:double /Qstd=c99 /Qrestrict /Qdiag-disable:1786,2557

Running on Windows 7, 8, or Vista. All processors are Intel Core i7 of several generations.

Thanks.

Armando

Bernard · ‎03-14-2014

>>>I recently updated Intel compiler to version 14.0.2.176 [IA-32] on Windows. One of the projects started to generate runtime errors>>>

Can you describe those runtime errors in more detail? I presume that you are seeing access violation errors.

TimP · ‎03-14-2014

Is the problem associated with /fp:double or the setting of conflicting /fp: options?

Armando_Lazaro_Alami · ‎03-14-2014

The thread 'Win32 Thread' (0x3a20) has exited with code 0 (0x0).
Unhandled exception at 0x77d715de in MNPS.exe: 0xC0000005: Access violation reading location 0x00000000.
The program '[13668] MNPS.exe: Native' has exited with code 0 (0x0).

The debugger goes to module tidtable.c and stops at:

_CRTIMP PFLS_GETVALUE_FUNCTION __cdecl __set_flsgetvalue()
{
#ifdef _M_IX86
    PFLS_GETVALUE_FUNCTION flsGetValue = FLS_GETVALUE;   <= here
    if (!flsGetValue)
    {
        flsGetValue = _decode_pointer(gpFlsGetValue);
        TlsSetValue(__getvalueindex, flsGetValue);
    }
    return flsGetValue;
#else /* _M_IX86 */
    return NULL;
#endif /* _M_IX86 */
}

Armando_Lazaro_Alami · ‎03-14-2014

Tim Prince wrote:

Is the problem associated with /fp:double or the setting of conflicting /fp: options?

Is there anything wrong with using the FAST FP model together with DOUBLE FP expression evaluation?

Or did I miss something else?

Bernard · ‎03-14-2014

>>>PFLS_GETVALUE_FUNCTION flsGetValue = FLS_GETVALUE; <= here>>>

I suppose the flsGetValue pointer is probably initialized to null.

Bernard · ‎03-14-2014

Can you post the call stack?

By the way, it does not look like the problem is directly related to memcpy().

Armando_Lazaro_Alami · ‎03-14-2014

iliyapolak wrote:

>>>PFLS_GETVALUE_FUNCTION flsGetValue = FLS_GETVALUE; <= here>>>

I suppose the flsGetValue pointer is probably initialized to null.

But this is a system runtime file, I do not know what it does.

TimP · ‎03-14-2014

Armando Lazaro Alaminos Bouza wrote:

Quote:

Tim Prince wrote:
Is the problem associated with /fp:double or the setting of conflicting /fp: options?

Is there anything wrong with using the FAST FP model together with DOUBLE FP expression evaluation?

Or did I miss something else?

Although you might argue that the last /fp: option should cause earlier ones to be ignored (and not doing so would be a bug), I doubt this gets full testing. Your comment seems to indicate that you expect the compiler to choose certain features of each option. If you want /fp:double with some of the features of /fp:fast, in my opinion you should specify those features explicitly after setting /fp:double, e.g. /Qftz /Qprec-div-. As far as I know, you will not be able to get the effect of prec-div- prec-sqrt- in the same code expansion as double evaluation of float expressions.

In one context (float complex divide/sqrt/abs) the default /fp:fast=1 should do the same thing as /fp:double. In other respects, those options are contradictory.

It's not clear that this has an impact on your issue with in-line replacement of memcpy, or, if it does, whether that would be considered a compiler bug, but having too many options, not tested in combination, could confuse the issue.

Bernard · ‎03-14-2014

>>>But this is a system runtime file, I do not know what it does.>>>

That file is probably related somehow to the Fiber functions: http://msdn.microsoft.com/en-us/library/windows/desktop/ms683141(v=vs.85).aspx

problem with memcpy()