Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Bug with ICC?

Adam_G_2
Beginner
1,218 Views

When trying to build our open source project:
https://github.com/deeplearning4j/libnd4j

 

We are seeing a compilation error similar to:
https://gist.github.com/treo/d0b7610f9072f18449b600a0d585dad4

The full error report is here:

https://github.com/deeplearning4j/libnd4j/issues/280

 

This is with the new Knights Landing beta.

 

Thanks!

0 Kudos
8 Replies
Judith_W_Intel
Employee
1,218 Views

 

I tried using the zip file at the github site and am seeing what look like valid compilation errors. Is there something wrong with the zip file or with include/op_boilerplate.h?

I see similar errors with g++ (after I change -qopenmp to -fopenmp).

cd /home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blasbuild/cpu/blas && /home/nsl/jward4/d/workspaces/cfe/dev/build_objs/efi2linux_debug/bin/icpc   -D__CPUBLAS__=true -Dnd4j_EXPORTS -I/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/include -I/usr/local/include  -Wall -O3 -Wl,-rpath,RIGIN/ -O3 -ffast-math -ftree-vectorize -ftree-vectorizer-verbose=2 -fopt-info-vec -fopt-info-vec-missed -qopenmp -Wall -O3 -std=c++11 -fassociative-math -funsafe-math-optimizations -fPIC   -o CMakeFiles/nd4j.dir/cpu/NativeOps.cpp.o -c /home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/NativeOps.cpp
icpc: command line warning #10006: ignoring unknown option '-ftree-vectorizer-verbose=2'
icpc: command line warning #10006: ignoring unknown option '-fopt-info-vec'
icpc: command line warning #10006: ignoring unknown option '-fopt-info-vec-missed'
icpc: command line warning #10006: ignoring unknown option '-fassociative-math'
icpc: command line warning #10006: ignoring unknown option '-funsafe-math-optimizations'
icpc: warning #10193: -vec is default; use -x and -ax to configure vectorization
In file included from /home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/include/broadcasting.h(17),
                 from /home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/../NativeOpExcutioner.h(8),
                 from /home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/NativeOps.cpp(6):
/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/include/op_boilerplate.h(471): error: the #endif for this directive is missing
  #ifdef __clang__
   ^

/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/include/indexreduce.h(372): error: argument list for class template "simdOps::IndexMax" is missing
                      RETURNING_DISPATCH_BY_OPNUM(execScalar, PARAMS(x, xShapeInfo, extraParams), INDEX_REDUCE_OPS);
                                                                                                  ^
          detected during instantiation of "T NativeOpExcutioner<T>::execIndexReduceScalar(int, T *, int *, T *) [with T=double]" at line 34 of "/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/NativeOps.cpp"

/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/include/indexreduce.h(372): error: identifier "PARAMS" is undefined
                      RETURNING_DISPATCH_BY_OPNUM(execScalar, PARAMS(x, xShapeInfo, extraParams), INDEX_REDUCE_OPS);
                                                              ^
          detected during:
            instantiation of "T functions::indexreduce::IndexReduce<T>::execScalar(int, T *, int *, T *) [with T=double]" at line 37 of "/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/../NativeOpExcutioner.h"
            instantiation of "T NativeOpExcutioner<T>::execIndexReduceScalar(int, T *, int *, T *) [with T=double]" at line 34 of "/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-master/blas/cpu/NativeOps.cpp"'

...

 

Judy

0 Kudos
Adam_G_2
Beginner
1,218 Views

Did you make sure to check out the right branch?

 

Download the zip from here:

https://github.com/deeplearning4j/libnd4j/tree/icc_compilation

 

0 Kudos
Judith_W_Intel
Employee
1,218 Views

 

OK, thanks, I can reproduce it now. I'm seeing an internal error in our optimizer code, i.e.:

...

/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-icc_compilation/include/summarystatsreduce.h(613): warning #1011: missing return statement at end of non-void function "functions::summarystats::SummaryStatsReduce<T>::execScalar(int, bool, T *, int *, T *) [with T=float]"
        }
        ^
          detected during:
            instantiation of "T functions::summarystats::SummaryStatsReduce<T>::execScalar(int, bool, T *, int *, T *) [with T=float]" at line 401 of "/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-icc_compilation/blas/cpu/../NativeOpExcutioner.h"
            instantiation of "T NativeOpExcutioner<T>::execSummaryStatsScalar(int, T *, int *, T *, bool) [with T=float]" at line 1346 of "/home/nsl/jward4/d/BUGS/FORUM6/libnd4j-icc_compilation/blas/cpu/NativeOps.cpp"

": internal error: #20000_3471: epair && enode (shared/hpo/hpo_vector_avr.c, line 3471)

I'll try to reduce it to a small example and see if I can come up with a workaround and also file a bug report. Stay tuned.

Judy

0 Kudos
Judith_W_Intel
Employee
1,218 Views

 

This is a small reproducer. The compiler crashes if you compile it with icpc -c -O3 -fopenmp bug.cpp:

struct IndexValue {
  int value;
  unsigned int index;
};

IndexValue update(IndexValue o, IndexValue old)
{
   if (o.value > old.value)
      return o;
   return old;
}

int foo() {
  IndexValue curr;
#pragma omp simd
  for (int i = 0; i < 3; i++) {
     curr = update(curr, curr);
  }
  return  curr.index;
}

I have submitted an internal defect report (DPD200414099) for this problem.

Possible workarounds are:

(1) Disable high level optimization when compiling this file (i.e. compile with -O1 or lower)

(2) The bug seems to be triggered by the #pragma omp simd directives on line 446 and line 646 of indexreduce.h, in particular by the last assignment statement, which uses the variable startingIndex (and indexValue) on both the left-hand and right-hand sides:

#pragma omp simd
for (Nd4jIndex i = 0; i < length; i++) {
    IndexValue<T> curr;
    curr.value = x;
    curr.index = i;
    startingIndex = OpType::update(startingIndex, curr,


So another workaround is to disable the two omp simd directives in this header file.
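One way to switch the directive off for a single compiler, sketched under assumptions: the macro name MAYBE_OMP_SIMD and the function scale are hypothetical and do not come from libnd4j. __INTEL_COMPILER is the macro predefined by icc/icpc, and _Pragma is the standard C++11 operator form of #pragma. The loop here is a plain elementwise loop with no cross-iteration dependency, chosen only to show the mechanism.

```cpp
#include <cstddef>

// Hypothetical helper macro (not part of libnd4j): expands to nothing when
// compiling with the Intel compiler, and to "#pragma omp simd" elsewhere.
#if defined(__INTEL_COMPILER)
#  define MAYBE_OMP_SIMD
#else
#  define MAYBE_OMP_SIMD _Pragma("omp simd")
#endif

// Illustrative elementwise loop: each iteration touches only data[i],
// so there is no dependency between iterations.
void scale(float* data, std::size_t n, float factor) {
    MAYBE_OMP_SIMD
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

If OpenMP is not enabled on the command line, the pragma is ignored and the loop runs as ordinary serial code.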

Thanks for reporting this and sorry for the inconvenience.

Judy

0 Kudos
Serge_P_
Beginner
1,218 Views

While the compiler should never crash, the code causing the crash is also incorrect. It contains a cross-iteration dependency (a reduction-like update of "startingIndex") that is not declared as a reduction. So removing #pragma omp simd is not just a workaround; it makes the code correct.
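For illustration, here is a minimal sketch of how such an index-max update could be declared as a reduction instead of removing the directive. It assumes an OpenMP 4.0 or later compiler; the names index_max, update and index_of_max are made up for this example and are not the libnd4j code. The tie-breaking rule (prefer the smaller index) is also an assumption, added so that the result does not depend on the order in which partial results are merged.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

struct IndexValue {
    int value;
    unsigned int index;
};

// Combiner: keep the larger value; on ties keep the smaller index.
inline IndexValue update(IndexValue a, IndexValue b) {
    if (b.value > a.value) return b;
    if (b.value == a.value && b.index < a.index) return b;
    return a;
}

// User-defined reduction (hypothetical name "index_max"). Each private copy
// starts from an identity element that loses against every real element.
#pragma omp declare reduction(index_max : IndexValue : omp_out = update(omp_out, omp_in)) \
    initializer(omp_priv = IndexValue{std::numeric_limits<int>::min(), \
                                      std::numeric_limits<unsigned int>::max()})

IndexValue index_of_max(const std::vector<int>& x) {
    IndexValue best{std::numeric_limits<int>::min(),
                    std::numeric_limits<unsigned int>::max()};
    const std::size_t n = x.size();
    // The reduction clause tells the compiler about the cross-iteration
    // dependency on "best".
    #pragma omp simd reduction(index_max : best)
    for (std::size_t i = 0; i < n; ++i) {
        IndexValue curr{x[i], static_cast<unsigned int>(i)};
        best = update(best, curr);
    }
    return best;
}
```

Without OpenMP enabled, the pragmas are ignored and the function behaves as a plain serial loop.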

0 Kudos
SergeyKostrov
Valued Contributor II
1,218 Views
>>icpc: command line warning #10006: ignoring unknown option '-ftree-vectorizer-verbose=2'
>>icpc: command line warning #10006: ignoring unknown option '-fopt-info-vec'
>>icpc: command line warning #10006: ignoring unknown option '-fopt-info-vec-missed'
>>icpc: command line warning #10006: ignoring unknown option '-fassociative-math'
>>icpc: command line warning #10006: ignoring unknown option '-funsafe-math-optimizations'

The Intel C++ compiler does not support these GCC compiler command line options.
0 Kudos
Serge_P_
Beginner
1,218 Views

Optimization reports in the Intel Compiler are controlled by the -qopt-report set of options: -qopt-report=<level>, where the highest level is 5, and -qopt-report-phase=<phases_list>; for a vectorization report, use 'vec'. Reports are written to files named <obj_name>.optrpt; you can use -qopt-report-file=stderr to redirect the output to the terminal.

The last two options (-fassociative-math and -funsafe-math-optimizations) are controlled by the -fp-model switch in the Intel Compiler. The first is enabled by default, while the second is too compiler-specific: the Intel Compiler and gcc implement different sets of optimizations. So it is hard to tell how -fp-model fast=1 and -fp-model fast=2 map to -funsafe-math-optimizations.

0 Kudos
TimP
Honored Contributor III
1,218 Views

Associative-math optimizations are included in -fp-model fast[=1], as are many of the icc unsafe-math optimizations.
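As a small illustration of why reassociation is treated as a value-changing optimization, the sketch below evaluates the same three-term sum in two groupings. The function names are made up for this example, and the stated results assume strict IEEE double arithmetic (that is, the file itself is compiled without fast-math style options).

```cpp
// Same three operands, two different groupings.
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

// With a = 1e20, b = -1e20, c = 1.0:
//   sum_left  computes (1e20 - 1e20) + 1.0, which is 1.0
//   sum_right computes 1e20 + (-1e20 + 1.0); the 1.0 is absorbed by the
//   rounding of the inner sum, so the result is 0.0
```

A compiler that is allowed to reassociate may pick either grouping, so results can differ between optimization settings.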

fast=2 adds complex-limited-range and the domain shortcuts for divide and sqrt, which I don't think have counterparts in gcc. Those options are also available separately.

As mentioned above, specifying an omp directive without declaring reductions (or firstprivate where needed) is likely to produce wrong results without warning, although it shouldn't crash the compiler.

0 Kudos