
icpc 15.0.2 internal error

Hans-Christian_S_

Hello,

the compiler tells me to contact Intel ;-) I'm a beginner to MIC programming and having a hard time teaching the offload mechanism to accept C++ complex numbers.

[l_stadler_h@merlinx01 complex-test]$ icpc --version
icpc (ICC) 15.0.2 20150121
Copyright (C) 1985-2015 Intel Corporation.  All rights reserved.

[l_stadler_h@merlinx01 complex-test]$ icpc -std=c++11 complex-test.cc                                                                                                                        
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

compilation aborted for complex-test.cc (code 4)

[l_stadler_h@merlinx01 complex-test]$ cat complex-test.cc
#include <iostream>

#pragma offload_attribute(push, _Cilk_shared)
#include <complex>
#pragma offload_attribute(pop)

namespace {
  using std::complex;

  _Cilk_shared complex<float> r;

  _Cilk_shared void product (_Cilk_shared complex<float> &result, _Cilk_shared complex<float> *a, _Cilk_shared complex<float> *b, unsigned int sz)
  {
    using std::conj;

    complex<float> res = complex<float>(.0f, .0f);
    _Cilk_for (unsigned int i=0; i<sz; i++) {
      complex<float> c(b[i].real(), -b[i].imag()), d(a[i].real(), a[i].imag());
      res += d * c;
    }
    result = res;
  }
}

int main (int argc, char *argv[])
{
  using std::cout;

  constexpr unsigned int sz = 10000;
  _Cilk_shared complex<float> *a = (_Cilk_shared complex<float> *)_Offload_shared_aligned_malloc(sz * sizeof(*a), 64);
  _Cilk_for (unsigned int i=0; i<sz; i++)
    a[i] = complex<float>(1.f/float(i), 1.f/float(i));
  _Cilk_offload product(r, a, a, sz);
  _Offload_shared_aligned_free(a);
  float real = r.real();
  float imag = r.imag();
  complex<float> res(real, imag);
  cout << "Result = " << res << '\n';
  return 0;
}

-----------------------------

micinfo:

MicInfo Utility Log
Created Thu Feb 19 16:41:41 2015


        System Info
                HOST OS                 : Linux
                OS Version              : 3.10.0-123.20.1.el7.x86_64
                Driver Version          : 3.4.2-1
                MPSS Version            : 3.4.2

                Host Physical Memory    : 131753 MB

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4.2
                Device Serial Number     : ADKC43600092

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : Insufficient Privileges
                PCIe Speed               : Insufficient Privileges
                PCIe Max payload size    : Insufficient Privileges
                PCIe Max read req size   : Insufficient Privileges
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 950000 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 41 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV

Device No: 1, Device Name: mic1

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4.2
                Device Serial Number     : ADKC43600046

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : Insufficient Privileges
                PCIe Speed               : Insufficient Privileges
                PCIe Max payload size    : Insufficient Privileges
                PCIe Max read req size   : Insufficient Privileges
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 1001000 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 43 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV

 

[l_stadler_h@merlinx01 complex-test]$ lsb_release -d
Description:    CentOS Linux release 7.0.1406 (Core) (Maipo)
 

So, any hints on how to transfer C++ complex numbers, and arrays of them, easily from the host to the MIC and back?

 

Kevin_D_Intel
Employee

I reproduced the internal error and will forward the details to Development (see the internal tracking id below). I will keep this post updated on progress toward a fix and on any workaround.

Regarding hints on offloading complex data, was there a particular interest in the Virtual Shared model you tried using here?

(Internal tracking id: DPD200366703)

Hans-Christian_S_

Hello Kevin,

no special interest. I am just trying to find a way to transfer arrays of complex numbers to the MIC, do the calculation there, and transfer the resulting arrays of complex numbers back using SOME offload mechanism.

The only method that has worked for me so far is the simple #pragma offload, combined with telling the compiler to ignore errors about non-bitwise-copyable data (option -wd2568), which does not seem to be a great solution.
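
One way to avoid suppressing that diagnostic, offered as an assumption and not something confirmed in this thread, is to give the offload clauses plain float data. C++11 guarantees that an array of std::complex<float> can be read as interleaved (real, imaginary) floats through reinterpret_cast, so the kernel can work on float pointers that are trivially copyable. The helper names conj_dot_interleaved and conj_dot below are hypothetical, and the offload pragma itself is left out so that the sketch builds with any C++11 compiler.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical kernel: computes sum over i of a[i] * conj(b[i]) on
// interleaved (re, im) float data. It only touches plain floats, which is
// the kind of data an offload in/out clause can copy bitwise.
void conj_dot_interleaved(const float* a, const float* b, std::size_t n,
                          float* out_re, float* out_im)
{
    float re = 0.0f;
    float im = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float ar = a[2 * i], ai = a[2 * i + 1];
        const float br = b[2 * i], bi = b[2 * i + 1];
        // (ar + i*ai) * (br - i*bi)
        re += ar * br + ai * bi;
        im += ai * br - ar * bi;
    }
    *out_re = re;
    *out_im = im;
}

// Hypothetical host-side wrapper: views the complex arrays as float arrays
// and rebuilds a std::complex<float> from the two scalar results.
std::complex<float> conj_dot(const std::vector<std::complex<float>>& a,
                             const std::vector<std::complex<float>>& b)
{
    assert(a.size() == b.size());
    // Array-oriented access guaranteed since C++11: element i of a complex
    // array has its real part at float index 2*i and imaginary at 2*i + 1.
    const float* pa = reinterpret_cast<const float*>(a.data());
    const float* pb = reinterpret_cast<const float*>(b.data());
    float re = 0.0f;
    float im = 0.0f;
    // In an offload build, this call is where the float pointers (with a
    // length of 2 * a.size()) and the two scalars would be transferred.
    conj_dot_interleaved(pa, pb, a.size(), &re, &im);
    return std::complex<float>(re, im);
}
```

Calling conj_dot(a, b) on two equally sized vectors returns the same quantity the product() function in the first post is meant to compute, but only float arrays and two float scalars would have to cross the host/MIC boundary.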

I would prefer OpenMP 4.0, since I'm used to it, but so far I have not been able to use it successfully, because the compiler reports strange linking errors for the attached example:

[l_stadler_h@merlinx01 matrix-test]$ icpc -DUSE_OFFLOAD -std=c++11 -mkl -finline-functions -fno-exceptions -fno-alias -qopenmp -Ofast -debug all mat-test.cpp -o matrix-test-off
/tmp/icpcurmzUU.o: In function `L__ZN12_GLOBAL__N_18mat_testIdEEijj_76__par_loop1_2_0':
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:79: undefined reference to `double std::norm<double>(std::complex<double> const&)'
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:79: undefined reference to `std::complex<double> std::operator*<double>(std::complex<double> const&, double const&)'
/tmp/icpcurmzUU.o: In function `L__ZN12_GLOBAL__N_18mat_testIdEEijj_67__par_region0_2_1':
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:91: undefined reference to `char* std::copy<__gnu_cxx::__normal_iterator<char*, std::string>, char*>(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, char*)'
/tmp/icpcurmzUU.o: In function `L__ZN12_GLOBAL__N_18mat_testIfEEijj_76__par_loop1_2_6':
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:79: undefined reference to `float std::norm<float>(std::complex<float> const&)'
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:79: undefined reference to `std::sqrt(float)'
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:79: undefined reference to `std::complex<float> std::operator*<float>(std::complex<float> const&, float const&)'
/tmp/icpcurmzUU.o: In function `L__ZN12_GLOBAL__N_18mat_testIfEEijj_67__par_region0_2_7':
/nfs/home/l_stadler_h/matrix-test/mat-test.cc:91: undefined reference to `char* std::copy<__gnu_cxx::__normal_iterator<char*, std::string>, char*>(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, char*)'

And my attempt at using Cilk_offload failed even more miserably.

So the situation looks quite miserable at the moment, especially when taking into account that all the simple benchmarks I did so far with complex math compiled natively for mic perform worse than on the CPUs.

Maybe intrinsics will help, I hope.

Kevin_D_Intel
Employee

Sorry about all this misery. I also tried the data marshaling model with your earlier test case and ran into a non-bitwise-copyable error for the variable "r". I will look at your mat-test.cpp and consult with Development as necessary about any possible OpenMP 4.0 solution. Stay tuned.
