OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.
Announcements
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be asked in the FPGA Intel® High Level Design forum.

OPENCL MIC DEVICE HW EXCEPTION error

Fernando_G_3
Beginner

Hi, 

I am trying to run some code on an Intel Xeon Phi accelerator, without success. I repeatedly get this error: *** OPENCL MIC DEVICE HW EXCEPTION ***: Segmentation fault (Address not mapped to object [0x7fcafd726640]) (the reported address is different every time). I am fairly sure that the cl_variables have the correct bounds, i.e., I am not writing out of bounds, so the memory should not have been corrupted.

In addition, I have two Xeon Phi accelerators installed in the same host, but the OpenCL driver recognizes only one of them. Is there anything I have missed?

I should add that this very same code runs without trouble using the OpenCL driver from NVIDIA on NVIDIA GPUs.

Thanks in advance.

Yuri_K_Intel
Employee
Hi Fernando,

Is it possible to attach a minimal reproducer for the segfault issue? As for the two installed accelerators and one detected device, I will get back to you once I have clarified it.

Thanks,

Yuri
Chuck_De_Sylva
Beginner

Fernando,

Can you list the Device ID of your part? The PCI config space subsystem vendor ID and revision ID would also be useful, just so we know which part we are dealing with.

- Chuck

Yuri_K_Intel
Employee

An update regarding OpenCL support for multiple Xeon Phi devices: the current release (XE 2013 Beta) supports only one device. Multiple devices will be supported in the next release, later this year.

Thanks,

Yuri

Fernando_G_3
Beginner

Hi all, 

Here is the output of /opt/intel/mic/bin/micinfo:

MicInfo Utility Log

Created Tue Jan 15 11:44:02 2013


System Info
Host OS : Linux
OS Version : 2.6.32-279.19.1.el6.x86_64
Driver Version : 4346-16
MPSS Version : 2.1.4346-16
Host Physical Memory : 65886 MB
CPU Family : GenuineIntel Family 6 Model 45 Stepping 7
CPU Speed : 1200.000
Threads per Core : 2


Device No: 0, Device Name: Intel(R) Xeon Phi(TM) coprocessor

Version
Flash Version : 2.1.01.0375
UOS Version : 2.6.34.11-g65c0cd9
Device Serial Number : ADKC23000122

Board
Vendor ID : 8086
Device ID : 225d
SubSystem ID : 2500
MIC Processor Stepping ID : 1
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 4096 bytes
MIC Processor Model : 0x01
MIC Processor Model Ext : 0x00
MIC Processor Type : 0x00
MIC Processor Family : 0x0b
MIC Processor Family Ext : 0x00
MIC Silicon Stepping : B0
Board SKU : ES2-P1330
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS

Core
Total No of Active Cores: 57
Voltage : 1049000 uV
Frequency : 1100000 kHz

Thermal
Fan Speed Control : On
SMC Firmware Version : 1.6.3983
FSC Strap : 14 MHz
Fan RPM : 2700
Fan PWM : 50
Die Temp : 55 C

GDDR
GDDR Vendor : Hynix
GDDR Version : 0x3
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1000000 uV

Device No: 1, Device Name: Intel(R) Xeon Phi(TM) coprocessor

Version
Flash Version : 2.1.01.0375
UOS Version : 2.6.34.11-g65c0cd9
Device Serial Number : ADKC22900378

Board
Vendor ID : 8086
Device ID : 225d
SubSystem ID : 2500
MIC Processor Stepping ID : 1
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 4096 bytes
MIC Processor Model : 0x01
MIC Processor Model Ext : 0x00
MIC Processor Type : 0x00
MIC Processor Family : 0x0b
MIC Processor Family Ext : 0x00
MIC Silicon Stepping : B0
Board SKU : ES2-P1330
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS

Core
Total No of Active Cores: 57
Voltage : 1042000 uV
Frequency : 1100000 kHz

Thermal
Fan Speed Control : On
SMC Firmware Version : 1.6.3983
FSC Strap : 14 MHz
Fan RPM : 2700
Fan PWM : 50
Die Temp : 53 C

GDDR
GDDR Vendor : Hynix
GDDR Version : 0x3
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1000000 uV

As for the segmentation fault, please find attached a minimal example that reproduces it. calGrad.cpp can be compiled in two ways (I assume you are on Linux):

c++ -o gradTest calGrad.cpp -lOpenCL

With this compilation, the calGrad.cl file, which contains the crashing kernel, is used.

c++ -o gradTest -D__CALGRAD2__ calGrad.cpp -lOpenCL

With this compilation, the calGrad2.cl file, which contains the kernel that does not crash, is used.

The difference between them is that in the former the cl_ngroup variable is updated inside the kernel by work-item get_global_size(0)-1, while in the latter the cl_ngroup variable is written outside the kernel, in the cpp file. The first one crashes with one of the messages quoted in my first post.

Thanks for your help.

Yuri_K_Intel
Employee
Fernando,

Thank you for the code. I was able to reproduce the behaviour. I will get back to you after an initial investigation.

Thanks,

Yuri
Yuri_K_Intel
Employee
Hi Fernando,

Sorry it took so long to answer. It looks like this is an issue in the kernel. When the last work-item modifies cl_ngroup[0] at line 63, it doesn't necessarily mean that all other work-items have finished executing at that point. The work-items execute in parallel, so this change might affect other work-items that have only just started executing and read this value at line 14.

Thanks,

Yuri
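To illustrate the ordering hazard described above, here is a minimal host-side C++ sketch. It is not the attached calGrad.cl kernel (which is not reproduced in this thread) and it does not use the OpenCL API; it only simulates work-items running in some arbitrary order. The function names `run_in_place` and `run_split`, and the parameters `order`, `initial` and `updated`, are hypothetical and introduced purely for illustration. The first function models the pattern in which every work-item reads `ngroup[0]` on entry while the work-item with id global_size-1 overwrites it; the second models one possible alternative in which reads come from an unmodified input and the new value goes to a separate output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the crashing variant: every simulated work-item reads
// ngroup[0] on entry, and the work-item with id global_size-1 overwrites it
// in place. `order` is the order in which the work-items happen to run.
// Returns the value of ngroup[0] observed by each work-item, indexed by id.
std::vector<int> run_in_place(const std::vector<std::size_t>& order,
                              int initial, int updated) {
    assert(!order.empty());
    std::vector<int> ngroup{initial};      // one shared element, like cl_ngroup[0]
    std::vector<int> seen(order.size());
    for (std::size_t gid : order) {
        seen[gid] = ngroup[0];             // read on entry
        if (gid == order.size() - 1) {
            ngroup[0] = updated;           // last work-item writes in place
        }
    }
    return seen;
}

// Hypothetical model of a safer variant: all reads come from a value that is
// never modified during the run, and the new value is written to a separate
// output that no work-item reads.
std::vector<int> run_split(const std::vector<std::size_t>& order,
                           int initial, int updated, int& ngroup_out) {
    assert(!order.empty());
    const int ngroup_in = initial;         // read-only during the run
    std::vector<int> seen(order.size());
    for (std::size_t gid : order) {
        seen[gid] = ngroup_in;
        if (gid == order.size() - 1) {
            ngroup_out = updated;          // written elsewhere
        }
    }
    return seen;
}
```

With four simulated work-items run in the order 3, 0, 1, 2, `run_in_place` lets work-items 0 to 2 observe the updated value, whereas `run_split` gives every work-item the initial value regardless of order. If the value read on entry feeds a loop bound or an index (an assumption, since the kernel source is not shown here), a changed value could plausibly lead to an out-of-range access on a device that schedules work-items differently.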
Fernando_G_3
Beginner

Hi Yuri,

Thanks for the answer. I'll keep that behavior in mind when running kernels on Intel accelerators.
