
Xeon Phi: HW Exception: Segmentation Fault in all examples

Michael_H_2
Beginner

Hey,

I just updated my Phi to the latest MPSS version (3.2.1) and also the OpenCL Runtime (14.1) as well as the SDK (2014 4.4.0).

Since then, every OCL example and application crashes when I run it on the Phi; the CPU works fine as always.

I tried rebooting and everything else I could think of, but I cannot figure out what is going wrong.

The MonteCarlo Example gives me this output:

Build program options: "-D__DO_FLOAT__ -cl-denorms-are-zero -cl-fast-relaxed-math -cl-single-precision-constant -DNSAMP=262144"
*** OPENCL MIC DEVICE HW EXCEPTION ***: Segmentation fault (Address not mapped to object [0xfffffffffffffff8])

BACKTRACE:
/tmp/coi_procs/1/4991/mic_server[0x407132]
/lib64/libpthread.so.0(+0xf4d0)[0x7f588b47d4d0]
/tmp/coi_procs/1/4991/mic_server[0x41e8dd]
/tmp/coi_procs/1/4991/mic_server[0x4223b8]
/tmp/coi_procs/1/4991/mic_server[0x41fced]
/tmp/coi_procs/1/4991/mic_server[0x41e59d]
/tmp/coi_procs/1/4991/mic_server[0x41672d]
/tmp/coi_procs/1/4991/mic_server(copy_program_to_device+0x21)[0x4165f1]
/usr/lib64/libcoi_device.so.0(+0x31ef0)[0x7f588bd2bef0]
/usr/lib64/libcoi_device.so.0(+0x322c3)[0x7f588bd2c2c3]
/usr/lib64/libcoi_device.so.0(+0x326d9)[0x7f588bd2c6d9]
/lib64/libpthread.so.0(+0x7bce)[0x7f588b475bce]
/lib64/libc.so.6(clone+0x6d)[0x7f588a89d1cd]

******************

terminate called after throwing an instance of 'std::runtime_error'
  what():  Segmentation fault
Segmentation fault

System status for the Phi seems ok:

MicCheck 3.2.1-r1
Copyright 2013 Intel Corporation All Rights Reserved

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
  Test 5 (mic0): Check ras daemon is available in device ... pass
  Test 6 (mic0): Check running flash version is correct ... pass

Status: OK

MicInfo Utility Log
Copyright 2011-2013 Intel Corporation All Rights Reserved.

Created Fri May 23 16:13:38 2014


	System Info
		HOST OS			: Linux
		OS Version		: 3.0.13-0.27-default
		Driver Version		: 3.2.1-1
		MPSS Version		: 3.2.1
		Host Physical Memory	: 264519 MB

Device No: 0, Device Name: mic0

	Version
		Flash Version 		 : 2.1.02.0390
		SMC Firmware Version	 : 1.16.5078
		SMC Boot Loader Version	 : 1.7.4172
		uOS Version 		 : 2.6.38.8+mpss3.2.1
		Device Serial Number 	 : ADKC25104125

	Board
		Vendor ID 		 : 0x8086
		Device ID 		 : 0x2250
		Subsystem ID 		 : 0x2500
		Coprocessor Stepping ID	 : 3
		PCIe Width 		 : x16
		PCIe Speed 		 : 5 GT/s
		PCIe Max payload size	 : 256 bytes
		PCIe Max read req size	 : 512 bytes
		Coprocessor Model	 : 0x01
		Coprocessor Model Ext	 : 0x00
		Coprocessor Type	 : 0x00
		Coprocessor Family	 : 0x0b
		Coprocessor Family Ext	 : 0x00
		Coprocessor Stepping 	 : B1
		Board SKU 		 : B1PRQ-5110P/5120D
		ECC Mode 		 : Enabled
		SMC HW Revision 	 : Product 225W Passive CS

	Cores
		Total No of Active Cores : 60
		Voltage 		 : 1032000 uV
		Frequency		 : 1052631 kHz

	Thermal
		Fan Speed Control 	 : N/A
		Fan RPM 		 : N/A
		Fan PWM 		 : N/A
		Die Temp		 : 45 C

	GDDR
		GDDR Vendor		 : Elpida
		GDDR Version		 : 0x1
		GDDR Density		 : 2048 Mb
		GDDR Size		 : 7936 MB
		GDDR Technology		 : GDDR5
		GDDR Speed		 : 5.000000 GT/s
		GDDR Frequency		 : 2500000 kHz
		GDDR Voltage		 : 1501000 uV

 

Any advice would be greatly appreciated!

Thanks, Michael

Alexey_B_Intel1
Employee

Hi Michael,

This error message only says that your application has crashed. A crash like this can have many causes, and it is hard to suggest anything without looking at the code.

Can you share the source code?

Thanks, Alexey

Michael_H_2
Beginner

Hi,

As I said, it's the MonteCarlo Example from the SDK:

https://software.intel.com/en-us/vcsource/samples/monte-carlo/

But it also happens with every other OCL application I tried. All examples work fine on the CPU.

Best, Michael

Tim_Mattox
Beginner

The release notes for the OpenCL Runtime and the OpenCL SDK have CONFLICTING version requirements for the MPSS, as Michael H. empirically discovered.

In the SDK release notes:

"NOTE: For Intel Xeon Phi coprocessor device support, you must install the 3.2.1 version of Intel MPSS"

In the Runtime release notes:

"NOTE: Using OpenCL Runtime 14.1 with MPSS 3.2.1 is not recommended, as this combination introduces stability issues."

This needs to be resolved for people to use Intel's OpenCL on the Phi with any hope of success. I don't know what to ask my sysadmin to do in this case.

-- Tim

Michael_H_2
Beginner

Is there any solution or workaround from Intel in development? Something like downgrading, perhaps? (Although I can't find the old runtimes and SDK anymore...)

Ali_S_2
Beginner

I'm experiencing the same problem using OpenCL Runtime 14.1 and MPSS 3.2.1.

Does the above release note mean that with the currently available Intel API it's NOT possible to run OpenCL code on a Xeon Phi?

Dmitry_K_Intel
Employee

Please do not use MPSS 3.2.1 for OpenCL; the two are known not to play nicely together. Please roll back to MPSS 3.2 or move forward to MPSS 3.2.3, which fixed this inconsistency.
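
For anyone who wants to check the installed MPSS release before pairing it with the OpenCL runtime, here is a minimal sketch. It assumes the "MPSS Version" line looks like the one in the MicInfo log posted at the top of this thread; the names `KNOWN_BAD_MPSS`, `mpss_version` and `is_flagged` are made up for illustration and are not part of any Intel tool. The flagged set only reflects the 3.2.1 warning discussed in this thread.

```python
import re

# Assumption: the only release flagged here is the one this thread warns about.
KNOWN_BAD_MPSS = {"3.2.1"}

def mpss_version(micinfo_text):
    """Return the MPSS version found in micinfo-style text, or None if absent."""
    match = re.search(r"MPSS Version\s*:\s*(\S+)", micinfo_text)
    return match.group(1) if match else None

def is_flagged(micinfo_text):
    """True if the reported MPSS version is in the flagged set."""
    return mpss_version(micinfo_text) in KNOWN_BAD_MPSS

# Two lines copied from the MicInfo log in the first post.
sample = "\t\tDriver Version\t\t: 3.2.1-1\n\t\tMPSS Version\t\t: 3.2.1\n"
print(mpss_version(sample), is_flagged(sample))
```

To use it on a real system, save the MicInfo output to a file and pass the file contents to `is_flagged`.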

 

Michael_H_2
Beginner

I just updated to 3.2.3 and reinstalled the OCL runtime and SDK, and I'm still getting the exact same error. Frustrating.

I will now downgrade to 3.2.

Uri_L_Intel
Employee

Hello,

We’ve found a critical issue in the latest release package of the OpenCL runtime for Xeon Phi devices.

We’re currently working on a fixed package, which will be released soon.

We’re truly sorry for the inconvenience and will do our best to upload the fixed package as soon as possible.

Thanks, everyone, for the great and important feedback,

Uri

ABoxe
Beginner

This leads me to wonder what kind of QA is being done on this SDK.

Kind Regards,

Aaron

Raghupathi_M_Intel

Hi All,

The issue has been fixed and the fixed package can be downloaded. Sorry for any inconvenience.

Thanks,
Raghu

Michael_H_2
Beginner

Thank you so much, it works now!
