Xeon Phi in terms of the OpenCL Platform Model

javiroman · 02-08-2013

Hello!

I'm starting to study OpenCL on the Xeon Phi. I'm trying to map the Xeon Phi onto the OpenCL platform model.

OpenCL splits the platform into Devices, Compute Units and Processing Elements (PEs).

For NVIDIA GPU devices there is a lot of documentation on this topic. For example, taking a Tesla M2090 GPU:

1. OpenCL reports MAX_COMPUTE_UNITS = 16

2. In NVIDIA terms, a Streaming Multiprocessor (SM) is equivalent to an OpenCL Compute Unit.

3. A Streaming Multiprocessor (SM) in the Fermi architecture has 32 Streaming Processors (SPs), and an SP is equivalent to an OpenCL Processing Element (PE). CUDA SP = OpenCL PE.

4. A CUDA SP executes a CUDA thread, and a CUDA thread corresponds to an OpenCL work-item (one instance of the kernel).

So the Tesla M2090 in terms of OpenCL:

Compute Units = 16

Processing Elements = 16 SMs x 32 SPs = 512 (in NVIDIA terms: 512 CUDA cores).
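
The arithmetic behind this mapping can be written as a small C sketch. The helper names `total_processing_elements` and `print_platform_model` are made up for illustration and are not part of OpenCL or CUDA; the only inputs are the compute-unit count and the number of PEs per compute unit quoted above.

```c
#include <stdio.h>

/* Hypothetical helper (not an OpenCL API): total PEs of a device,
 * assuming every compute unit has the same number of PEs. */
unsigned int total_processing_elements(unsigned int compute_units,
                                       unsigned int pes_per_compute_unit)
{
    return compute_units * pes_per_compute_unit;
}

/* Hypothetical helper: prints the platform-model breakdown for one device. */
void print_platform_model(const char *device_name,
                          unsigned int compute_units,
                          unsigned int pes_per_compute_unit)
{
    printf("%s: %u compute units x %u PEs = %u processing elements\n",
           device_name, compute_units, pes_per_compute_unit,
           total_processing_elements(compute_units, pes_per_compute_unit));
}
```

Calling `print_platform_model("Tesla M2090", 16, 32)` from a `main` function reproduces the 16 x 32 = 512 figure.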

The question is: is there any documentation about these OpenCL platform definitions for the Intel Xeon Phi?

The only numbers I have, for a 5110P, are:

MAX_COMPUTE_UNITS = 236

And the marketing information: 60 Intel cores within the Xeon Phi.

Many thanks!

Loc_N_Intel · 02-08-2013

Hello,

The document "OpenCL* Optimization Guide" in the Intel SDK for OpenCL Applications 2012 ( http://software.intel.com/en-us/vcsource/tools/opencl-sdk ) explains how to retrieve CL_DEVICE_MAX_COMPUTE_UNITS (using clGetDeviceInfo). For the current Intel(R) Xeon Phi(TM) Coprocessor, this value is 240 because it has 60 cores and each core has 4 hardware threads.

Thank you.