
How to put two Xeon Phis to work?

Aaron_S_
Beginner
363 Views

I recently acquired a system with dual Xeon Phi cards. How can I best put both cards to work? At the moment, I can only afford the C++ Composer software -- so MPI isn't an option.

 

3 Replies
Sunny_G_Intel
Employee

Hi Aaron,

Please refer to the following Intel Xeon Phi system administration guide to get started. 

https://software.intel.com/sites/default/files/managed/bd/53/System_Administration_Guide_Intel%28R%29XeonPhi%28TM%29Coprocessor.pdf

The Intel MPSS user guide is a great reference too.

So that we can assist you better, could you please explain what you intend to use the coprocessors for?

Thanks,

Andrey_Vladimirov
New Contributor III

As Sunny said, the best approach depends on the purpose for which you intend to use the coprocessors, specifically, the pattern of parallelism and communication.

  • If you can run two independent Linux processes to do your workload (batch processing scenario), you can use the native programming model and start jobs on coprocessors in parallel using "ssh" like you would on two independent general-purpose machines.
  • If you have one process that has multiple independent work-items to compute ("embarrassingly parallel" code, no communication), you can use the offload model to send independent work-items to different coprocessors. Start two threads on the host and map each thread to the respective coprocessor:
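#include <offload.h>  // _Offload_number_of_devices()
#include <omp.h>      // omp_get_thread_num()

// One host thread per coprocessor. MyFunction must be compiled for the
// coprocessor, e.g. declared with __attribute__((target(mic))).
const int nDevices = _Offload_number_of_devices();
#pragma omp parallel num_threads(nDevices)
{
  const int iDevice = omp_get_thread_num();  // this thread's coprocessor
#pragma offload target(mic: iDevice)
  {
    MyFunction(/*...*/);
  }
}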
  • In the same way you can distribute a set of work-items between coprocessors:
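const int nDevices = _Offload_number_of_devices();
#pragma omp parallel num_threads(nDevices)
{
  const int iDevice = omp_get_thread_num();  // this thread's coprocessor
  // schedule(dynamic, 1): whichever thread is free takes the next work-item,
  // so a faster coprocessor ends up processing more items.
#pragma omp for schedule(dynamic, 1)
  for (int i = 0; i < nWorkItems; i++) {
#pragma offload target(mic: iDevice)
    {
      MyFunction(i);
    }
  }
}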
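
If you want to try the work-distribution pattern without a coprocessor or the Intel compiler, here is a rough host-only sketch. It is not the offload code itself: the offload pragma is left out, MyFunction is a made-up stand-in that just squares its argument, and DistributeWork, Distribution and ThreadIndex are names I invented for illustration. Built without OpenMP support, the pragmas are ignored and everything runs on one thread, which still gives the same results.

```cpp
#include <cassert>
#include <numeric>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical work item; stands in for the real MyFunction(i).
static int MyFunction(int i) { return i * i; }

// Index of the calling thread, or 0 when built without OpenMP.
static int ThreadIndex() {
#ifdef _OPENMP
  return omp_get_thread_num();
#else
  return 0;
#endif
}

struct Distribution {
  std::vector<int> results;         // results[i] holds MyFunction(i)
  std::vector<int> itemsPerDevice;  // work-items claimed by each host thread
};

// Hands out nWorkItems one at a time to up to nDevices host threads.
static Distribution DistributeWork(int nDevices, int nWorkItems) {
  Distribution d;
  if (nDevices < 1 || nWorkItems < 0) return d;  // nothing to run on
  d.results.assign(nWorkItems, 0);
  d.itemsPerDevice.assign(nDevices, 0);
#pragma omp parallel num_threads(nDevices)
  {
    const int iDevice = ThreadIndex();
#pragma omp for schedule(dynamic, 1)
    for (int i = 0; i < nWorkItems; i++) {
      // On a real system the offload pragma for device iDevice would
      // wrap this call; here it simply runs on the host.
      d.results[i] = MyFunction(i);
      d.itemsPerDevice[iDevice]++;  // each thread touches only its own slot
    }
  }
  return d;
}
```

Calling DistributeWork(2, 10) fills results with the ten outputs, and the entries of itemsPerDevice always add up to 10. How the items split between the two threads varies from run to run, which is the point of the dynamic schedule.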

If you need communication between coprocessors, things get more complex. You can communicate between coprocessors indirectly by passing messages through the host, but this requires synchronization at every communication step. This is where MPI would be a good tool.
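
To make the host-relayed pattern concrete, here is a small serial sketch of the control flow only. Everything in it is hypothetical: DeviceState, DeviceStep and RunExchange are invented names, the averaging "kernel" is a placeholder, and on real hardware each DeviceStep call would be an offload to one coprocessor with the data moved by the offload clauses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-device state: a local value plus the last message
// received from the other device.
struct DeviceState {
  double local;
  double fromPeer;
};

// Hypothetical device kernel: combines the local value with the peer's
// message and returns the new value as the outgoing message.
static double DeviceStep(DeviceState& s) {
  s.local = 0.5 * (s.local + s.fromPeer);
  return s.local;
}

// Host loop: compute on every device, wait for all of them, then relay.
static void RunExchange(std::vector<DeviceState>& dev, int nSteps) {
  const std::size_t n = dev.size();
  for (int step = 0; step < nSteps; ++step) {
    // Phase 1: each device computes; the host collects outgoing messages.
    std::vector<double> outbox(n);
    for (std::size_t i = 0; i < n; ++i) outbox[i] = DeviceStep(dev[i]);

    // Synchronization point: no message is delivered until every device
    // has finished phase 1.

    // Phase 2: the host delivers each device the message of its neighbor.
    for (std::size_t i = 0; i < n; ++i) dev[i].fromPeer = outbox[(i + 1) % n];
  }
}
```

With two devices starting at local values 0 and 4, and each initially holding the other's value in fromPeer, both reach 2 after one step. The cost to notice is that every step makes both devices wait on the host round trip, which is the synchronization overhead mentioned above.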

We have a comprehensive free Web-based training coming soon where you can learn more: http://colfaxresearch.com/how-series/

Loc_N_Intel
Employee

Hi Aaron,

In addition to the above techniques, you can also use the Intel(R) hStreams library. It offers an abstraction for controlling the compute capabilities of a heterogeneous system. The Intel(R) hStreams library can be used on Intel(R) Xeon(R) processors and Intel(R) Xeon Phi(TM) coprocessors.

You can download hStreams binaries from:

  • https://01.org/sites/default/files/downloads/hetero-streams-library/hstreams-1.0.0.tar (Linux)
  • https://01.org/sites/default/files/downloads/hetero-streams-library/hstreams-1.0.0.zip (Windows)

 

 
