Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

The Basics of Parallel Architecture: What Should I Know?

Gerald_M_Intel
Employee
So my question is pretty simple and straightforward. I am a non-technical guy. When it comes to Parallel Programming I wouldn't even know where to start, and to be perfectly honest, when it comes to good old-fashioned programming I wouldn't know where to begin.

What would I have to know to begin wrapping my head around the idea of Parallel Programming? Let's see what shakes loose.

Cheers,
Jerry
TimP
Honored Contributor III
Parallel programming has become important as a way to use multiple cores or CPUs, or cluster computing, effectively. Many people have lost sight of considerations which should come ahead of that, such as vectorization, where it is applicable. Once you decide to get involved in parallel program development and want to choose a programming model to start out with, you get into an area where the fashions have swung back and forth in just the last 5 years. Some of the introductory course material is intended to give an overview of the options, so it already gets beyond the scope of this forum, unless you give more detail to help people make specific suggestions.
OpenMP has regained momentum as a basis for shared-memory parallelism; it's the basis for the Intel C, C++, and Fortran compiler auto-parallel options, Parallel Studio, and MKL threading. MPI has also gained momentum; it's been running well ahead of general server market growth. For those who stay within C++, TBB has proven to be useful. You'll find a lot of both promotional and solid reference material on all of those in these forum areas.
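As a concrete starting point, here is a minimal sketch of the kind of loop OpenMP shared-memory parallelism targets (a toy example of my own, not from any course material); the loop body is also the sort of code a compiler can vectorize:

    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N];   /* static: too big for the stack */
        double sum = 0.0;

        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* Each thread gets a chunk of the iteration space; the
           reduction clause merges the per-thread partial sums. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i] * b[i];

        printf("dot product = %f\n", sum);
        return 0;
    }

Build with your compiler's OpenMP option (e.g. gcc -fopenmp) and the iterations are divided among the available cores; build without it and the pragma is ignored, so the same code runs serially.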
fraggy
Beginner
Quoting - Gerald_M_Intel
What would I have to know to begin wrapping my head around the idea of Parallel Programming? Let's see what shakes loose.
A couple of things you have to know:
- the difference between multithreaded and parallel: multithreading is used to do parallel programming, but a multithreaded program is not always a parallel program. Parallel programming is only now reaching the client side, with multicore processor architectures, but it has existed for many years...
- multicore architecture is a semiconductor-driven evolution (since multiplying GHz is over, Intel has to multiply cores to stay competitive)
- multicore means machines with 2 to 12 cores; manycore stands for far more than 12 cores (IMHO)
- from now on, every new piece of software should be built with a strong parallel point of view; this is the reason why developers have to improve their multithreading skills.
- to parallelize your existing application (the classic way), find the parallelizable parts and use tools like TBB, Cilk++, or OpenMP to get the job done.
- to parallelize your existing application (the hardcore way), rewrite the application core to expose more parallelizable parts, then use tools like TBB, Cilk++, or OpenMP to get the job done.
- lots of existing applications can be parallelized the classic way; don't worry about them.
- some applications have only very small parallelizable parts (video games :)); without refactoring you will never get linear performance scaling on manycore.
- TBB, Cilk++, OpenMP (and all the others) are powerful tools, but they will not help you redesign your application the right way. They are like the best paintbrush ever: if you're not Caravaggio, they'll be very little help in getting the job done.
- about refactoring strategy: you should always try to use data parallelism in your software; it's the only way to get linear performance scaling on manycore processors (for video games: http://www.gamasutra.com/view/feature/1830/multithreaded_game_engine_.php). Never use a functional parallel model if you want linear performance scaling. For instance, in video games the basic functions (3D, AI, physics...) are computed in parallel (in different threads), which is very bad for performance scaling on future manycore; see the sketch below.
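To make the distinction concrete, here is a minimal OpenMP sketch (update_physics, update_ai, and update_entity are hypothetical stand-ins for real subsystems):

    #define N_ENTITIES 100000

    /* hypothetical per-entity update: physics, AI, ... for one object */
    void update_entity(int i) { (void)i; /* ... */ }

    /* hypothetical whole-subsystem updates */
    void update_physics(void) { /* ... */ }
    void update_ai(void)      { /* ... */ }

    /* Functional (task) parallelism: one thread per subsystem.
       Scaling is capped at the number of subsystems (here 2),
       no matter how many cores the machine has. */
    void frame_functional(void) {
        #pragma omp parallel sections
        {
            #pragma omp section
            update_physics();
            #pragma omp section
            update_ai();
        }
    }

    /* Data parallelism: split the entities across all available
       cores; adding cores keeps adding throughput. */
    void frame_data_parallel(void) {
        #pragma omp parallel for
        for (int i = 0; i < N_ENTITIES; i++)
            update_entity(i);
    }

    int main(void) {
        frame_functional();
        frame_data_parallel();
        return 0;
    }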

Vincent, hope it helps
srimks
New Contributor II
Quoting - Gerald_M_Intel
What would I have to know to begin wrapping my head around the idea of Parallel Programming? Let's see what shakes loose.

To pin down how to learn Parallel Programming, a few things are needed -

(a) First, learn to choose the architecture on which you wish to execute the parallel program - SMP, distributed MP, cluster, NUMA, or distributed SMP. Identifying the architecture will shape how you write and execute the parallel program.

(b) If you are on SMP, try learning the behaviour of implicit & explicit vectorization. Also try learning auto-parallelization and get to know the behaviour of threads (POSIX, OpenMP, etc.).

(c) If the need is distributed MP, try learning MPI programming with the Send & Receive APIs to start with (a minimal sketch follows below).

(d) If the need is Cluster computing, try learning both OpenMP & MPI.
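For (c), a minimal sketch of the Send & Receive pattern (assuming an MPI implementation with its mpicc/mpirun wrappers is installed):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            /* blocking send of one int to rank 1, message tag 0 */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* blocking receive of one int from rank 0, tag 0 */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }

Run it with at least two processes (e.g. mpirun -np 2 ./a.out); the ranks are separate processes that share nothing and communicate only through messages.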

Think of achieving maximum scalability for your parallel programming by using good scientific libraries (MKL, ACML, ATLAS, GotoBLAS, etc.) and a reliable network stack (IB, etc.).

Of course, tools for parallel programming are also needed, like Intel VTune, Intel Thread Checker, Intel Trace Analyzer & Collector, SPEC MPI2007, IMB v3.1, TotalView Debugger, Marmot, etc.

Don't get too attached to HPC parallel languages like Fortress, Cilk, Manticore, Ct, etc. All of them tried to address the HPC needs for parallel programming, but they missed addressing the performance issues of HPC. These languages are good in research labs but not as commercial products.


~BR
lessonfree
Beginner
Everyone gave almost perfect answers, but since you are a non-technical guy, maybe the following example helps you.
You are the BOSS and you have some number of employees. Distributing the work among the workers is your responsibility. If you can do that with people, you can do parallel processing, with only one difference (let's not take hetero-systems into account): all your employees can do any job at the same speed and almost do not need to be controlled.
spm
Beginner
I prefer to think of parallel programming in a way that transcends the moment. So, ...

Parallelism is nothing more than the management of a collection of serial tasks

where
management refers to the policies by which (a) tasks are scheduled, (b) premature terminations are handled, (c) preemptive support is provided, (d) communication primitives are enabled/disabled, and (e) resources are obtained and released,

and
serial tasks are classified as either (a) coarse grain -- where tasks may not communicate prior to conclusion, or (b) fine grain -- where tasks may communicate prior to conclusion.

That's it! Of course, the devil is in the details (both hardware and software).
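A minimal pthreads sketch of the two classes (the task bodies are hypothetical illustrations):

    #include <stdio.h>
    #include <pthread.h>

    /* Coarse grain: the task runs to conclusion without communicating;
       the only synchronization point is the final join. */
    void *coarse_task(void *arg) {
        long id = (long)arg;
        long local = id * 1000;        /* purely local work */
        return (void *)local;          /* result handed back at join */
    }

    /* Fine grain: tasks communicate before conclusion through shared
       state, so management must provide a primitive (here a mutex)
       to make that mid-task communication safe. */
    long shared_total = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *fine_task(void *arg) {
        pthread_mutex_lock(&lock);
        shared_total += (long)arg;     /* mid-task communication */
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        void *res;

        for (long i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, coarse_task, (void *)i);
        for (long i = 0; i < 2; i++) {
            pthread_join(t[i], &res);
            printf("coarse task %ld returned %ld\n", i, (long)res);
        }

        for (long i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, fine_task, (void *)i);
        for (long i = 0; i < 2; i++)
            pthread_join(t[i], NULL);
        printf("fine-grain shared total = %ld\n", shared_total);
        return 0;
    }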
fraggy
Beginner
Quoting - spm
Parallelism is nothing more than the management of a collection of serial tasks [...]
I like this answer :p
can you explain why "serial" tasks? If the tasks were serial, you would not be able to parallelize them...
robert-reed
Valued Contributor II
Quoting - fraggy
I like this answer :p
can you explain why "serial" tasks? If the tasks were serial, you would not be able to parallelize them...

Well, that's really all any parallel program is: a collection of piecewise-serial programs that do their work, and the more of them that can do their work without interfering with their peers, the higher the concurrency level (scaling) the application can achieve. Think of a parallel for running across an array of values: each thread takes a section and serially runs a regular for loop over a portion of the array -- it's a serial program for all intents and purposes.

The problem, of course, is that it's only piecewise serial: when the serial sections in our parallel for finish, they need to rendezvous (join) with the other workers finishing other sections of the array, so that code following the parallel for can run with the assumption that all the array elements have been processed. And the serial sections can be even shorter: an algorithm that needs to update some shared common state each time through the loop can only count as serial the portion from one state update to the next. The more frequent the interruptions -- the more time spent in a truly serial state, where only one thread can progress through some critical section -- the greater the limitation on the scalability of that algorithm.
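For example, a minimal OpenMP sketch of both points: each thread runs an ordinary serial loop over its section, the implicit join lets later code assume the whole array is done, and a shared-state update inside a critical section is the truly serial part that limits scaling:

    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N];     /* static: too big for the stack */
        long updates = 0;       /* shared common state */

        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            a[i] = i * 0.5;     /* piecewise-serial work, no interference */

            /* Shared-state update: only one thread at a time may pass,
               so this region is truly serial and caps the scaling. */
            #pragma omp critical
            updates++;
        }

        /* Implicit join: every element is processed before this line. */
        printf("processed %d elements, %ld updates\n", N, updates);
        return 0;
    }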

spm
Beginner
To re-state my previous comment ...

There is a tight coupling between
+) the scheduling of tasks (across resources -- threads/processes) on the one hand, and
+) the communication primitives any of the said tasks may utilize on the other hand.

I.e., they are two sides of the same coin.

So, for example, if I want to schedule a task in isolation from any other task, I need to ensure that the said task will not be able to communicate with any other task.

Conversely, for example, if I want N tasks to communicate among themselves, I have no choice but to manage the N resources as a single unit; i.e. if one resource goes down, the rest of them are forcibly shut down.

Take this approach to its logical conclusion, and you have a wide and rich variety of
(scheduler + communication primitives) tuples.

Again, note that everything stated transcends the moment.

jimdempseyatthecove
Honored Contributor III

>>So, for example, if I want to schedule a task in isolation from any other task, I need to ensure that the said task will not be able to communicate with any other task.

Additionally, you want to inhibit non-communication side effects such as multiple threads inadvertently using the same temporary variables, inadvertently working on (modifying) objects that they may share, inappropriate file writes, etc...

>>Conversely, for example, if I want N tasks to communicate among themselves, I have no choice but to manage the N resources as a single unit; i.e. if one resource goes down, the rest of them are forcibly shut down.


Not necessarily. It is better if the N resources are independently managed, i.e., you have N mutexes, or N+1 mutexes (one for each, plus one for all).
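A minimal sketch of that arrangement (the names are illustrative): one mutex per resource so independent updates proceed independently, plus one more for operations that must span the whole set:

    #include <pthread.h>

    #define N 4

    pthread_mutex_t resource_lock[N];  /* one mutex per resource */
    pthread_mutex_t all_lock = PTHREAD_MUTEX_INITIALIZER;  /* the "+1" */
    int resource[N];

    /* Touching one resource takes only that resource's mutex, so
       updates to different resources do not serialize each other. */
    void update_one(int i, int v) {
        pthread_mutex_lock(&resource_lock[i]);
        resource[i] = v;
        pthread_mutex_unlock(&resource_lock[i]);
    }

    /* A whole-set operation takes the global mutex so such operations
       exclude each other, then each resource mutex in turn so every
       entry it reads is consistent. */
    void snapshot_all(int out[]) {
        pthread_mutex_lock(&all_lock);
        for (int i = 0; i < N; i++) {
            pthread_mutex_lock(&resource_lock[i]);
            out[i] = resource[i];
            pthread_mutex_unlock(&resource_lock[i]);
        }
        pthread_mutex_unlock(&all_lock);
    }

    int main(void) {
        int snap[N];
        for (int i = 0; i < N; i++)
            pthread_mutex_init(&resource_lock[i], NULL);
        update_one(2, 7);
        snapshot_all(snap);
        return 0;
    }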

Jim Dempsey