Sample code for thread pooling (with Pthreads in C), demonstrating how to avoid the overhead of repeatedly creating and destroying threads, would be helpful.
(I searched Intel.com but only got C++ examples.
Have visited -
Code in C would be more helpful.)
Of course, it's possible. There is an example implementation in Pthreads Programming (O'Reilly, 1996) by Nichols, Buttlar and Farrell. I can't reproduce their code here, but I can give you the gist of their implementation. The concrete example they use to illustrate the concepts is a bank ATM.
The heart of a thread pool is how work gets assigned to the worker threads. A queue of structs that hold job descriptions is easy to implement. You need to protect access to the queue by multiple threads, and you need to be able to tell when the queue is empty (so pool threads can pause) and when it is full (if the queue is bounded). A semaphore works well for counting the number of items in the queue; threads will block when the count is zero. A second semaphore can be used to keep track of the number of available slots in the queue if you're using a fixed-size buffer rather than a linked list.
Beyond that, it is just a matter of initializing things (queue, semaphores) and launching the threads that will make up the pool. The pool threads wait on the semaphore to be greater than zero (work in the queue), pull a task from the queue (decrementing the semaphore), then go to work on the task before repeating the above.
Of course, you need to have one or more threads involved in generating the tasks for the thread pool threads. In the ATM example, this is the interface thread that accepts transactions and places them in the queue for the pool threads to work on. This "master" thread could likely be the process's main thread, and when the computation is complete, the master thread terminates the thread pool threads.
If you code things carefully and depending upon the computation, the master thread could be one of the pool threads and do some of the work. As an example, think about searching a tree. Starting from the root, at each node of the tree, the task to be done is to place pointers to the children of the current node into the queue, process the node, and then get the next node for processing from the queue. To keep things straight or to monitor when the tree search is done, you can keep a thread out of the search process, but there's no reason that all threads can't be searching (with one paying attention to the stopping criteria being met before terminating threads).
Hope that helps.
Brushing my teeth this morning, I had another idea. The above method is modeled on Producer/Consumer. If you have a set of functions (ATM transactions) that are executed when work is available, this model works well. If, on the other hand, you have a regular set of concurrent computations that get repeated with some intervening serial work, you can also use a thread pool based more on the Boss/Worker model.
Of course, there is a Boss thread that coordinates the Worker threads (in the pool). The classic Boss/Worker needs some form of rendezvous between the Boss and a Worker to exchange information (the Boss gets results and the Worker gets the next task to be executed). This can be done in Pthreads and I've got code for this, but it is a tad (overly) complex, which is why I recommend Producer/Consumer for thread pools.
The thread pool implementation I imagined this morning would be used in a case where parallel work is done and the results are checked to see if the final answer has been reached. Think of an iterative solution to linear equations. You make an initial guess and then iteratively refine the possible solution until the answer computed is within some tolerance. The same refining computations are executed by Worker threads while the Boss checks the tolerance of the solution before terminating the Workers or signaling for an additional iteration.
Here is a sketch of the details for this type of thread pool: the Worker threads are launched and wait at a condition variable. If threads are going to do the same work each time they run (like working on the same set of matrix indices), this can be built into the threads; if the tasks might change from one use to another, then there will need to be some mechanism to get new tasks from the Boss thread. After the condition variable is signaled, the pool threads execute their assigned tasks and then go back to waiting on the condition variable. In the meantime, the Boss thread sets up tasks for the Workers and signals the condition variable the Workers wait on. The Boss waits for the Workers to complete their assigned tasks (somehow the Workers need to signal the Boss that *all* Workers have finished). The Boss repeats these steps until the computation is finished (e.g., the solution is within an acceptable tolerance).
Essentially, pool threads need some method to get work and to receive signals that they can begin processing. This can be done with a queue or with explicit signals through condition variables.
Good luck with your thread pools.