"On a multicore system, your goal is to spread the work efficiently among many cores so that it does executes simultaneously. And performance gain should be directly related to how many cores you have. So, a quad core system should be able to get the work done 4 times faster than a single core system. A 16-core platform should be 4-times faster than a quad-core system, and 16-times faster than a single core...
That's where my Threadpool comes in hand, it spreads the work efficiently among many cores.
The following have been added to Threadpool:
Lock-free ParallelQueue for less contention and more efficiency or it can use lockfree_mpmc - flqueue that i have modified, enhanced and improved... - See lock-free ParallelQueue: