Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

About my threadpool engine...

aminer10
Novice
352 Views

Hello,

In my threadpool engine, the worker threads enter a wait state
when there is no job in the lock-free queues, for more efficiency.
You can download the source code from:

If you look at the threadpool engine source code, I am using
the following code on the producer side:
---
events[local_balance].setevent;
---
and the following code on the consumer side:
---
if ThreadPool.Queues[self.threadcount].count <> 0
  then continue;
for i:=0 to FThreadpool.Fthreadcount-1 do
  count:=count+FThreadPool.Queues[i].count;
if count=0 then
begin
  FThreadpool.events[self.threadcount].waitfor(INFINITE);
  FThreadpool.events[self.threadcount].resetevent;
end;
---

So a question follows:

Suppose the consumer thread is on
FThreadpool.events[self.threadcount].waitfor(INFINITE);
and it receives two items before reaching
FThreadpool.events[self.threadcount].resetevent;
Can it forget to process the second item, because the consumer thread
will reset the event while one item is still on the queue?

Answer:
No, it will still process the second item correctly. After resetevent the
consumer loops back and re-checks the number of queued items before it
waits again, using the following code on the consumer side:
---
if ThreadPool.Queues[self.threadcount].count <> 0
  then continue;
for i:=0 to FThreadpool.Fthreadcount-1 do
  count:=count+FThreadPool.Queues[i].count;
---
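
To illustrate the answer above, here is a minimal single-threaded replay of that interleaving, written in Python rather than Delphi. It is only a sketch: it assumes the producer enqueues an item before it signals the event, and it uses threading.Event (set, wait, clear) to stand in for setevent, waitfor and resetevent. The function name and the job labels are hypothetical.

```python
import threading
from collections import deque


def replay_interleaving():
    """Replay: two items arrive before the consumer reaches resetevent."""
    queue = deque()              # stands in for one worker's queue
    event = threading.Event()    # stands in for events[self.threadcount]
    processed = []

    # Producer side, twice: enqueue first, then signal (assumed ordering).
    for item in ("job-1", "job-2"):
        queue.append(item)
        event.set()

    # Consumer side: it had seen count = 0 and reached the wait.
    event.wait()     # returns at once because the event is already set
    event.clear()    # resetevent: the signal for both items is gone now

    # Back at the top of the consumer loop: the count check runs before
    # any further wait, so both queued items are still found.
    while len(queue) != 0:
        processed.append(queue.popleft())

    return processed, event.is_set()
```

Calling replay_interleaving() returns both jobs in FIFO order with the event left cleared, so the reset does not hide the second item as long as the queue is re-checked before the next wait.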


Thank you.
Amine Moulay Ramdane.
2 Replies
aminer10

Hello,


As you will notice, my threadpool engine is simple and efficient. I designed it that way to ease the learning curve for those who want to learn how to implement a simple and efficient threadpool.

On a multicore system, your goal is to spread the work efficiently among many cores so that it executes simultaneously. Ideally, the performance gain should be directly related to how many cores you have: a quad-core system should be able to get the work done up to 4 times faster than a single-core system, and a 16-core platform up to 4 times faster than a quad-core system and up to 16 times faster than a single core.

That's where my Threadpool is useful: it spreads the work efficiently among many cores. Threadpool (and Threadpool with priority) consists of lock-free, thread-safe local FIFO queues of work items, so when you call ThreadPool.execute(), your work item gets queued in one of the local lock-free queues. The worker threads pick the items out in First In First Out (FIFO) order and execute them.


The following have been added to Threadpool:

- Lock-free_mpmc - flqueue that I have modified, enhanced and improved.

- It uses a lock-free queue for each worker thread, and it uses work-stealing, for more efficiency.

- The worker threads enter a wait state when there is no job in the lock-free queues, for more efficiency.

- You can distribute your jobs to the worker threads and call any method with the threadpool's execute() method.

The work-stealing scheduling algorithm offers several advantages over an ordinary scheduling algorithm:

  1. Effective:
    • Using local queues minimizes contention.
  2. Load Balancing:
    • Every thread can steal work from the other threads, so work-stealing implicitly provides load balancing.

My Threadpool allows load balancing and also minimizes contention.
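
The design described above (one local FIFO queue and one event per worker, with stealing when the local queue is empty) can be sketched roughly as follows. This is a toy in Python, not the Delphi engine itself: the class and method names are hypothetical, collections.deque stands in for the lock-free queue, and a small lock guards only the round-robin index and the pending-job counter.

```python
import threading
from collections import deque


class WorkStealingPool:
    """Toy pool: one FIFO queue and one event per worker (hypothetical names)."""

    def __init__(self, thread_count):
        self.queues = [deque() for _ in range(thread_count)]
        self.events = [threading.Event() for _ in range(thread_count)]
        self.next_queue = 0                    # round-robin target index
        self.pending = 0                       # jobs queued or running
        self.idle = threading.Condition()      # guards next_queue and pending
        self.stopping = False
        self.workers = [
            threading.Thread(target=self._run, args=(i,), daemon=True)
            for i in range(thread_count)
        ]
        for worker in self.workers:
            worker.start()

    def execute(self, func, *args):
        with self.idle:
            target = self.next_queue
            self.next_queue = (target + 1) % len(self.queues)
            self.pending += 1
        self.queues[target].append((func, args))   # enqueue first...
        self.events[target].set()                  # ...then wake that worker

    def _take(self, me):
        # Try the worker's own queue first, then steal from the others.
        n = len(self.queues)
        for k in range(n):
            try:
                return self.queues[(me + k) % n].popleft()
            except IndexError:
                continue
        return None

    def _run(self, me):
        while True:
            job = self._take(me)
            if job is not None:
                func, args = job
                func(*args)
                with self.idle:
                    self.pending -= 1
                    self.idle.notify_all()
                continue
            if self.stopping:
                return
            self.events[me].wait()     # sleep until work or shutdown is signalled
            self.events[me].clear()    # then loop back and re-check every queue

    def wait_idle(self):
        with self.idle:
            while self.pending:
                self.idle.wait()

    def shutdown(self):
        self.stopping = True
        for event in self.events:
            event.set()
        for worker in self.workers:
            worker.join()
```

A typical use would be to create WorkStealingPool(4), call execute(func, arg) for each job, then wait_idle() and shutdown(). The sketch leaves out job priorities and exception handling inside jobs.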



Thank you.
Amine Moulay Ramdane.




gaston-hillar
Valued Contributor I
Amine,
I think it would be a better idea for you to write an article with a complete example of the usage of your thread engine on different Intel architectures, with several sample algorithms.
That is the best way to get people in the forums to pay attention to your work.
It is not a good idea to promote your engine by adding dozens of messages in these forums.
Cheers,
Gaston Hillar