Usage of Wait() in threadpool

coolsandyforyou · ‎03-28-2010

Hi,

I want to know how to use a thread pool wait function that makes the main() thread wait until the work queued by QueueUserWorkItem() finishes.

Please give me a small example.

Dmitry_Vyukov · ‎03-28-2010

Here it is. The main thread creates 3 tasks and then waits for their completion.

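[cpp]#include <windows.h>
#include <cstdlib>
#include <iostream>
#include <memory>

// Shared state for one batch of work items.
struct task
{
    long rc;    // number of work items still running
    HANDLE ev;  // set when rc drops to zero
};

struct work_item
{
    task* parent;
    int work;   // simulated work duration in milliseconds
};

DWORD WINAPI func(void* p)
{
    // Take ownership so the work_item is freed when the callback returns.
    std::auto_ptr<work_item> w ((work_item*)p);
    std::cout << GetTickCount() << " TASK START" << std::endl;
    Sleep(w->work);
    std::cout << GetTickCount() << " TASK END" << std::endl;
    // Only the last item to finish brings the counter to zero and sets the event.
    if (InterlockedDecrement(&w->parent->rc) == 0)
        SetEvent(w->parent->ev);
    return 0;
}

int main()
{
    size_t const work_count = 3;

    std::cout << GetTickCount() << " MAIN START" << std::endl;

    std::auto_ptr<task> t (new task);
    t->rc = work_count;
    // Unnamed auto-reset event, initially non-signaled.
    t->ev = CreateEvent(0, 0, 0, 0);

    for (size_t i = 0; i != work_count; i += 1)
    {
        work_item* w = new work_item;
        w->parent = t.get();  // every item shares the same task
        w->work = rand() % 2000 + 1000;
        QueueUserWorkItem(func, w, 0);
    }

    // Block until the last work item sets the event.
    WaitForSingleObject(t->ev, INFINITE);

    std::cout << GetTickCount() << " MAIN END" << std::endl;
}
[/cpp]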

coolsandyforyou · ‎03-28-2010

But t->ev will be set as soon as one of the threads completes its job. Shouldn't we wait until the entire job is complete (all threads finished)?

coolsandyforyou · ‎03-28-2010

Also, please explain InterlockedDecrement() to me.

Another thing: I ran your code and observed that the tasks were not done in parallel; they ran one after the other. This is the output that I got:

10639468 MAIN START
10639484 TASK START
10640515 TASK END
10640515 TASK START
10641984 TASK END
10641984 TASK START
10643328 TASK END

Dmitry_Vyukov · ‎03-28-2010

Compile and run the program.
Only the last job will set the event, because of the "if (InterlockedDecrement(&w->parent->rc) == 0)" check.
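
To make the countdown idea concrete, here is a minimal portable sketch of the same pattern. It is not the code from this thread: it assumes C++11, uses one std::thread per task instead of the Windows thread pool, and replaces the event with a condition variable. The names countdown_task, finish_one and run_and_count_signals are made up for illustration. The point it tries to show is that, with a shared atomic counter, exactly one worker (the last to finish) raises the signal.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the thread's `task`: a shared countdown plus a
// one-shot signal playing the role of the Win32 event.
struct countdown_task
{
    std::atomic<long> rc;       // work items still running
    std::atomic<int> signals;   // how many times the "event" was set
    std::mutex m;
    std::condition_variable cv;
    bool done;

    explicit countdown_task(long n) : rc(n), signals(0), done(false) {}
};

// Worker body. fetch_sub returns the OLD value, so "old == 1" is the same
// test as "InterlockedDecrement(...) == 0", which returns the NEW value.
void finish_one(countdown_task& t)
{
    if (t.rc.fetch_sub(1) == 1)
    {
        t.signals.fetch_add(1);
        std::lock_guard<std::mutex> lock(t.m);
        t.done = true;
        t.cv.notify_one();
    }
}

// Starts n workers, waits for the signal, and returns how often it was raised.
int run_and_count_signals(long n)
{
    assert(n > 0);
    countdown_task t(n);
    std::vector<std::thread> workers;
    for (long i = 0; i != n; ++i)
        workers.emplace_back(finish_one, std::ref(t));

    {
        // Equivalent of WaitForSingleObject(t->ev, INFINITE).
        std::unique_lock<std::mutex> lock(t.m);
        t.cv.wait(lock, [&t] { return t.done; });
    }
    for (std::thread& w : workers)
        w.join();
    return t.signals.load();
}
```

Calling run_and_count_signals(3) should return 1 no matter in which order the workers finish, because only one decrement can observe the counter reaching zero.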

Dmitry_Vyukov · ‎03-28-2010

> Also, please explain InterlockedDecrement() to me.

RTFM first
http://msdn.microsoft.com/en-us/library/ms684122%28VS.85%29.aspx

Dmitry_Vyukov · ‎03-29-2010

> Another thing: I ran your code and observed that the tasks were not done in parallel; they ran one after the other. This is the output that I got:

Why do you conclude that they were done one after another?

coolsandyforyou · ‎03-29-2010

The timings. Each task begins only after the previous one completes. Judging by the GetTickCount() values, the intervals should overlap if the tasks ran in parallel, which is what made me think so.

Also, the main thread waits forever (the last line, "MAIN END", is not printed).

Dmitry_Vyukov · ‎03-29-2010

Output on my machine is:
666519357 MAIN START
666519372 TASK START
666519372 TASK START
666519372 TASK START
666520418 TASK END
666520714 TASK END
666520839 TASK END
666520839 MAIN END

Probably you have only 1 hardware thread, or an old OS, or something like that. Try playing with the flags to QueueUserWorkItem(), especially WT_EXECUTEINIOTHREAD and WT_EXECUTELONGFUNCTION; I think they should help.
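
The difference between the two outputs can be checked mechanically. The sketch below is only an illustration: interval, any_overlap, first_log and second_log are hypothetical names, and for the second output the pairing of each TASK START with a TASK END is assumed, since the log does not say which task ended when (all three start at the same tick, so any pairing gives the same answer).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One TASK START / TASK END pair, in GetTickCount() milliseconds.
struct interval
{
    unsigned long start;
    unsigned long end;
};

// True if at least two intervals share more than a single instant,
// i.e. at least two tasks were running at the same time.
bool any_overlap(const std::vector<interval>& v)
{
    for (std::size_t i = 0; i != v.size(); ++i)
    {
        assert(v[i].start <= v[i].end);
        for (std::size_t j = i + 1; j != v.size(); ++j)
            if (v[i].start < v[j].end && v[j].start < v[i].end)
                return true;
    }
    return false;
}

// Timestamps from the first output in this thread (tasks back to back).
std::vector<interval> first_log()
{
    return {{10639484UL, 10640515UL},
            {10640515UL, 10641984UL},
            {10641984UL, 10643328UL}};
}

// Timestamps from the second output (start/end pairing is assumed).
std::vector<interval> second_log()
{
    return {{666519372UL, 666520418UL},
            {666519372UL, 666520714UL},
            {666519372UL, 666520839UL}};
}
```

With these inputs, any_overlap(first_log()) is false and any_overlap(second_log()) is true, which matches the reading that the first run executed the tasks one at a time.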

coolsandyforyou · ‎03-29-2010

[bash]#include <windows.h>
#include <iostream>
#include <cstdlib>
#define _WIN32_WINNT 0x0503
struct task  
{  
    long rc;  
    HANDLE ev;  
};  
  
struct work_item  
{  
    task* parent;  
    int work;  
};  
  
DWORD WINAPI func(void* p)  
{  
    work_item *w =((work_item*)p);  
    std::cout << GetTickCount() << " TASK START" << std::endl;  
    Sleep(w->work);  
    std::cout << GetTickCount() << " TASK END" << std::endl;  
    if (InterlockedDecrement(&w->parent->rc) == 0)  
        SetEvent(w->parent->ev);  
    return 0;  
}  
  
int main()  
{  
    size_t const work_count = 3; 
	struct task t;

    std::cout << GetTickCount() << " MAIN START" << std::endl;  
    
    t.rc = work_count;  
    t.ev = CreateEvent(0, 0, 0, 0);  
  
    for (size_t i = 0; i != work_count; i += 1)  
    {  
        work_item* w = new work_item;  
        w->parent = (struct task *)malloc(sizeof(struct task));  
        w->work = rand() % 2000 + 1000;  
        QueueUserWorkItem(func, w, WT_EXECUTELONGFUNCTION);  
    }  
  
    WaitForSingleObject(t.ev, INFINITE);  

    std::cout << GetTickCount() << "MAIN END" << std::endl;  
}  
[/bash]

I made small changes to how the task and work objects are instantiated in your code (I changed them to C syntax), because your code gave build errors on my machine. Now I am not getting MAIN END in the output; it seems the main thread waits forever. Please see where things have gone wrong. Thanks in advance.

Dmitry_Vyukov · ‎03-29-2010

w->parent= &t;

coolsandyforyou · ‎03-29-2010

Thanks a lot for replying. Knowing these things means a lot to my M.Tech project. I need to apply these techniques to my video codec to bring its performance up to real time.

Thanks again.

coolsandyforyou · ‎04-01-2010

Hi Dmitry,

As I said earlier, I need to apply this technique to my video codec project, and I have done that successfully. The results are the same with and without threading, and I observed that the threads were created and that each slice of a frame is being encoded in parallel.

But here is my issue: I run this project on a Core 2 Duo machine, but I am not seeing any speed improvement. In fact, the project with threading runs a bit slower. I can send you both projects (with and without threading); please give me feedback on where things are going wrong.

The tasks to which I applied threading are inherently independent, so I don't see why I shouldn't get double the performance.

Please reply.

Dmitry_Vyukov · ‎04-01-2010

Ideally, given perfect parallelization, one has 2x consumed CPU time and 2x more useful work done.
There are 2 possible scenarios with respect to your application: (1) CPU time consumed is not 2x, but maybe only 1.1x; or (2) CPU time consumed is 2x, but still only 1.1x useful work is done.
Try to identify with a profiler (or other tools) which scenario you have. If (1), then you do not expose enough parallelism to get a 2x speedup. If (2), then there is some problem in your implementation, and a profiler can show which function the problem is in.
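
As a rough illustration of how the two scenarios could be told apart once the numbers have been measured, here is a small sketch. Everything in it is an assumption made for the example: the names scenario and diagnose, the idea of passing in the two ratios directly, and especially the 0.8 cutoff, which is arbitrary and not a recommended threshold.

```cpp
#include <cassert>

enum scenario { scaling_ok, not_enough_parallelism, parallel_overhead };

// cpu_ratio:  CPU time consumed by the threaded run divided by the CPU time
//             consumed by the single-threaded run over the same wall-clock time.
// work_ratio: useful work done (for example, frames encoded) by the threaded
//             run divided by the single-threaded run.
// threads:    number of hardware threads (2 on a Core 2 Duo).
scenario diagnose(double cpu_ratio, double work_ratio, int threads)
{
    assert(threads > 0 && cpu_ratio > 0.0 && work_ratio > 0.0);
    const double cutoff = 0.8;  // arbitrary tolerance, for illustration only
    if (cpu_ratio < cutoff * threads)
        return not_enough_parallelism;  // scenario (1): cores sit idle
    if (work_ratio < cutoff * cpu_ratio)
        return parallel_overhead;       // scenario (2): cores busy, little gained
    return scaling_ok;
}
```

With the figures from the post above, diagnose(1.1, 1.1, 2) gives not_enough_parallelism and diagnose(2.0, 1.1, 2) gives parallel_overhead. The ratios themselves still have to come from a profiler or from the OS process counters.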

coolsandyforyou · ‎04-17-2010

Are there any thread-pool functions like these for Linux?