Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Usage of Wait() in threadpool

coolsandyforyou
Beginner
921 Views
Hi,
I want to know how to make the main() thread wait until the work queued with QueueUserWorkItem() finishes. Is there a thread pool wait function for this?
Please give me a small example.
0 Kudos
15 Replies
Dmitry_Vyukov
Valued Contributor I
921 Views
Here it is. Main thread creates 3 tasks, and then waits for completion.

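[cpp]#include <windows.h>
#include <cstdlib>
#include <iostream>
#include <memory>

// Shared state for one batch of work items.
struct task
{
    long rc;    // number of work items still running
    HANDLE ev;  // set by the last work item to finish
};

struct work_item
{
    task* parent;
    int work;   // simulated work, in milliseconds
};

DWORD WINAPI func(void* p)
{
    // Take ownership so the work item is freed when the function returns.
    std::auto_ptr<work_item> w ((work_item*)p);
    std::cout << GetTickCount() << " TASK START" << std::endl;
    Sleep(w->work);
    std::cout << GetTickCount() << " TASK END" << std::endl;
    // Only the work item that brings the counter to zero sets the event.
    if (InterlockedDecrement(&w->parent->rc) == 0)
        SetEvent(w->parent->ev);
    return 0;
}

int main()
{
    size_t const work_count = 3;

    std::cout << GetTickCount() << " MAIN START" << std::endl;

    std::auto_ptr<task> t (new task);
    t->rc = static_cast<long>(work_count);
    t->ev = CreateEvent(0, 0, 0, 0);  // auto-reset, initially non-signaled

    for (size_t i = 0; i != work_count; i += 1)
    {
        work_item* w = new work_item;
        w->parent = t.get();
        w->work = rand() % 2000 + 1000;
        QueueUserWorkItem(func, w, 0);
    }

    // Blocks until the last work item calls SetEvent().
    WaitForSingleObject(t->ev, INFINITE);
    CloseHandle(t->ev);

    std::cout << GetTickCount() << " MAIN END" << std::endl;
}
[/cpp]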
0 Kudos
coolsandyforyou
Beginner
921 Views
But won't t->ev be set as soon as any one of the threads completes its job? Shouldn't we wait until the entire job is complete (all threads finished)?
0 Kudos
coolsandyforyou
Beginner
922 Views
Also, please explain InterlockedDecrement() to me.
Another thing: I ran your code and observed that the tasks were not done in parallel; they were done one after the other. This is the output that I got:
10639468 MAIN START
10639484 TASK START
10640515 TASK END
10640515 TASK START
10641984 TASK END
10641984 TASK START
10643328 TASK END
1
0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
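Compile and run the program and you will see.
Only the last job to finish will set the event, because of the check "if (InterlockedDecrement(&w->parent->rc) == 0)". Every job decrements the counter, and only the one that brings it to zero calls SetEvent().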
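The countdown can also be tried outside Windows. The following is a minimal sketch of the same pattern, not the Win32 code from this thread: std::atomic<long> stands in for the counter passed to InterlockedDecrement(), and a mutex plus condition variable stand in for the event. The names countdown, task_done and run_tasks are hypothetical, chosen only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the `task` struct used in this thread.
struct countdown
{
    std::atomic<long> rc;         // tasks still running
    std::atomic<int> signals{0};  // how many times the "event" was set
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    explicit countdown(long n) : rc(n) {}
};

// Called exactly once by each task when it finishes.
void task_done(countdown& c)
{
    // fetch_sub returns the value before the decrement, so subtracting 1
    // gives the new value, which is what InterlockedDecrement returns.
    long const left = c.rc.fetch_sub(1) - 1;
    assert(left >= 0);
    if (left == 0)
    {
        // Only the task that brought the counter to zero gets here.
        c.signals.fetch_add(1);
        std::lock_guard<std::mutex> lock(c.m);
        c.done = true;
        c.cv.notify_one();
    }
}

// Starts n tasks (n must be at least 1), waits for all of them, and
// returns how many times the completion signal was raised.
int run_tasks(int n)
{
    assert(n > 0);
    countdown c(n);
    std::vector<std::thread> threads;
    for (int i = 0; i != n; ++i)
        threads.emplace_back([&c] { task_done(c); });

    {
        // Plays the role of WaitForSingleObject(ev, INFINITE).
        std::unique_lock<std::mutex> lock(c.m);
        c.cv.wait(lock, [&c] { return c.done; });
    }
    for (auto& t : threads)
        t.join();
    return c.signals.load();
}
```

Calling run_tasks(3) should report a single signal no matter which task finishes last, because the decrement is atomic and exactly one caller can observe the counter reaching zero.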

0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
> also explain me abt the InterlockedDecrement()..

RTFM first
http://msdn.microsoft.com/en-us/library/ms684122%28VS.85%29.aspx
0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
> another thing i executed ur code and observed that the things were not done in parallel they were done one after the other...this is the output that i got..

Why do you conclude that they were done one after another?
0 Kudos
coolsandyforyou
Beginner
922 Views
The timings. Each task seems to begin only after the previous one completes. Judging by the GetTickCount() values, the intervals should overlap, and they don't; that is what made me think so.
Also, the main thread waits forever (the last statement, "MAIN END", is never printed).
0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
Output on my machine is:
666519357 MAIN START
666519372 TASK START
666519372 TASK START
666519372 TASK START
666520418 TASK END
666520714 TASK END
666520839 TASK END
666520839 MAIN END

You probably have only one hardware thread, or an old OS, or something like that. Try experimenting with the flags to QueueUserWorkItem(), especially WT_EXECUTEINIOTHREAD and WT_EXECUTELONGFUNCTION; I think they should help.
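Rather than inferring overlap from tick counts, one option is to count how many tasks are inside the task function at the same moment. The sketch below is a portable illustration using std::thread, not the Win32 thread pool; overlap_stats, tracked_task, measure_peak_overlap and hardware_threads are hypothetical names. In the Win32 program, a similar counter could presumably be kept inside func() with InterlockedIncrement() and InterlockedDecrement().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical bookkeeping: how many tasks run right now, and the most
// that were ever running at once.
struct overlap_stats
{
    std::atomic<int> running{0};
    std::atomic<int> peak{0};
};

void tracked_task(overlap_stats& s, int sleep_ms)
{
    int const now = s.running.fetch_add(1) + 1;

    // Raise `peak` to at least `now`; retry if another task updated it.
    int seen = s.peak.load();
    while (seen < now && !s.peak.compare_exchange_weak(seen, now))
    {
    }

    // Simulated work, like Sleep(w->work) in the original example.
    std::this_thread::sleep_for(std::chrono::milliseconds(sleep_ms));
    s.running.fetch_sub(1);
}

// Runs n_tasks sleeping tasks on separate threads and returns the peak
// number that were in flight simultaneously.
int measure_peak_overlap(int n_tasks, int sleep_ms)
{
    assert(n_tasks > 0);
    overlap_stats s;
    std::vector<std::thread> threads;
    for (int i = 0; i != n_tasks; ++i)
        threads.emplace_back([&s, sleep_ms] { tracked_task(s, sleep_ms); });
    for (auto& t : threads)
        t.join();
    return s.peak.load();
}

// Number of hardware threads the runtime reports; may be 0 if unknown.
unsigned hardware_threads()
{
    return std::thread::hardware_concurrency();
}
```

A peak of 1 means the tasks ran strictly one after another; a peak equal to the task count means they all overlapped. Sleeping tasks can overlap even on a single hardware thread, so a peak of 1 points at the scheduling of the pool rather than at the core count.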

0 Kudos
coolsandyforyou
Beginner
922 Views
[cpp]#include <windows.h>
#include <iostream>
#include <stdlib.h>
#define _WIN32_WINNT 0x0503
struct task
{
    long rc;
    HANDLE ev;
};

struct work_item
{
    task* parent;
    int work;
};

DWORD WINAPI func(void* p)
{
    work_item *w =((work_item*)p);
    std::cout << GetTickCount() << " TASK START" << std::endl;
    Sleep(w->work);
    std::cout << GetTickCount() << " TASK END" << std::endl;
    if (InterlockedDecrement(&w->parent->rc) == 0)
        SetEvent(w->parent->ev);
    return 0;
}

int main()
{
    size_t const work_count = 3;
    struct task t;

    std::cout << GetTickCount() << " MAIN START" << std::endl;

    t.rc = work_count;
    t.ev = CreateEvent(0, 0, 0, 0);

    for (size_t i = 0; i != work_count; i += 1)
    {
        work_item* w = new work_item;
        w->parent = (struct task *)malloc(sizeof(struct task));
        w->work = rand() % 2000 + 1000;
        QueueUserWorkItem(func, w, WT_EXECUTELONGFUNCTION);
    }

    WaitForSingleObject(t.ev, INFINITE);

    std::cout << GetTickCount() << "MAIN END" << std::endl;
}
[/cpp]
I made small changes to how the task and work_item objects are instantiated in your code (I switched them to C-style syntax), because your code was giving build errors on my machine. With this version I am not getting MAIN END in the output; the main thread seems to wait forever. Please see where things have gone wrong. Thanks in advance.
0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
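Each work item has to point at the task that main() waits on. Your malloc() line gives every work item its own uninitialized task, so t.rc is never decremented and t.ev is never set. Replace that line with:
w->parent = &t;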

0 Kudos
coolsandyforyou
Beginner
922 Views
Thanks a lot for replying. Knowing these things means a lot for my M.Tech project. I need to apply this technique to my video codec to bring its performance up to real time.
Thanks again.
0 Kudos
coolsandyforyou
Beginner
922 Views
Hi Dmitriy,
As I said earlier, I need to apply this technique to my video codec project, and I have done that successfully. The results are the same with and without threading, and I observed that the threads were created and that each slice of a frame is encoded in parallel.
But here is my issue: I run this project on a Core 2 Duo machine, and I am not seeing any speed improvement. In fact, the project with threading runs a bit slower. I can send you both projects (with and without threading); please give me feedback on where things are going wrong.
The tasks to which I applied threading are inherently independent, so I don't see why I shouldn't get double the performance.
Please reply.
0 Kudos
Dmitry_Vyukov
Valued Contributor I
922 Views
Ideally, with perfect parallelization on two cores, you get 2x the consumed CPU time and 2x the useful work done in the same wall-clock time.
There are 2 possible scenarios for your application: (1) CPU time consumed is not 2x but maybe only 1.1x; or (2) CPU time consumed is 2x, but useful work done is still only 1.1x.
Try to identify with a profiler (or other tools) which scenario you have. If (1), you are not exposing enough parallelism to get a 2x speedup. If (2), there is some problem in your implementation, and the profiler can show which function the problem is in.
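Before reaching for a profiler, a rough first check is to compare consumed CPU time against elapsed wall-clock time for the threaded run. The sketch below is only an illustration under stated assumptions: busy_work, timing and run_timed are hypothetical names, and the CPU-time figure relies on std::clock() reporting processor time summed over all threads, which holds on typical POSIX systems. The Microsoft C runtime's clock() tracks elapsed time instead, so on Windows a call such as GetProcessTimes() would be needed for the CPU figure.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ctime>
#include <thread>
#include <vector>

// Hypothetical result record for one measured run.
struct timing
{
    double wall_s;         // elapsed wall-clock seconds
    double cpu_s;          // CPU seconds consumed by the whole process
    std::uint64_t result;  // sum of per-thread results (wraps on overflow)
};

// Stand-in for a CPU-bound, independent unit of work (e.g. one slice).
std::uint64_t busy_work(std::uint64_t n)
{
    std::uint64_t x = 0;
    for (std::uint64_t i = 0; i != n; ++i)
        x = x * 6364136223846793005ULL + i;  // unsigned wraparound is fine
    return x;
}

// Runs busy_work(n_per_thread) on n_threads threads and measures both clocks.
timing run_timed(int n_threads, std::uint64_t n_per_thread)
{
    assert(n_threads > 0);
    std::vector<std::uint64_t> partial(n_threads, 0);

    auto const wall0 = std::chrono::steady_clock::now();
    std::clock_t const cpu0 = std::clock();

    std::vector<std::thread> threads;
    for (int i = 0; i != n_threads; ++i)
        threads.emplace_back([&partial, i, n_per_thread] {
            partial[i] = busy_work(n_per_thread);
        });
    for (auto& t : threads)
        t.join();

    std::clock_t const cpu1 = std::clock();
    auto const wall1 = std::chrono::steady_clock::now();

    timing r;
    r.wall_s = std::chrono::duration<double>(wall1 - wall0).count();
    r.cpu_s = static_cast<double>(cpu1 - cpu0) / CLOCKS_PER_SEC;
    r.result = 0;
    for (std::uint64_t p : partial)
        r.result += p;
    return r;
}
```

With two threads, a cpu_s / wall_s ratio close to 2 without a matching drop in wall time compared with the single-threaded run would suggest scenario (2). A ratio near 1 would suggest scenario (1), where the threads spend most of their time waiting rather than running.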

0 Kudos
coolsandyforyou
Beginner
922 Views
Are there any thread-pool-style functions for Linux?
0 Kudos