Intel® oneAPI Threading Building Blocks

TBB I/O Support?

I just got TBB running with OpenCV using gcc and g++. I did disable the -Werror flag in makefile.tbb, and now the facedetect code is effectively parallelized on my quad core, though far from optimized (speedup 2.8x to 3.2x); I get 3.6x with OpenMP parallelization. I did run into another problem, though: I want to read 2 cameras in parallel with TBB.
Exact problem:
The camera read routine is an infinite loop like
while (1)
    read camera;
    show image;
    release image;
    if !image

It all compiles fine, but only one loop runs at a time, even though 2 of these loops should be running in parallel.

Once I manually destroy the window of one thread, the other thread starts.

If I make it a read of 2 images (disabling the loop), it works fine for one frame each, so I implemented the infinite loop outside the parallel_for call, which led to an immense slowdown: less than a frame per second. Thread Checker showed me that a thread had been waiting for more than three seconds.
Any ideas?

Valued Contributor II

I split this reply into its own thread, more in keeping with its topic. I've done some work on overlapped I/O and TBB using pipelines, but I'm struggling to grasp the technique you describe. The text mentions a parallel_for, but the only code snippet is the read routine that employs a while (1). Is the parallelism in your one-camera case done on a frame-by-frame basis? (I have not worked at all with OpenCV.)

The description you give of one stream not appearing until the window displaying the other stream is destroyed sounds like resource contention in the window-level display routines.

You might be able to handle two cameras in a TBB pipeline, but to be fair, you'd probably want to interleave and mark the frames so as to be able to disambiguate them later. The mention of a face-detection algorithm suggests there is some processing involved. Is the per-frame processing uniform, or might there be load balance issues because of varying frame computational loads?
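To make the "interleave and mark" idea concrete, here is a minimal sketch in plain standard C++ (it is not a TBB pipeline, and no OpenCV types are used). The TaggedFrame struct, its fields, and the interleave and demultiplex helpers are hypothetical names invented for this illustration; an int stands in for the image data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame record: the camera id and sequence number are the
// "marks" that let later stages tell the two streams apart.
struct TaggedFrame {
    int camera_id;         // which camera produced the frame
    std::size_t sequence;  // position of the frame within that camera's stream
    int payload;           // stand-in for the image data
};

// Merge two per-camera streams into one stream, alternating between cameras.
std::vector<TaggedFrame> interleave(const std::vector<int>& cam0,
                                    const std::vector<int>& cam1) {
    std::vector<TaggedFrame> merged;
    const std::size_t n = cam0.size() > cam1.size() ? cam0.size() : cam1.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i < cam0.size()) merged.push_back({0, i, cam0[i]});
        if (i < cam1.size()) merged.push_back({1, i, cam1[i]});
    }
    return merged;
}

// Split the merged stream back into per-camera streams using the tag.
std::vector<std::vector<int>> demultiplex(const std::vector<TaggedFrame>& merged) {
    std::vector<std::vector<int>> per_camera(2);
    for (const TaggedFrame& f : merged) {
        assert(f.camera_id == 0 || f.camera_id == 1);
        per_camera[f.camera_id].push_back(f.payload);
    }
    return per_camera;
}
```

In a real pipeline the merged stream would presumably feed the processing stage, and the tag would be consulted only at the end, when each frame has to go to the right window.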

Some more work on this yielded the following observations and conclusions.

The while(1) routine was being called from inside the parallel_for. I did some more work on it and found that, because of TBB's non-preemptive scheduler, once one infinite loop (the while(1)) is entered, it must terminate before another one can be started.
I took the while(1) out of the routine called by parallel_for and placed it outside the parallel_for call, so that the whole routine is now invoked repeatedly, and I initialized the heavy data structures outside, sequentially.
So what I essentially did was initialize the 2 I/O routines sequentially and then start servicing them in parallel. The crash was the result of the non-thread-safe window drawing functions from glib; I stopped calling them from inside the parallel routine and went back to servicing the drawing functions sequentially.
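The restructuring described above might look roughly like the sketch below. It is only an illustration under stated assumptions: std::thread stands in for tbb::parallel_for so the example needs nothing beyond the standard library, FakeCamera and capture_rounds are hypothetical names, a string stands in for a frame, and the loop is bounded instead of while(1).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a camera: returns a label for frame number n.
struct FakeCamera {
    int id;
    std::string grab(int n) const {
        return "cam" + std::to_string(id) + "-frame" + std::to_string(n);
    }
};

// Run `rounds` iterations of: grab one frame from every camera in parallel,
// then "display" the frames one after another on the calling thread.
std::vector<std::string> capture_rounds(const std::vector<FakeCamera>& cameras,
                                        int rounds) {
    std::vector<std::string> displayed;
    for (int n = 0; n < rounds; ++n) {  // the loop lives outside the parallel part
        std::vector<std::string> frames(cameras.size());
        std::vector<std::thread> workers;
        for (std::size_t c = 0; c < cameras.size(); ++c) {
            // Each worker does a bounded amount of work and then returns;
            // every worker writes to its own slot, so no locking is needed.
            workers.emplace_back([&frames, &cameras, c, n] {
                frames[c] = cameras[c].grab(n);
            });
        }
        for (std::thread& w : workers) w.join();  // wait, as when parallel_for returns

        // Sequential section: the drawing calls are kept out of the parallel part.
        for (const std::string& f : frames) displayed.push_back(f);
    }
    return displayed;
}
```

Creating threads on every iteration is wasteful; a task scheduler such as TBB reuses its worker threads, which is one reason to prefer parallel_for for the parallel step in real code.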
As far as load balancing goes, I've parallelized the pixel-by-pixel reading of a frame to get good speedups and load balancing. However, owing to the nature of the algorithm, there are some sequential steps which cannot be parallelized, so I'm not really worried about the load balancing.
I'll post the whole approach on the web soon and provide a link.
Valued Contributor II

Yes, the TBB task scheduler is non-preemptive and unfair (for example, letting tasks select their successors); its philosophical goal is to keep threads executing on the processing element upon which they're currently resident for as long as they have local work to do (to maximize cache residence). This typically is not the policy you want in a scheduler that balances work among threads and processes that are doing I/O. It was unclear from your initial post what the relationship was between the parallel_for call and the while (1) loop or loops used for the video capture and display. Knowing that the infinite loop was inside the TBB parallel construct and knowing about the behavior of the TBB scheduler, it should be pretty clear what went wrong.
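A toy model can make the run-to-completion behavior easier to see. The sketch below is not how TBB is implemented; RunToCompletionQueue and trace_two_loops are hypothetical names, and the model assumes a single worker that never interrupts a task. Two bounded "camera loops" are submitted, and the trace records the order in which their iterations run.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy run-to-completion executor: one worker, tasks are never interrupted.
// A simplification for illustration only, not the TBB scheduler.
class RunToCompletionQueue {
public:
    void submit(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Run tasks in submission order; each task runs until it returns.
    void run_all() {
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()>> tasks_;
};

// Submit two "camera loops" of frames_per_loop iterations each and record
// the order in which their iterations execute.
std::vector<std::string> trace_two_loops(int frames_per_loop) {
    std::vector<std::string> trace;
    RunToCompletionQueue queue;
    const char* names[] = {"A", "B"};
    for (const char* name : names) {
        queue.submit([&trace, name, frames_per_loop] {
            for (int i = 0; i < frames_per_loop; ++i) {
                trace.push_back(std::string(name) + std::to_string(i));
            }
        });
    }
    queue.run_all();
    return trace;
}
```

With three iterations per loop, every iteration of A is recorded before the first iteration of B. If loop A were a while (1), B would never start on that worker, which matches the symptom of the second camera appearing only after the first loop ended.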

I'm glad you were able to figure out the source of the hangs and I look forward to your report on the details of your approach. Thanks for providing this update.