We have an application we are parallelizing using a pipeline. VTune shows that a large amount of time is spent in the TBB scheduler. Does this signify that the pipeline is starved of work? I'm not sure where to begin understanding this. Any help will be appreciated.
The summary from the VTune "Locks and Waits" analysis shows that, of the total wait time of 28 seconds, 24 seconds came from the TBB scheduler.
-Jeff
I think I understand what's happening. The parallel filter is being starved of tasks because we're doing too much work in the final serial filter.
If tokens cannot get out of your last serial filter, they cannot be recirculated to the input filter.
Try to do as little as possible in the input and output filters. If possible, pull the "work" portion of the output filter into the parallel interior filters. If your output filter is already reduced to a single file write, then other than experimenting with buffer size there is not much else you can do.
With TBB you might try changing your output filter to one that packs a larger I/O delivery buffer (drawn from a pool of preallocated buffers). That is, replace a large number of small writes with a smaller number of larger writes. You would have to handle the last partial buffer write (possibly by passing a 0-length buffer through the output filter).
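A minimal sketch of that batching idea follows. `BatchingWriter` and its `Sink` callback are hypothetical names, not part of TBB, and for brevity it uses a single reusable buffer rather than a pool. It assumes `write` is called only from the serial output filter, so no locking is included.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: collects small records and hands them to `sink`
// in larger chunks, so the output filter issues fewer, larger writes.
class BatchingWriter {
public:
    using Sink = std::function<void(const std::string&)>;

    BatchingWriter(std::size_t capacity, Sink sink)
        : capacity_(capacity), sink_(std::move(sink)) {
        assert(capacity_ > 0);
        buffer_.reserve(capacity_);
    }

    // Called once per token. A zero-length record acts as the
    // end-of-stream marker and forces out the last partial buffer.
    void write(const std::string& record) {
        if (record.empty()) {
            flush();
            return;
        }
        // Flush first if this record would overflow a non-empty buffer.
        if (!buffer_.empty() && buffer_.size() + record.size() > capacity_) {
            flush();
        }
        buffer_ += record;
        // Flush as soon as the buffer is full.
        if (buffer_.size() >= capacity_) {
            flush();
        }
    }

    // Emits whatever is buffered; does nothing if the buffer is empty.
    void flush() {
        if (buffer_.empty()) return;
        sink_(buffer_);
        buffer_.clear();
    }

private:
    std::size_t capacity_;
    Sink sink_;
    std::string buffer_;
};
```

In use, the sink would wrap the actual file write, for example a lambda that calls `std::ofstream::write` on the chunk it receives. The capacity is the buffer size to experiment with.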
Jim Dempsey