"Is there any other way to do it (ie via raw TBB tasks) ?"
Could you first indicate whether you could still integrate both pipelines into one? Each data packet would then get its own task to take it from start to finish as quickly as possible, which should provide good locality.
(Correction) I seem to have overlooked a shift from "filters move past stationary data" (the tbb::pipeline approach) to "data moves through filters" (not directly supported), so concatenating two pipelines into one would not be the appropriate solution after all. Maybe I should sit this one out, though: I've done some tinkering with pipeline, but I'm not sure I obtained a positive outcome, and nobody has yet seemed interested in testing it.
"It is my understanding (probably incorrect) that blocking a pipeline stage might stall it."
Blocking a serial filter lets new data back up behind it, quickly stalling the pipeline, yes; blocking a parallel filter does not have that effect, but it may, like any blocking in TBB, be detrimental to performance, because TBB's scheduler is not aware of any alternative scheduling opportunity.
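To make that distinction concrete, here is a minimal sketch against the classic tbb::filter/tbb::pipeline interface (the filter names are made up, and the potentially blocking work is only indicated by a comment): if the serial output filter blocks, every token ends up waiting behind it, whereas a block in the parallel middle filter only occupies whichever worker thread happens to be running that particular item.

#include "tbb/pipeline.h"
#include "tbb/task_scheduler_init.h"
#include <cstddef>

struct InputFilter : tbb::filter {
    int remaining;
    InputFilter() : tbb::filter(tbb::filter::serial_in_order), remaining(100) {}
    void* operator()(void*) {
        if (remaining == 0) return NULL;          // end of stream
        return new int(remaining--);              // emit a token
    }
};

struct WorkFilter : tbb::filter {
    WorkFilter() : tbb::filter(tbb::filter::parallel) {}   // blocking here only ties up one worker
    void* operator()(void* item) {
        // ... potentially blocking work on *static_cast<int*>(item) ...
        return item;
    }
};

struct OutputFilter : tbb::filter {
    OutputFilter() : tbb::filter(tbb::filter::serial_in_order) {}  // blocking here stalls the pipeline
    void* operator()(void* item) {
        delete static_cast<int*>(item);
        return NULL;
    }
};

int main() {
    tbb::task_scheduler_init init;
    InputFilter in; WorkFilter work; OutputFilter out;
    tbb::pipeline p;
    p.add_filter(in);
    p.add_filter(work);
    p.add_filter(out);
    p.run(8);                                     // at most 8 tokens in flight
    return 0;
}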
I would use a concurrent_queue to hand any data that must be saved over to an independent tbb_thread, though, because you never know when one pipeline might steal work from another pipeline and get the two entangled. There has been some discussion about that, but I have not seen any notification that the problem was solved. It occurs because TBB, conceived for finite jobs where fairness typically gets in the way of performance, is now being used for long-running jobs with at least global-progress concerns.
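Something like the following sketch is what I have in mind (names such as SavedItem, SaverThread and g_save_queue are made up, and I'm using concurrent_bounded_queue for its blocking pop(); a plain concurrent_queue with try_pop() would also do): the pipeline filters only push onto the queue, and a dedicated tbb_thread, invisible to the TBB scheduler, blocks on the queue and does the actual saving, so no TBB worker ever blocks on I/O.

#include "tbb/concurrent_queue.h"
#include "tbb/tbb_thread.h"
#include <cstdio>

struct SavedItem { int id; };                             // hypothetical payload

tbb::concurrent_bounded_queue<SavedItem*> g_save_queue;  // hypothetical handoff queue

// Body of the independent saver thread: drain the queue until a NULL sentinel arrives.
struct SaverThread {
    void operator()() const {
        SavedItem* item;
        for (;;) {
            g_save_queue.pop(item);         // blocks here, but this is not a TBB worker
            if (!item) break;               // NULL acts as the shutdown signal
            std::printf("saving item %d\n", item->id);    // stand-in for the real I/O
            delete item;
        }
    }
};

int main() {
    SaverThread body;
    tbb::tbb_thread saver(body);            // runs outside the TBB scheduler

    // A pipeline filter would simply do: g_save_queue.push(new SavedItem(...));
    SavedItem* demo = new SavedItem();
    demo->id = 42;
    g_save_queue.push(demo);                // stand-in for the pipeline's output

    g_save_queue.push(NULL);                // tell the saver to finish up
    saver.join();
    return 0;
}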