Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Type safety in pipeline

uganson
Beginner
394 Views

Hello!

I am a recent user of Intel's TBB (about 6 months). I don't know if this issue has been discussed before (I searched on the forums), or if it is a design decision.

The current implementation in TBB's pipeline and filter classes is not type safe. The operator() receives a void* that has to be cast to the actual data type of the token passed between pipeline stages.
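To make the concern concrete, here is a minimal sketch of an untyped stage. `raw_filter`, `DoubleInt`, and `run_on_int` are hypothetical stand-ins invented for illustration, not the TBB classes themselves; only the `void*` signature mirrors what the post describes.

```cpp
#include <cassert>

// Hypothetical stand-in for an untyped pipeline stage (not the TBB class).
struct raw_filter {
    virtual ~raw_filter() {}
    virtual void* operator()(void* item) = 0;
};

// A stage that assumes its token is an int and doubles it in place.
struct DoubleInt : raw_filter {
    void* operator()(void* item) override {
        // Unchecked cast: the compiler cannot verify the token really is an int.
        int* value = static_cast<int*>(item);
        *value *= 2;
        return value;
    }
};

// Pushes one int token through one stage, the way an untyped pipeline would.
inline int run_on_int(raw_filter& stage, int token) {
    stage(&token);
    return token;
}
```

Nothing in this interface stops a caller from handing `DoubleInt` a pointer to some other type; the mistake would only show up at run time.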

I propose a templated filter implementation with explicit definition of input and output types. The tfilter class looks like:

template <typename Input, typename Output>
class tfilter {
    ...
    virtual Output* operator() (Input*);
};

The operator>> is overloaded to produce compositions of tfilters, and allow a simple syntax to define the pipeline. This composition is produced at compile time using templates, providing compile time type safety.

The pipeline execution looks like:

pipeline_run(numTokens, InputFilter() >> ProcessFilter() >> OutputFilter());
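As a rough illustration of how `operator>>` can check stage types at compile time, here is a small self-contained sketch. It is not the attached tpipeline.h: `typed_stage`, `chain`, `ToDouble`, and `ToLong` are hypothetical names, and the sketch copies stages into the composite by value, which is an assumption made to keep the example simple.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical base that records which token types a stage consumes and produces.
template <typename Input, typename Output>
struct typed_stage {
    typedef Input input_type;
    typedef Output output_type;
};

// Composite of two stages; it is itself a stage from First's input to Second's output.
template <typename First, typename Second>
struct chain : typed_stage<typename First::input_type, typename Second::output_type> {
    First first;
    Second second;
    chain(const First& f, const Second& s) : first(f), second(s) {}
    typename Second::output_type* operator()(typename First::input_type* in) {
        return second(first(in));
    }
};

// Only participates when First's output type equals Second's input type,
// so a mismatched composition fails to compile.
template <typename First, typename Second>
typename std::enable_if<
    std::is_same<typename First::output_type, typename Second::input_type>::value,
    chain<First, Second> >::type
operator>>(const First& f, const Second& s) {
    return chain<First, Second>(f, s);
}

// Two example stages: int -> double -> long.
struct ToDouble : typed_stage<int, double> {
    double out = 0.0;
    double* operator()(int* in) { out = *in * 0.5; return &out; }
};

struct ToLong : typed_stage<double, long> {
    long out = 0;
    long* operator()(double* in) { out = static_cast<long>(*in) + 1; return &out; }
};
```

With these definitions, `ToDouble() >> ToLong()` compiles, while `ToLong() >> ToLong()` is rejected because `long` does not match `double`.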

I wrote an initial working implementation of this idea. There are some issues that have yet to be addressed, but I would like some feedback. Suggestions, improvements, and criticism are welcome.

I attach the tpipeline.h file. I also wrote an extended explanation, rationale and example usage at http://www.nf.ull.es/software/tpipeline.

3 Replies
Alexey-Kukanov
Employee
394 Views

The idea is interesting. There is, however, one issue with your code: it depends on the lifetime of temporary objects, which can be referenced in the PipelineNode class. Therefore, the implementation is safe only if operator>> is used in place of the second parameter to pipeline_run, as you suggested. The bad thing is that it is easy to use it in a different way, e.g. like this:

tbb::tfilter filter_set =
        InputFilter() >> ProcessFilter() >> OutputFilter();
pipeline_run(numTokens, filter_set);

and this can cause a rather hard-to-debug runtime failure even though all compile-time checks pass. Moreover, once operator>> is introduced, I believe some users will expect that partial construction of a filter set is also possible, e.g. like this:

tbb::tfilter temp =
        myInputFilter >> myProcessFilter >> myProcessFilter2 ;
if( some_complex_condition )
    pipeline_run(numTokens, temp >> myOutputToScreen);
else
    pipeline_run(numTokens, temp >> myOutputToFile);

Note that no one would expect anything bad there, since none of the filters are temporary objects; however, a temporary object is created as the result of the first operator>>, and its lifetime ends before it is used by reference in the insertIntoPipeline method called from pipeline_run.
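The lifetime problem can be observed without touching the dead object. The sketch below is a hypothetical model, not the real PipelineNode: `Stage`, `Node`, and `live_nodes_after_partial_build` are invented names, and the node stores addresses only, standing in for a node that holds references.

```cpp
#include <cassert>

// Hypothetical stand-in for a user-defined filter.
struct Stage {};

// Hypothetical node built by operator>>; it keeps addresses of its operands
// and counts how many nodes were built and how many are still alive.
struct Node {
    static int created;
    static int live;
    const void* left;    // address of a Stage or of another Node
    const Stage* right;
    Node(const void* l, const Stage* r) : left(l), right(r) { ++created; ++live; }
    Node(const Node& other) : left(other.left), right(other.right) { ++live; }
    ~Node() { --live; }
};
int Node::created = 0;
int Node::live = 0;

inline Node operator>>(const Stage& a, const Stage& b) { return Node(&a, &b); }
inline Node operator>>(const Node& a, const Stage& b) { return Node(&a, &b); }

// Builds a partial chain from named stages and reports how many nodes survive.
inline int live_nodes_after_partial_build() {
    Stage in, p1, p2;
    Node temp = in >> p1 >> p2;  // the inner (in >> p1) node is a temporary
    (void)temp;                  // temp.left now holds the address of a destroyed node
    return Node::live;
}
```

Two nodes are built but only one is alive once the statement ends, even though every `Stage` is a named object. Any later use of `temp.left` would touch a destroyed node.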

Nevertheless, as I said, the idea is interesting, and both pipeline building syntax and compile-time type checks are quite appealing. So I would ask you to contribute it to TBB. And, of course, if you feel you can elaborate it further to eliminate the shortcomings, that would be great! :)

ARCH_R_Intel
Employee
394 Views

I'm a fan of type safety, and so when I designed the original pipeline class, I thought about making it typesafe. Unfortunately, the type system of C++ seemed to be too weak. In particular, in some situations users would want to choose a particular pipeline sequence at run time. That requires the ability to leave a pipeline in a partially constructed state, as Alexey points out.

However, I think there is a way to adapt uganson's proposal so that pipelines can be left in a partially constructed state. Introduce a new type, call it P<T> for argument's sake (a better name would be appreciated). Let P<T> represent an abstract pointer to the end of a pipeline whose last stage returns a T*. Let P<void> represent a pointer to an empty pipeline. Define "P<T> << tfilter<T,U>" as appending the tfilter and returning a P<U> that abstractly points to the new end of the pipeline.

Then for a short fixed-length pipeline with stages A, B, and C we can write:

myPipeline.makeP() << A << B << C;

But users wanting to dynamically construct a pipeline could still do so. Here's an example:

tfilter<void,Frog> myInputStage;
tfilter<Frog,Frog> myFrogToFrogStages[...];
tfilter<Frog,Prince> myFrogToPrinceStage;
tfilter<Prince,void> myOutputStage;

P<void> p1 = myPipeline.makeP();
P<Frog> p2 = p1 << myInputStage;
for (...) {
    p2 = p2 << myFrogToFrogStages[i++];
}
P<Prince> p3 = p2 << myFrogToPrinceStage; // KISS principle :-)
p3 << myOutputStage;

With the forthcoming C++ 200x "auto" syntax, this would be easier to use, because the declarations of p1, p2, and p3 would reduce to "auto p1=..." etc.
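One possible shape for such a typed handle is sketched below. This is only a guess at how the idea might be realized: `Pipeline`, its `append` and `size` members, and the empty `tfilter` placeholder are all hypothetical, and the pipeline here merely records stage addresses instead of running anything.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Frog {};
struct Prince {};

// Hypothetical typed stage; in this sketch it only carries its token types.
template <typename In, typename Out>
struct tfilter {};

class Pipeline;

// Handle to the current end of a pipeline whose last stage produces T.
template <typename T>
class P {
public:
    explicit P(Pipeline& owner) : owner_(&owner) {}
    // Accepts only a stage whose input type is T; returns a handle typed by its output.
    template <typename U>
    P<U> operator<<(tfilter<T, U>& stage);
private:
    Pipeline* owner_;
};

// Hypothetical untyped pipeline that just remembers the appended stages.
class Pipeline {
public:
    P<void> makeP() { return P<void>(*this); }
    void append(const void* stage) { stages_.push_back(stage); }
    std::size_t size() const { return stages_.size(); }
private:
    std::vector<const void*> stages_;
};

template <typename T>
template <typename U>
P<U> P<T>::operator<<(tfilter<T, U>& stage) {
    owner_->append(&stage);
    return P<U>(*owner_);
}

// Builds a pipeline with a run-time number of Frog-to-Frog stages.
inline std::size_t build_frog_pipeline(std::size_t middle_stages) {
    Pipeline myPipeline;
    tfilter<void, Frog> myInputStage;
    std::vector<tfilter<Frog, Frog> > myFrogToFrogStages(middle_stages);
    tfilter<Frog, Prince> myFrogToPrinceStage;
    tfilter<Prince, void> myOutputStage;

    auto p = myPipeline.makeP() << myInputStage;   // P<Frog>
    for (std::size_t i = 0; i < middle_stages; ++i)
        p = p << myFrogToFrogStages[i];             // still P<Frog>
    auto q = p << myFrogToPrinceStage;             // P<Prince>
    q << myOutputStage;                             // yields P<void>
    return myPipeline.size();
}
```

Because the handle is a small value that refers to the pipeline itself, it can be stored and reassigned between statements, and appending a stage with the wrong input type (for example `p << myOutputStage`) does not compile.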

- Arch

uganson
Beginner
394 Views

Thanks for your reply!

Yes, I see the problem with the temporary objects...

As I see it, there are two kinds of temporary objects introduced here. The first kind is the temporary PipelineNode objects created by the successive calls to operator>>. This class is hidden from the user and is not really needed for pipeline construction; it is only a mechanism for chaining operator>>. If the pipeline is somehow constructed while these objects are still alive, the full tree is unnecessary: for example, a simple linked list of filter* could be kept between all the tfilter instances in the pipeline. I'll try to implement a prototype of this idea and update the discussion here.

The other kind of temporary objects are the actual filters, derived from tfilter, that are constructed inside the expression. This is the case in your first example. I think the only solution is to declare the filter objects before building the pipeline, as in your second example, instead of calling the constructors in place, e.g.:

InputFilter f1;
ProcessFilter f2;
OutputFilter f3;
tbb::tfilter filter_set = f1 >> f2 >> f3;
pipeline_run(numTokens, filter_set);

The same problem can arise with TBB's original pipeline if one calls pipeline.add_filter(SomeFilter()) when adding a new filter. After the call to add_filter, the temporary object constructed by SomeFilter() is destroyed, so when pipeline.run() is called, the filter no longer exists.

I think this is a matter of explaining these risks of misuse in the class documentation, and perhaps adding some runtime checks (in debug mode) that try to catch the use of already destroyed objects when pipeline_run is called. I don't know if there is a way to catch this at compile time.

The underlying problem is that there is no way to make a copy of these temporary objects and then pass them by value instead of by reference. That would need some sort of virtual copy constructor mechanism that C++ doesn't provide :(
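C++ has no built-in virtual copy constructor, but a common workaround is the clone idiom, where each concrete class overrides a virtual `clone()` function. The sketch below shows that idiom under simplifying assumptions: `stage_base`, `AddOne`, `Times`, and `owning_pipeline` are hypothetical names, and tokens are plain ints passed by value.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stage interface with a clone() hook ("virtual copy constructor" idiom).
struct stage_base {
    virtual ~stage_base() {}
    virtual int apply(int token) const = 0;
    virtual std::unique_ptr<stage_base> clone() const = 0;
};

struct AddOne : stage_base {
    int apply(int token) const override { return token + 1; }
    std::unique_ptr<stage_base> clone() const override {
        return std::unique_ptr<stage_base>(new AddOne(*this));
    }
};

struct Times : stage_base {
    int factor;
    explicit Times(int f) : factor(f) {}
    int apply(int token) const override { return token * factor; }
    std::unique_ptr<stage_base> clone() const override {
        return std::unique_ptr<stage_base>(new Times(*this));
    }
};

// Pipeline that stores its own copies, so passing temporaries to add() is safe.
class owning_pipeline {
public:
    void add(const stage_base& stage) { stages_.push_back(stage.clone()); }
    int run(int token) const {
        for (const auto& s : stages_) token = s->apply(token);
        return token;
    }
private:
    std::vector<std::unique_ptr<stage_base> > stages_;
};
```

The cost is that every concrete filter has to implement `clone()`, which is extra boilerplate for users of the library.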

As for dynamically creating the pipelines, that's a use case I hadn't thought of. I will think of a way to implement the idea you propose of using this half-constructed pipeline object. I'll post the update here.

Anyway, for the more complicated use cases, perhaps the only option is to use TBB's standard pipeline and filter classes. After all, if the type-safe wrapper adds too much burden to the code, the advantages of type safety are lost to code that is much harder to maintain.
