Solved: Reduce with non-thread safe data

Sensei_S_ · ‎10-15-2015

Dear all,

I need to use a legacy data structure that is not thread safe; let's call it LDS. Basically, I have a text file, and I send each line to the LDS, which updates itself. This work can be expressed as a reduction (it's even commutative), so I'd like to parallelize it.

I think a parallel pipeline is unsuitable, because it would need a single LDS. I can also access the lines in parallel, but as far as I understand a parallel for would hurt performance: it needs to access the file, and I/O-intensive tasks shouldn't be performed in parallel. I hope I'm mistaken on this point! Also, a parallel for will spawn threads, and I don't know how to handle the LDS in that case (a brand-new LDS per thread? Should the LDS be split?).

Can you suggest how to parallelize this correctly?

Thanks!

Vladimir_P_1234567890 · ‎10-15-2015

You can try tbb::parallel_reduce():

https://software.intel.com/en-us/android/articles/calculation-of-pi-with-intel-threading-building-blocks
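To make the shape of a reduction concrete, here is a minimal sketch that uses only the C++ standard library (std::async) in place of TBB, so it is not tbb::parallel_reduce itself. The `Lds` struct, its `update` and `merge` members, and `reduce_lines` are hypothetical stand-ins, and the sketch assumes that several independent LDS instances can exist at once and that two of them can be combined.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for the legacy structure: not thread safe, but
// several independent instances may exist at the same time (assumption).
struct Lds {
    std::size_t total_chars = 0;
    void update(const std::string& line) { total_chars += line.size(); }
    // Combines two instances; assumed commutative and associative.
    void merge(const Lds& other) { total_chars += other.total_chars; }
};

// Reduces lines[begin, end): splits the range until a chunk holds at most
// `grain` lines, folds each chunk into a fresh Lds, then merges the partials.
Lds reduce_lines(const std::vector<std::string>& lines,
                 std::size_t begin, std::size_t end, std::size_t grain) {
    assert(begin <= end && end <= lines.size() && grain > 0);
    if (end - begin <= grain) {
        Lds local;  // touched by exactly one thread
        for (std::size_t i = begin; i < end; ++i) local.update(lines[i]);
        return local;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    // The left half runs on another thread; the right half runs here.
    auto left = std::async(std::launch::async, [&lines, begin, mid, grain] {
        return reduce_lines(lines, begin, mid, grain);
    });
    Lds right = reduce_lines(lines, mid, end, grain);
    Lds result = left.get();
    result.merge(right);
    return result;
}
```

A call such as `reduce_lines(lines, 0, lines.size(), 100)` returns the combined instance. No LDS is ever shared between threads: each leaf chunk builds its own, and merging only happens after the chunk that produced it has finished.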

--Vladimir

jimdempseyatthecove · ‎10-15-2015

Can you describe the operations performed? For example you mention lines in the file.

Is each line independent of every other line?

Do the results need to be in the same order as the input lines?

If the lines are not independent but interact with each other, can the interaction be structured like an N-body problem?

We need more information on your problem in order to offer suggestions.

Jim Dempsey

Sensei_S_ · ‎10-15-2015

Dear all,

I can safely assume that each line is independent of the others, and the operations on LDS are commutative and associative, so I can discard any ordering.

How can a parallel reduce be performed on a file?

The legacy library for accessing the file (which is composed of lines that are accessible independently, but it's a binary file) does not support iterators. It only supports a void ldsarchive::read(std::size_t line) operation and a bool ldsarchive::read(), which reads a line and returns false when EOF is reached.

Thanks!

jimdempseyatthecove · ‎10-16-2015

Assuming that "the operations on LDS are commutative and associative" within each line (intra-line), then this sounds like a job for a parallel pipeline. The input stage (sequential) only reads a line into the token buffer (when available), then passes the token (line buffer) to a parallel stage for processing. You may or may not need additional stages and/or an output stage.
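The pipeline described above can be sketched with standard C++ threads standing in for TBB's pipeline. Everything named here is hypothetical: `BlockingQueue`, `parse_line`, `run_pipeline`, and the `Lds` stand-in. The sketch also assumes the per-line work can be split into a thread-safe parsing step and a cheap update applied to a single LDS in a serial output stage, which may not match the real library.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal unbounded blocking queue used to hand tokens between stages.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }
    // Signals that no further items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item arrives; returns false once closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Hypothetical single legacy instance, touched only by the output stage.
struct Lds {
    std::size_t total = 0;
    void update(std::size_t parsed) { total += parsed; }
};

// Hypothetical thread-safe per-line work (here: just the line length).
std::size_t parse_line(const std::string& line) { return line.size(); }

Lds run_pipeline(const std::vector<std::string>& lines, unsigned num_workers) {
    assert(num_workers > 0);
    BlockingQueue<std::string> raw;
    BlockingQueue<std::size_t> parsed;
    std::atomic<unsigned> active{num_workers};

    // Serial input stage: stands in for repeated calls to a sequential read().
    std::thread reader([&] {
        for (const auto& line : lines) raw.push(line);
        raw.close();
    });

    // Parallel middle stage: workers share no state except the two queues.
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < num_workers; ++w) {
        workers.emplace_back([&] {
            std::string line;
            while (raw.pop(line)) parsed.push(parse_line(line));
            if (active.fetch_sub(1) == 1) parsed.close();  // last worker out
        });
    }

    // Serial output stage on the calling thread: the only user of the LDS.
    Lds lds;
    std::size_t value = 0;
    while (parsed.pop(value)) lds.update(value);

    reader.join();
    for (auto& w : workers) w.join();
    return lds;
}
```

Unlike a TBB pipeline, this sketch puts no limit on the number of tokens in flight, so a fast reader could buffer the whole file; a bounded queue would be the natural refinement.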

If "the operations on LDS are commutative and associative" across lines (inter-line), then parallelization is a bit more complex, but not necessarily impossible. We would need more information about the program in order to make meaningful suggestions.

Jim Dempsey

RafSchietekat · ‎10-17-2015

#1 "I need to use a legacy data structure that is not thread safe, let's call it LDS." "A parallel pipeline I think is unsuitable, it would need a single LDS."

It still has to be reentrant, otherwise this is not something TBB can help you with, and you should instead have separate processes running in parallel. TBB can only help you if you can have multiple instances/contexts, even if each of them must only be accessed by a single thread at a time. If that is the case, read on.

You could have the LDS in tbb::enumerable_thread_specific (thread-local storage or TLS), even in a parallel processing stage. Note that you can have the TLS variable outside the pipeline, and pass a reference to the parallel stage, so enumerating over the thread-specific values afterwards is not a problem. I'm assuming that, even if the LDS doesn't know how to reduce between instances of itself, you still know how to do that with its output data.
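As a rough illustration of the per-thread idea, the sketch below uses plain std::thread and a vector of instances instead of tbb::enumerable_thread_specific, so it only mimics the pattern and is not the TBB API. `Lds`, `process_with_per_thread_lds`, and `combine` are hypothetical names, and the sketch assumes the LDS output (here a character count) can be combined outside the LDS itself.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the legacy structure. It has no merge operation;
// only its output data (the count) is combined afterwards.
struct Lds {
    std::size_t chars_seen = 0;
    void update(const std::string& line) { chars_seen += line.size(); }
    std::size_t output() const { return chars_seen; }
};

// Every worker owns exactly one slot. The vector lives outside the workers,
// so the per-thread instances can be enumerated once they have all finished.
std::vector<Lds> process_with_per_thread_lds(
        const std::vector<std::string>& lines, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<Lds> slots(num_threads);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed line
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lines, &slots, &next, t] {
            // Claim lines one at a time; slots[t] is used by this thread only.
            for (std::size_t i = next.fetch_add(1); i < lines.size();
                 i = next.fetch_add(1)) {
                slots[t].update(lines[i]);
            }
        });
    }
    for (auto& worker : workers) worker.join();
    return slots;
}

// Reduces the output data of the instances, not the instances themselves.
std::size_t combine(const std::vector<Lds>& slots) {
    std::size_t total = 0;
    for (const auto& lds : slots) total += lds.output();
    return total;
}
```

Calling `combine(process_with_per_thread_lds(lines, 4))` gives the same total as a sequential pass, regardless of which thread happened to process which line, because the combination is commutative and associative.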

As for the input stage (you probably won't get efficient random access from a legacy archive even if the underlying medium would support it), you could either accept getting blocked in one thread at a time (given enough threads), or you could cheat and tell TBB to use an extra thread (using task_scheduler_init), or you could launch a separate thread to do prefetching into a queue and read from that (in which case you can make do with parallel_do()).

Sensei_S_ · ‎10-29-2015

Thank you all for your valuable suggestions!

Reduce with non-thread safe data