missing__zlw
Beginner

Maybe a naive question: will TBB help in writing multiple files?

This might be a general multithreading question. My program needs to write many files, all with different file names.
Will parallelism help here? The same applies to reading, but my reads and writes don't happen at the same time.

If using TBB, which algorithm will help, parallel_for or task?

Thanks.
RafSchietekat
Black Belt

TBB is conceived to improve CPU utilisation, and does not currently deal well with blocking workloads, which includes file I/O.

You may invoke TBB from user threads that also do I/O, but doing I/O from within TBB tasks/algorithms may lead to undersubscription and thus a waste of CPU resources.
jimdempseyatthecove
Black Belt

Your question is general indeed. There are several unknown factors which will affect the answer(s).

Are all the files held on one hard disk with significant (~10ms) seek latency?
Are all the files held on one SSD disk with insignificant (~0ms) seek latency?
Does your disk controller/drive contain a significant buffer?
Is the file overhead mostly directory lookup/entry or read/write?

How much processing per file open?
How much processing per file read/write?

Does the quantity of data fit within the file system cache?
If it does, then this may produce a situation where a benchmark test runs well but the application implementation runs poorly.

It sounds like you need to set up a simple test program with tuning knobs to adjust the characteristics.
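
A minimal sketch of such a test program, assuming C++17 and only the standard library (no TBB). The names `WriteTestConfig` and `run_write_test` are hypothetical, and the knobs shown (file count, bytes per file, thread count) are just a starting set. It does not force data to disk, so for small totals the file system cache may absorb the writes and flatter the result, which is the benchmark trap mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Tuning knobs for the experiment (all names here are hypothetical).
struct WriteTestConfig {
    std::filesystem::path dir;             // target directory on the disk under test
    std::size_t num_files = 64;            // how many files to write
    std::size_t bytes_per_file = 1 << 20;  // payload size per file
    unsigned num_threads = 1;              // how many threads issue writes
};

// Writes num_files files using num_threads threads; returns elapsed seconds.
double run_write_test(const WriteTestConfig& cfg) {
    assert(cfg.num_threads > 0);
    std::filesystem::create_directories(cfg.dir);
    const std::string payload(cfg.bytes_per_file, 'x');

    auto worker = [&](unsigned tid) {
        // Static partition: thread tid writes files tid, tid + T, tid + 2T, ...
        for (std::size_t i = tid; i < cfg.num_files; i += cfg.num_threads) {
            std::ofstream out(cfg.dir / ("file_" + std::to_string(i) + ".bin"),
                              std::ios::binary);
            out.write(payload.data(),
                      static_cast<std::streamsize>(payload.size()));
        }  // the stream is closed here, at the end of each iteration
    };

    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < cfg.num_threads; ++t)
        threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Running it with `num_threads` set to 1, 2, 4 and so on against the real target disk, with file counts and sizes close to the real workload, gives a first indication of whether more writer threads help or hurt.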

Can you sketch your requirements?

How many files?
How much data?
Data being written?
Data being read?
Are reads sequential?
Are writes sequential?
Are reads random?
Are writes random?
Does the order of writes follow the order of reads?
Type of processing?
Is processing of each record/buffer preferred by one thread or multiple threads?

Jim Dempsey
SergeyKostrov
Valued Contributor II

Hi,

I'd like to stay as generic as possible as well.

Yes, any form of multitasking will help but you need to pay attention to synchronization.


Best regards,
Sergey

PS:
Some time ago I worked on an image processing application. It read a big set of images, usually more than 100, from an MS SQL database. At first the application was single-threaded, so a user had to wait a couple of minutes until the last image was loaded. As soon as multi-threaded support was implemented, a user was able to start working after the 1st image was loaded, or with the 10th image as soon as it was loaded, and so on. I would describe all the synchronization-related problems as a "nightmare", and it took a significant amount of time to resolve them...
jimdempseyatthecove
Black Belt

Sergey,

Multi-tasking of I/O can help or hinder; which one depends on the nature of the I/O.
A better generalization is that pairing a single I/O thread with multiple compute threads will help.
File placement, latencies and disk buffering will often change the best strategy.
When the entire file can be read with one read operation, multi-tasking of the I/O thread(s) may be beneficial. For large files that take many reads or writes, reducing the number of threads performing I/O will reduce the seek latencies.
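
The single-I/O-thread arrangement could be sketched roughly as follows, assuming C++17 and plain standard-library threads rather than TBB. `WriteJob`, `JobQueue`, `compute_contents` and `write_with_single_io_thread` are hypothetical names, and `compute_contents` is only a stand-in for the real per-file processing. Compute threads push finished buffers onto a queue, and one writer thread performs every file write.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// One finished unit of work: a file name plus the bytes to put in it.
struct WriteJob {
    std::string name;
    std::string data;
};

// Minimal unbounded queue shared by the compute threads and the writer.
class JobQueue {
public:
    void push(WriteJob job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(job));
        }
        cv_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until a job is available; returns false once closed and drained.
    bool pop(WriteJob& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<WriteJob> q_;
    bool closed_ = false;
};

// Stand-in for the real per-file computation (hypothetical).
std::string compute_contents(std::size_t index) {
    return "result " + std::to_string(index * index) + "\n";
}

// num_compute threads produce file contents; one thread does all the writes.
void write_with_single_io_thread(const std::filesystem::path& dir,
                                 std::size_t num_files, unsigned num_compute) {
    assert(num_compute > 0);
    std::filesystem::create_directories(dir);
    JobQueue queue;

    std::thread writer([&] {
        WriteJob job;
        while (queue.pop(job)) {
            std::ofstream out(dir / job.name, std::ios::binary);
            out << job.data;
        }
    });

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_compute; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < num_files; i += num_compute)
                queue.push({"out_" + std::to_string(i) + ".txt",
                            compute_contents(i)});
        });
    }
    for (auto& w : workers) w.join();
    queue.close();  // no more jobs: let the writer drain the queue and exit
    writer.join();
}
```

The queue here is unbounded for brevity; if the compute side can outrun the disk, a bounded queue would keep memory use in check. Whether one writer thread is the right number still depends on the disk and file sizes discussed above.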

Jim Dempsey