Intel® oneAPI Threading Building Blocks

Maybe a naive question: will TBB help in writing multiple files?

missing__zlw
Beginner
1,014 Views
This might be a general multithreading question. My program needs to write many files, all with different file names.
Will parallelism help here? The same applies to reading, although my reads and writes don't happen at the same time.

If using TBB, which algorithm will help, parallel_for or task?

Thanks.
4 Replies
RafSchietekat
Valued Contributor III
TBB is conceived to improve CPU utilisation, and does not currently deal well with blocking workloads, which includes file I/O.

You may invoke TBB from user threads that also do I/O, but doing I/O from within TBB tasks/algorithms may lead to undersubscription (worker threads sitting blocked while cores stay idle) and thus a waste of CPU resources.
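A minimal sketch of that separation, using only standard C++ so that it stands on its own: the CPU-bound step runs first, and the blocking writes are done by the calling user thread rather than from inside a parallel algorithm. The names `make_payload` and `compute_then_write` and the `out_N.txt` file names are made up for illustration. The plain loop marks the spot where a TBB algorithm such as `tbb::parallel_for` could be used, since nothing in it blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical CPU-bound step: builds the contents of file number `index`.
std::string make_payload(std::size_t index) {
    return "payload for file " + std::to_string(index) + "\n";
}

// Computes every payload first, then writes them from the calling thread.
// Returns the number of files written successfully.
std::size_t compute_then_write(const fs::path& dir, std::size_t file_count) {
    assert(!dir.empty());  // the caller must supply an output directory
    fs::create_directories(dir);

    // CPU-bound phase: no blocking calls here, so this loop is the part
    // that could be handed to a parallel algorithm.
    std::vector<std::string> payloads(file_count);
    for (std::size_t i = 0; i < file_count; ++i) {
        payloads[i] = make_payload(i);
    }

    // Blocking phase: done by the calling user thread, outside any task.
    std::size_t written = 0;
    for (std::size_t i = 0; i < file_count; ++i) {
        std::ofstream out(dir / ("out_" + std::to_string(i) + ".txt"),
                          std::ios::binary);
        out << payloads[i];
        out.flush();
        if (out) {
            ++written;
        }
    }
    return written;
}
```

Calling `compute_then_write("some_dir", 100)` would produce `out_0.txt` through `out_99.txt`. Holding every payload in memory is only reasonable when the data is small; otherwise the two phases have to be interleaved.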
jimdempseyatthecove
Honored Contributor III
Your question is general indeed. There are several unknown factors which will affect the answer(s).

Are all the files held on one hard disk with significant (~10ms) seek latency?
Are all the files held on one SSD disk with insignificant (~0ms) seek latency?
Does your disk controller/drive contain a significant buffer?
Is the file overhead mostly directory lookup/entry or read/write?

How much processing per file open?
How much processing per file read/write?

Does the quantity of data fit within the file system cache?
If it does, then this may produce a situation where a benchmark test runs well but the application implementation runs poorly.

It sounds like you need to set up a simple test program with tuning knobs to adjust the characteristics.
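Such a test program could start out like the sketch below. It is standard C++ only, and the knob names (`file_count`, `bytes_per_file`, `writer_threads`) and the `bench_N.bin` file names are assumptions picked for illustration, not anything prescribed in this thread. Each writer thread takes every Nth file, and the function reports how many files were written and how long it took.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

namespace fs = std::filesystem;

// Tuning knobs (hypothetical names and defaults).
struct Knobs {
    std::size_t file_count = 64;           // how many files to write
    std::size_t bytes_per_file = 1 << 20;  // size of each file in bytes
    unsigned writer_threads = 1;           // threads performing the writes
};

struct Result {
    std::size_t files_written = 0;
    double seconds = 0.0;
};

Result run_write_test(const fs::path& dir, const Knobs& k) {
    assert(k.writer_threads > 0);
    fs::create_directories(dir);
    const std::string block(k.bytes_per_file, 'x');
    std::vector<std::size_t> written(k.writer_threads, 0);

    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < k.writer_threads; ++t) {
        pool.emplace_back([&, t] {
            // Thread t writes files t, t + writer_threads, and so on.
            for (std::size_t i = t; i < k.file_count; i += k.writer_threads) {
                std::ofstream out(dir / ("bench_" + std::to_string(i) + ".bin"),
                                  std::ios::binary);
                out.write(block.data(),
                          static_cast<std::streamsize>(block.size()));
                out.flush();
                if (out) {
                    ++written[t];  // each thread touches only its own slot
                }
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    Result r;
    for (std::size_t n : written) {
        r.files_written += n;
    }
    r.seconds = elapsed.count();
    return r;
}
```

Running it with `writer_threads` set to 1, 2, 4 and so on, on the actual target disk, gives a first impression of whether more writers help or hurt. The timing stops when the data has been handed to the operating system, so a run that fits in the file system cache can look much faster than the real application will be.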

Can you sketch your requirements?

How many files?
How much data?
Data being written?
Data being read?
Are reads sequential?
Are writes sequential?
Are reads random?
Are writes random?
Does the order of writes follow the order of reads?
Type of processing?
Is processing of each record/buffer preferably done by one thread or by multiple threads?

Jim Dempsey
SergeyKostrov
Valued Contributor II
Hi,

I'd like to stay as generic as possible as well.

Yes, any form of multitasking will help but you need to pay attention to synchronization.


Best regards,
Sergey

PS:
Some time ago I worked on an image processing application. It read a big set of images, usually more than 100, from an MS SQL database. At first the application was single-threaded, so a user had to wait a couple of minutes until the last image was loaded. Once multi-threading support was implemented, a user could start working as soon as the 1st image was loaded, or with the 10th image as soon as it was loaded, and so on. I would describe the synchronization-related problems as a "nightmare", and it took a significant amount of time to resolve them...
jimdempseyatthecove
Honored Contributor III
Sergey,

Multi-tasking of I/O can help or hinder. Which one depends on the nature of the I/O.
A better generalization is that a single I/O thread working alongside multiple compute threads will help.
File placement, latencies and disk buffering will often change which strategy is best.
When an entire file can be read with one read operation, multi-tasking the I/O thread(s) may be beneficial. For large files that need many reads or writes, reducing the number of threads performing I/O will reduce the seek latencies.
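One way to arrange a single I/O thread with multiple compute threads is a small job queue, sketched here in standard C++. `WriteJob`, `JobQueue`, `run_pipeline` and the `part_N.txt` file names are hypothetical, and the "compute" step is a placeholder string. Compute threads push finished buffers onto the queue, and one writer thread drains it, so only one thread ever touches the disk.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

struct WriteJob {
    fs::path path;
    std::string data;
};

// Minimal thread-safe queue shared by the compute threads and the writer.
class JobQueue {
public:
    void push(WriteJob job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        ready_.notify_one();
    }
    // Signals that no more jobs will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until a job is available; returns false once closed and empty.
    bool pop(WriteJob& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !jobs_.empty(); });
        if (jobs_.empty()) {
            return false;
        }
        out = std::move(jobs_.front());
        jobs_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<WriteJob> jobs_;
    bool closed_ = false;
};

// Returns the number of files the single writer thread wrote successfully.
std::size_t run_pipeline(const fs::path& dir, std::size_t file_count,
                         unsigned compute_threads) {
    assert(compute_threads > 0);
    fs::create_directories(dir);
    JobQueue queue;
    std::size_t written = 0;  // touched only by the writer until it is joined

    std::thread writer([&] {
        WriteJob job;
        while (queue.pop(job)) {
            std::ofstream out(job.path, std::ios::binary);
            out << job.data;
            out.flush();
            if (out) {
                ++written;
            }
        }
    });

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < compute_threads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < file_count; i += compute_threads) {
                // Placeholder for the real CPU-bound work.
                std::string data = "result " + std::to_string(i) + "\n";
                queue.push({dir / ("part_" + std::to_string(i) + ".txt"),
                            std::move(data)});
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    queue.close();  // all jobs are queued; let the writer drain and exit
    writer.join();
    return written;
}
```

The queue here is unbounded, so compute threads that outrun the disk will pile up buffers in memory; a real program would probably cap the queue length. Whether one writer is actually the best choice still depends on the disk and file characteristics listed earlier in the thread.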

Jim Dempsey