This might be a general multithreading question. My program needs to write many files, all with different file names.
Will parallelism help here? The same applies to reading, but my reads and writes don't happen at the same time.
If I use TBB, which construct would help: parallel_for or task?
Thanks.
4 Replies
TBB is designed to improve CPU utilisation and does not currently deal well with blocking workloads, which include file I/O.
You may invoke TBB from user threads that also do I/O, but doing I/O from within TBB tasks/algorithms may lead to undersubscription and thus a waste of CPU resources.
Your question is general indeed. There are several unknown factors which will affect the answer(s).
Are all the files held on one hard disk with significant (~10ms) seek latency?
Are all the files held on one SSD with insignificant (~0ms) seek latency?
Does your disk controller/drive contain a significant buffer?
Is the file overhead mostly directory lookup/entry or read/write?
How much processing per file open?
How much processing per file read/write?
Does the quantity of data fit within the file system cache?
If it does, then this may produce a situation where a benchmark test runs well but the application implementation runs poorly.
It sounds like you need to set up a simple test program with tuning knobs to adjust the characteristics.
Can you sketch your requirements?
How many files?
How much data?
Data being written?
Data being read?
Are reads sequential?
Are writes sequential?
Are reads random?
Are writes random?
Does the order of writes follow the order of reads?
Type of processing?
Is processing of each record/buffer preferably done by one thread or by multiple threads?
Jim Dempsey
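The "simple test program with tuning knobs" suggested above could look something like the following. This is only a minimal sketch using standard C++17 threads rather than TBB; the `Knobs` struct, the `time_writes` function, and the `file_N.bin` naming are all hypothetical names chosen for illustration. Note that it measures writes as seen through the OS file system cache, so, as warned above, small data sets may look faster here than in the real application.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

namespace fs = std::filesystem;

// Tuning knobs (hypothetical names): adjust to resemble the real workload.
struct Knobs {
    std::size_t file_count = 64;            // how many distinct files to write
    std::size_t bytes_per_file = 1 << 16;   // payload size per file
    unsigned thread_count = 4;              // threads writing concurrently
};

// Writes knobs.file_count files into dir, split across knobs.thread_count
// threads, and returns the elapsed wall-clock time in seconds.
double time_writes(const fs::path& dir, const Knobs& knobs) {
    assert(knobs.thread_count > 0);
    fs::create_directories(dir);
    const std::string payload(knobs.bytes_per_file, 'x');

    auto worker = [&](unsigned tid) {
        // Thread tid handles files tid, tid + T, tid + 2T, ...
        for (std::size_t i = tid; i < knobs.file_count; i += knobs.thread_count) {
            std::ofstream out(dir / ("file_" + std::to_string(i) + ".bin"),
                              std::ios::binary);
            out.write(payload.data(),
                      static_cast<std::streamsize>(payload.size()));
        }  // each ofstream is flushed to the OS and closed at end of scope
    };

    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < knobs.thread_count; ++t)
        threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_writes` repeatedly with `thread_count` set to 1, 2, 4, and so on, and with data sizes both below and above the file system cache size, should give a rough picture of whether parallel writes help on a given disk.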
Hi,
I'd like to stay as generic as possible as well.
Yes, any form of multitasking will help but you need to pay attention to synchronization.
Best regards,
Sergey
PS:
Some time ago I worked on an image processing application. It read a big set of images, usually more than 100, from an MS SQL database. At first the application was single-threaded, and because of this a user had to wait a couple of minutes until the last image was loaded. As soon as multi-threaded support was implemented, a user was able to start working after the 1st image was loaded, or with the 10th image as soon as it was loaded, and so on. I would describe all the synchronization-related problems as a "nightmare", and it took a significant amount of time to resolve them...
Sergey,
Multi-tasking of I/O can help or hinder. Which one depends on the nature of the I/O.
A better generalization is that pairing a single I/O thread with multiple compute threads will help.
File placement, latencies and disk buffering will often change the best strategy.
When the entire file can be read with one read operation, multi-tasking of the I/O thread(s) may be beneficial. For large files that need many reads or writes, reducing the number of threads performing I/O will reduce the seek latencies.
Jim Dempsey
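The "single I/O thread with multiple compute threads" arrangement described above might be sketched as follows. This is an illustration under assumptions, not a tuned implementation: it uses plain standard C++17 threads instead of TBB, and `WriteJob`, `JobQueue`, `compute`, `run_pipeline`, and the `out_N.txt` file names are hypothetical. The compute threads hand finished results to a queue, and one dedicated thread performs all the writes so the disk sees a single stream of requests.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// One finished result waiting to be written: target path plus its bytes.
struct WriteJob {
    fs::path path;
    std::string data;
};

// Minimal blocking queue shared by the compute threads and the I/O thread.
class JobQueue {
public:
    void push(WriteJob job) {
        { std::lock_guard<std::mutex> lock(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
    // Signals that no more jobs will be pushed.
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until a job is available; returns false once closed and drained.
    bool pop(WriteJob& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return closed_ || !jobs_.empty(); });
        if (jobs_.empty()) return false;
        out = std::move(jobs_.front());
        jobs_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<WriteJob> jobs_;
    bool closed_ = false;
};

// Stand-in for the real per-file processing (hypothetical).
std::string compute(std::size_t index) {
    return "result " + std::to_string(index) + "\n";
}

// Runs compute_threads workers that produce file_count results while a
// single I/O thread writes them into dir. Returns the number of files written.
std::size_t run_pipeline(const fs::path& dir, std::size_t file_count,
                         unsigned compute_threads) {
    assert(compute_threads > 0);
    fs::create_directories(dir);
    JobQueue queue;
    std::size_t written = 0;  // touched only by the I/O thread until joined

    std::thread io([&] {
        WriteJob job;
        while (queue.pop(job)) {
            std::ofstream out(job.path, std::ios::binary);
            out << job.data;
            ++written;
        }
    });

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < compute_threads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < file_count; i += compute_threads)
                queue.push({dir / ("out_" + std::to_string(i) + ".txt"),
                            compute(i)});
        });
    }
    for (auto& w : workers) w.join();
    queue.close();  // all producers are done; let the I/O thread drain and exit
    io.join();
    return written;
}
```

In a TBB-based program the compute side could be replaced by a TBB algorithm while the writer stays an ordinary user thread, which matches the earlier advice about keeping blocking I/O out of TBB tasks. The queue here is unbounded, so a real application would probably want to cap it if computing outpaces the disk.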