I have developed an MPI-parallel application that uses the parallel HDF5 library for I/O. The application writes several 3D arrays collectively, and its performance is to a large extent I/O bound. While developing and testing on small systems (e.g., an i7 with 8 cores), I was getting good scaling for the I/O operations, but after switching to DevCloud the I/O performance hits a wall: writing a simple 250 MB file takes around 5-6 minutes, when it should take a fraction of a second (it took 0.01 s on the i7 with 8 MPI processes). So far I have tested the application on a single DevCloud node with 24 MPI processes. I don't know which parallel file system DevCloud uses in the background, or what maximum block size it can handle. The test above used a small grid, for which the file size is just 250 MB, but during the production phase files may grow to 3-5 GB, and that would be a huge problem.
Please find attached the module that I'm using for parallel I/O. I can't figure out whether I'm doing something incorrectly or whether the bottleneck is the parallel file system on DevCloud.
Any suggestions on how to improve the parallel I/O efficiency would be warmly welcomed.
Thanks and Regards,
Umair
Hi Umair,
I understand the performance issue you are seeing on DevCloud. One thing I want to add here is that DevCloud is not built for compute-intensive or high-end HPC applications; it is an environment for testing and exploring the toolkits provided by Intel.
You will face this performance issue on DevCloud even if you get the environment details, as the fabric is not designed for such applications. You can still run your applications and obtain outputs across all nodes.
Warm Regards,
Abhishek
Hi Umair,
Could you please confirm whether your issue has been resolved?
Thank you
Dear Abhishek,
The issue is still not resolved; parallel HDF5 I/O remains a huge bottleneck. I had to switch to ASCII output, with each process writing its own file.
Regards,
Umair
Hi,
We suggest, if possible, loading the entire file into main memory and then doing the processing. Alternatively, you can place your data in the "/tmp" directory.
Note that data in the /tmp directory may get lost, since it is shared.
You will still face performance issues on DevCloud, as it is not designed for I/O-intensive applications.
Can you please send us a small reproducer code for your scenario?
Warm Regards,
Abhishek
Hi,
Please give us a follow-up update so that we can close this thread accordingly.
Thank you
Hi Umair,
Could you please give us an update on your issue?
Thank you
Hi,
We are closing this issue; please feel free to post a new thread if you have any further issues.
Warm Regards,
Abhishek
