Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

Bottleneck in parallel I/O on DevCloud

Mohammad_Umair
Beginner

I have developed an application that is parallelized with MPI and uses a parallel HDF5 library for I/O. The application writes several 3D arrays collectively, and its performance is largely I/O bound. While developing and testing on small systems, e.g. an i7 with 8 processors, the I/O operations scaled well, but when I switched to DevCloud the I/O performance hit a wall. It takes around 5-6 minutes to write a simple 250 MB file. Ideally, this should take a fraction of a second (it took 0.01 s on the i7 with 8 MPI processes). On DevCloud I have tested the application only on 1 node, using 24 MPI processes. I don't know which parallel file system DevCloud uses in the background or the maximum block size it can handle. I ran the test above on a small grid, for which the file size is just 250 MB, but in production the file size may grow to 3-5 GB, and that would be a huge problem.
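For scale, the timings quoted above can be turned into effective write rates. The following is a back-of-the-envelope sketch in Python, not a measurement: it assumes 1 MB = 10^6 bytes, takes 5.5 minutes as the midpoint of the reported 5-6 minute range, and assumes write time grows linearly with file size when projecting to 5 GB.

```python
# Back-of-the-envelope throughput from the figures quoted in the post.
FILE_SIZE_MB = 250.0           # reported file size
DEVCLOUD_SECONDS = 5.5 * 60    # assumed midpoint of the reported 5-6 minutes
I7_SECONDS = 0.01              # reported time on the i7 with 8 MPI processes
PRODUCTION_SIZE_MB = 5000.0    # upper end of the expected 3-5 GB range


def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Return the average write rate in MB/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


devcloud_rate = throughput_mb_per_s(FILE_SIZE_MB, DEVCLOUD_SECONDS)
i7_rate = throughput_mb_per_s(FILE_SIZE_MB, I7_SECONDS)

# Projection assumes the DevCloud rate stays constant as the file grows.
projected_minutes = PRODUCTION_SIZE_MB / devcloud_rate / 60.0

print(f"DevCloud rate:       {devcloud_rate:.2f} MB/s")
print(f"i7 rate:             {i7_rate:.0f} MB/s")
print(f"5 GB on DevCloud:    {projected_minutes:.0f} minutes (linear projection)")
```

Under these assumptions the DevCloud run writes at under 1 MB/s, and a 5 GB file would take well over an hour. The i7 figure works out to roughly 25 GB/s, so that timing may reflect buffered writes rather than data reaching disk; this is a guess and was not checked.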

Please find the module that I'm using for parallel I/O in the attachment. I can't figure out whether the problem is caused by something I'm doing incorrectly or by the parallel file system on DevCloud.

Any suggestions on how to improve the parallel I/O efficiency would be warmly welcomed.

Thanks and Regards,

Umair

AbhishekD_Intel
Moderator

Hi Umair,

I understand the performance issue you are seeing on DevCloud. One thing I want to add is that DevCloud is not built for compute-intensive or high-end HPC applications. It is an environment for testing and exploring the toolkits provided by Intel.

You will definitely face performance issues on DevCloud even if you obtain the environment details, as the fabrics are not designed for such applications. You can still run your application and obtain the outputs across all nodes.

 

Warm Regards,

Abhishek

 

AbhishekD_Intel
Moderator

Hi Umair,

Could you please confirm whether your issue has been resolved?

 

Thank you

Mohammad_Umair
Beginner

Dear Abhishek,

The issue is still not resolved; parallel HDF5 I/O is still a huge bottleneck. I had to switch to ASCII output, with each process writing its own file.
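The file-per-process fallback described above might look roughly like the sketch below. It is an illustration in Python, not the code from the attachment: the file naming scheme, the one-value-per-line layout, and the helper names `rank_file_path` and `write_rank_file` are all hypothetical, and the rank is passed in as a plain integer instead of being read from an MPI communicator.

```python
import tempfile
from pathlib import Path
from typing import Sequence


def rank_file_path(out_dir: Path, rank: int) -> Path:
    """Build a per-rank file name (hypothetical naming scheme)."""
    return out_dir / f"field_rank{rank:04d}.txt"


def write_rank_file(out_dir: Path, rank: int, values: Sequence[float]) -> Path:
    """Write one rank's local values as ASCII, one value per line."""
    path = rank_file_path(out_dir, rank)
    with path.open("w") as handle:
        for value in values:
            handle.write(f"{value:.6e}\n")
    return path


if __name__ == "__main__":
    # Stand-in for an MPI run: loop over ranks instead of launching processes.
    # In a real MPI program each process would call write_rank_file once,
    # with its own rank and its own slice of the array.
    n_ranks = 4
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        for rank in range(n_ranks):
            local_values = [rank + 0.1 * i for i in range(3)]
            write_rank_file(out_dir, rank, local_values)
        print(sorted(p.name for p in out_dir.iterdir()))
```

Because no two processes touch the same file, this pattern needs no coordination between writers. The trade-off is that the number of files grows with the process count and the pieces have to be merged or indexed when the data is read back.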

Regards,

Umair

 

 

AbhishekD_Intel
Moderator

Hi, 

We suggest that, if possible, you load the entire file into main memory and then do the processing. Alternatively, you can place your data in the "/tmp" directory.

Note that the data may get lost from the /tmp directory since it is shared.
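One way to act on the /tmp suggestion while limiting the risk of losing data is to write the file in a private scratch subdirectory and copy it to a persistent location as soon as it is complete. The sketch below illustrates that idea in Python. The helper name `write_via_scratch`, the `io_stage_` prefix, and the paths are hypothetical, and whether /tmp is actually faster than the home directory on DevCloud is an assumption that was not tested here.

```python
import shutil
import tempfile
from pathlib import Path


def write_via_scratch(data: bytes, final_path: Path, scratch_root: Path) -> Path:
    """Write data under scratch_root, then copy the finished file to final_path."""
    # A private subdirectory avoids name clashes with other users of a shared /tmp.
    scratch_dir = Path(tempfile.mkdtemp(prefix="io_stage_", dir=scratch_root))
    try:
        staged = scratch_dir / final_path.name
        staged.write_bytes(data)
        final_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(staged, final_path)
    finally:
        # Remove the staging directory whether or not the copy succeeded.
        shutil.rmtree(scratch_dir, ignore_errors=True)
    return final_path


if __name__ == "__main__":
    scratch_root = Path(tempfile.gettempdir())  # typically /tmp on Linux
    # A temporary directory stands in for the persistent destination here;
    # on a real system this would be a directory under the user's home.
    with tempfile.TemporaryDirectory() as persistent:
        target = Path(persistent) / "results" / "output.bin"
        write_via_scratch(b"\x00" * 1024, target, scratch_root)
        print(target.stat().st_size)
```

The copy step still goes through the slower file system, so this only helps when the application performs many small writes that are cheaper against the scratch location than against the final destination.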

You will still face performance issues on DevCloud, as it is not designed for I/O-intensive applications.

Can you please send us a small reproducer code for your scenario?

 

Warm Regards,

Abhishek

0 Kudos
AbhishekD_Intel
Moderator

Hi,

Please give us a follow-up update so that we can close this thread accordingly.

 

Thank you

AbhishekD_Intel
Moderator

Hi Umair,

Could you please give us an update on your issue?

 

Thank you



AbhishekD_Intel
Moderator

Hi,

We are closing this issue. Please feel free to post a new thread if you have any further issues.


Warm Regards,

Abhishek

