Fortran Arrays in CESM

Amlesh_K_
Beginner
353 Views

Why does copying just the first element of a local array to the Xeon Phi coprocessor (in Fortran) take so much time (on the order of E-003 per invocation of the function), while for global arrays the time for one element is on the order of E-005?

3 Replies
Frances_R_Intel
Employee

Are you using OFFLOAD_REPORT to determine your times? You can find a description of the OFFLOAD_REPORT environment variable and examples of what the Offload Report contains in the User and Reference Guide for the Intel® Fortran Compiler at https://software.intel.com/en-us/intel-software-technical-documentation. If you set the OFFLOAD_REPORT environment variable to 3 before running your code, it will tell you how much time the host and the coprocessor used performing the offload, how much data was transferred and how much data space was allocated on the coprocessor for this transfer.
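As a minimal sketch of the step described above, assuming a POSIX-style shell on the host, the variable can be set like this before launching the program. The executable name in the comment is a hypothetical placeholder, not something from this thread.

```shell
# Request the most detailed offload report (level 3) for programs
# started from this shell session.
export OFFLOAD_REPORT=3

# Then run your offload-enabled program from the same shell, for example:
#   ./my_offload_app        (hypothetical name, substitute your own binary)
```

Exporting the variable keeps it in effect for every later run in that session; unset it (`unset OFFLOAD_REPORT`) when the extra output is no longer wanted.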

Without seeing your code or knowing how you performed your timing, I won't comment on your results other than to say that I suspect part of the problem is caused by the coprocessor needing to allocate space on the stack for the complete local array, even if you pass only one element. This is actually what you would want to happen; you want the shape of the array to match between the host and coprocessor. This space allocation occurs with every offload that uses that array. If you don't want to allocate all that space, you might want to look into the INTO modifier for offload data transfers, which basically does a copy from one variable on the host into a different variable on the coprocessor - very handy if you want to use only a single element of an array or use an array section.

For more information on when data space is allocated, you might want to check out the article: Effective Use of the Intel Compiler's Offload Features

Amlesh_K_
Beginner

There were some issues related to global and local data structures, but they can't be avoided. Thanks a lot.

Where will the OFFLOAD_REPORT output be written?

Frances_R_Intel
Employee

The report goes to standard output (stdout).
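Because the report is printed rather than written to a file, one way to keep a copy is to pipe the program's output through `tee`. The helper below is only a sketch: `run_with_report` is a hypothetical name introduced here for illustration, and it assumes a POSIX-style shell.

```shell
# run_with_report LOGFILE COMMAND [ARGS...]
# Runs COMMAND with OFFLOAD_REPORT=3 set for that command only, and copies
# everything it prints on standard output to LOGFILE while still showing it
# on the terminal. (Hypothetical helper, not part of any Intel tool.)
run_with_report() {
    log="$1"
    shift
    OFFLOAD_REPORT=3 "$@" | tee "$log"
}
```

For example, `run_with_report offload_report.log ./my_offload_app` (with your own executable in place of the placeholder name) would leave the program's output, including the offload report, in `offload_report.log`.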
