Why does copying just the first element of a local array to the Xeon Phi coprocessor (in Fortran) take so much time (on the order of E-003 per invocation of the function), while for global arrays the time is on the order of E-005 for one element?
Are you using OFFLOAD_REPORT to determine your times? You can find a description of the OFFLOAD_REPORT environment variable and examples of what the Offload Report contains in the User and Reference Guide for the Intel® Fortran Compiler at https://software.intel.com/en-us/intel-software-technical-documentation. If you set the OFFLOAD_REPORT environment variable to 3 before running your code, it will tell you how much time the host and the coprocessor used performing the offload, how much data was transferred and how much data space was allocated on the coprocessor for this transfer.
Without seeing your code or knowing how you performed your timing, I won't comment on your results other than to say that I suspect part of the problem is caused by the coprocessor needing to allocate space on the stack for the complete local array, even if you pass only one element. This is actually what you would want to happen; you want the shape of the array to match between the host and coprocessor. This space allocation occurs with every offload that uses that array. If you don't want to allocate all that space, you might want to look into the INTO modifier for offload data transfers, which basically does a copy from one variable on the host into a different variable on the coprocessor - very handy if you want to use only a single element of an array or use an array section.
For more information on when data space is allocated, you might want to check out the article: Effective Use of the Intel Compiler's Offload Features
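As a rough illustration of the INTO modifier mentioned above, here is a minimal sketch that copies only the first element of a large local array into a separate one-element variable on the coprocessor. The names `big`, `first`, and `res` are hypothetical, and the directive syntax is written from memory of the Intel offload documentation, so please check it against the compiler reference before relying on it. Compilers without offload support treat the `!DIR$` lines as comments, in which case no copy into `first` takes place.

```fortran
program into_sketch
  implicit none
  integer, parameter :: n = 1000000
  real :: big(n)     ! large local array on the host (hypothetical name)
  real :: first(1)   ! one-element variable used on the coprocessor (hypothetical name)
  real :: res

  big = 1.0
  big(1) = 42.0
  first = 0.0
  res = 0.0

  ! Assumed syntax: copy only big(1) on the host into first(1) on the
  ! coprocessor, so the offload region never refers to the whole of big.
  !DIR$ OFFLOAD BEGIN TARGET(mic) IN(big(1:1) : INTO(first(1:1))) OUT(res)
  res = 2.0 * first(1)
  !DIR$ END OFFLOAD

  print *, 'res = ', res
end program into_sketch
```

Running a build like this with OFFLOAD_REPORT set to 3 should show how much data was transferred and allocated for the offload, which makes it easy to compare against passing the array element directly.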
There were some issues related to global and local data structures, but they can't be avoided. Thanks a lot.
Where is the OFFLOAD_REPORT output written?
The report goes to standard output.