I have a basic question. Suppose I issue the following three offloads ASYNCHRONOUSLY to the same MIC device from the same host thread:
1. Transfer a block of data tied to a host pointer v to the device (in clause)
2. Offload a function call, one of whose arguments is the pointer v
3. Transfer the output data from the MIC device back to the pointer v (out clause)
Is it correct to assume that the mic device does not start running the function in #2 until the data input to it in #1 is complete? Is it correct to assume that the mic device does not do the data output in #3 until the function call in #2 is complete?
When you offload asynchronously, you should not depend on a previous offload completing before the current one starts. You need to use the signal and wait clauses to control the order.
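The ordering described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the Intel compiler's offload pragmas, uses addresses as signal tags, and the names `scale_in_place`, `run_pipeline`, `compute_done`, and `output_done` are hypothetical. On a compiler without offload support the pragmas are ignored and the code simply runs on the host.

```c
#include <stddef.h>

/* Mark the kernel for the coprocessor only when Intel offload is enabled. */
#ifdef __INTEL_OFFLOAD
#define MIC_TARGET __attribute__((target(mic)))
#else
#define MIC_TARGET
#endif

/* Hypothetical kernel standing in for the function call in step 2. */
MIC_TARGET void scale_in_place(double *v, size_t n, double factor)
{
    for (size_t i = 0; i < n; ++i)
        v[i] *= factor;
}

/* Hypothetical driver: chains the three asynchronous offloads explicitly. */
void run_pipeline(double *v, size_t n, double factor)
{
    /* The addresses of these variables serve as signal tags. */
    char compute_done = 0;
    char output_done = 0;
    (void)compute_done; /* avoid unused warnings when pragmas are ignored */
    (void)output_done;

    /* Step 1: asynchronous host-to-device transfer; keep the buffer allocated. */
    #pragma offload_transfer target(mic:0) \
        in(v : length(n) alloc_if(1) free_if(0)) signal(v)

    /* Step 2: start the computation only after step 1 has signalled. */
    #pragma offload target(mic:0) \
        nocopy(v : length(n) alloc_if(0) free_if(0)) \
        wait(v) signal(&compute_done)
    {
        scale_in_place(v, n, factor);
    }

    /* Step 3: copy the result back only after step 2 has signalled. */
    #pragma offload_transfer target(mic:0) \
        out(v : length(n) alloc_if(0) free_if(1)) \
        wait(&compute_done) signal(&output_done)

    /* Independent host work could overlap with the device here. */

    /* Block the host until the output transfer has completed. */
    #pragma offload_wait target(mic:0) wait(&output_done)
}
```

Calling `run_pipeline(v, n, 2.0)` doubles every element of `v`. Each stage names the tag of the stage before it in its wait clause, so the order does not rely on any implicit queueing.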
Thank you --- that is what I expected.
The Intel compiler manual says the following about signal (under #pragma offload signal as well as #pragma offload_transfer signal):
- signal
An optional integer expression that serves as a handle on an asynchronous data transfer or computational activity. The computation performed by the offload clause and any results returned from the offload using out clauses occurs concurrently with CPU execution of the code after the pragma. If this clause is not used, then the entire offload and associated data transfer are executed synchronously. The CPU will not continue past the pragma until it has completed.
This clause refers to a specific target device so you must specify a target-number in the target clause that is greater than or equal to zero.
Why does the documentation mention only out clauses and not in clauses? Can we assume that in clauses, which transfer data from host to device, also occur concurrently?
I assume that the async transfers are done using DMA. If so, memory on the host needs to be pinned or registered to prevent memory from being paged out. At what point is the memory pinned and then unpinned?
Hi divakar,
Signal can be used for transferring data asynchronously from the device to the host as well. Please refer to "About Asynchronous Data Transfer" (http://software.intel.com/en-us/node/459120). The code sample in that section demonstrates how to transfer data to and from the coprocessor asynchronously. Thank you.
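For readers without access to the linked page, here is a rough sketch of an asynchronous round trip using only data-transfer pragmas. It is not the sample from the Intel documentation; the function `round_trip` and the tags `in_tag` and `out_tag` are hypothetical, and the code assumes the Intel compiler's offload pragmas (other compilers ignore them and run everything on the host).

```c
#include <stddef.h>

/* Hypothetical helper: asynchronous upload overlapped with host work,
   followed by an asynchronous download. Returns the host-side sum. */
double round_trip(double *buf, size_t n)
{
    double host_sum = 0.0;
    char in_tag = 0;   /* address used as the tag for the upload */
    char out_tag = 0;  /* address used as the tag for the download */
    (void)in_tag;      /* avoid unused warnings when pragmas are ignored */
    (void)out_tag;

    /* Start the host-to-device copy and return to the host immediately. */
    #pragma offload_transfer target(mic:0) \
        in(buf : length(n) alloc_if(1) free_if(0)) signal(&in_tag)

    /* Host work that only reads buf can overlap with the transfer. */
    for (size_t i = 0; i < n; ++i)
        host_sum += buf[i];

    /* Make sure the upload has finished before touching the device copy. */
    #pragma offload_wait target(mic:0) wait(&in_tag)

    /* Start the device-to-host copy, again asynchronously. */
    #pragma offload_transfer target(mic:0) \
        out(buf : length(n) alloc_if(0) free_if(1)) signal(&out_tag)

    /* Wait before the host reads buf again. */
    #pragma offload_wait target(mic:0) wait(&out_tag)

    return host_sum;
}
```

Both directions use the same pattern: a signal clause makes the transfer asynchronous, and a matching wait marks the point where the host needs the data to be in place.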