Hello,
I have a question concerning the usage of structs. My current kernel accesses two buffers using a struct in the following way:
struct pair { float first; float second; };

inline const float f(const struct pair param)
{
    return param.first * param.second;
}

inline const struct pair access_func(__global float const * const a, __global float const * const b, const int i)
{
    struct pair res = { a[ i ], b[ i ] };
    return res;
}

// slow
__kernel ...(__global float const * const a, __global float const * const b)
{
    // ...
    x = f( access_func( a, b, i ) );
    // ...
}
When I alter the kernel in the following way it runs much faster:
// fast
__kernel ...(__global float const * const a, __global float const * const b)
{
    // ...
    x = a[ i ] * b[ i ];
    // ...
}
Is there a way to let the compiler do this optimization? The NVIDIA compiler seems to be able to do this, since I don't see a difference in runtime on a GPU.
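To make the intended equivalence concrete, here is a minimal host-side sketch in plain C (not OpenCL C: the `__global` qualifiers and the kernel wrapper are dropped). The names `slow_variant`, `fast_variant` and `check_variants` are hypothetical and only for illustration. It assumes the struct is filled from `a[ i ]` and `b[ i ]`, and it only shows that both forms yield the same value; it says nothing about how any particular compiler handles them.

```c
#include <assert.h>

struct pair { float first; float second; };

/* Multiplies the two members, as f() does in the kernel code. */
static inline float f(const struct pair param)
{
    return param.first * param.second;
}

/* Packs element i of both buffers into a struct (assumed intent). */
static inline struct pair access_func(const float *a, const float *b, int i)
{
    struct pair res = { a[i], b[i] };
    return res;
}

/* "Slow" form: the value goes through the struct and two helper calls. */
static float slow_variant(const float *a, const float *b, int i)
{
    return f(access_func(a, b, i));
}

/* "Fast" form: what the slow form should reduce to after inlining. */
static float fast_variant(const float *a, const float *b, int i)
{
    return a[i] * b[i];
}

/* Hypothetical helper: aborts if the two forms disagree on any index. */
static void check_variants(const float *a, const float *b, int n)
{
    for (int i = 0; i < n; ++i)
        assert(slow_variant(a, b, i) == fast_variant(a, b, i));
}
```

Calling `check_variants(a, b, n)` on two float arrays of length `n` should pass silently, since both forms perform the same single-precision multiplication. Any runtime difference between them would then come from code generation rather than from the arithmetic.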
Thanks in advance!
As I understand your code, the issue seems to be at least partially about the access function. Could we summarize your request as that you are looking for better inlining of address calculations instead of executing access_func for each work item?
Jeffrey M. (Intel) wrote:
Could we summarize your request as that you are looking for better inlining of address calculations instead of executing access_func for each work item?
Yes, that is correct. The code has to be written this way because it is generated automatically. As I mentioned in my first post, the NVIDIA compiler is able to do the optimization. Maybe the optimization can be supported by additional keywords?