In one of the sample apps, there is an align_malloc method.
Inside, there is this assert:
assert(size/sizeof(void*)*sizeof(void*) == size);
Why must the memory size be a multiple of sizeof(void*)?
Thanks,
Aaron
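From the code snippet, it looks like the intention is to check that "size" is a whole multiple of sizeof(void *). Integer division discards the remainder, so the expression only reproduces "size" when that remainder is zero.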
Thanks,
Raghu
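To make the check above concrete, here is a minimal sketch in C. The helper names (is_pointer_multiple, round_up_to_pointer, checked_size) are made up for illustration and are not taken from the sample app; only the expression inside is_pointer_multiple mirrors the assert quoted in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero when size is a whole multiple of the
   pointer size. Integer division drops the remainder, so the round
   trip only gives back size when the remainder is zero. */
static int is_pointer_multiple(size_t size)
{
    return size / sizeof(void *) * sizeof(void *) == size;
}

/* Hypothetical helper: round size up to the next multiple of the
   pointer size (overflow near SIZE_MAX is not handled here). */
static size_t round_up_to_pointer(size_t size)
{
    size_t unit = sizeof(void *);
    return (size + unit - 1) / unit * unit;
}

/* Hypothetical helper: pad a requested size so that a check like the
   one in the sample would pass, then verify it. */
static size_t checked_size(size_t requested)
{
    size_t size = round_up_to_pointer(requested);
    assert(is_pointer_multiple(size));
    return size;
}
```

On a system where sizeof(void*) is 8, a requested size of 14 fails the check, and rounding it up gives 16.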
Thanks, Raghu. This method is called on the size of the host memory buffer before calling
clCreateBuffer.
So, my question is: does the host memory have to be aligned to sizeof(void*)
before passing it into clCreateBuffer? I have a 64-bit system with sizeof(void*) equal to 8.
Can I pass a buffer of size 14 into clCreateBuffer? Is there a penalty if I do?
Thanks,
Aaron
In this case it looks like it is a requirement (someone from the Xeon Phi team can correct me if I am wrong), but most of the time alignment is needed for performance reasons. You will get better performance if the data is aligned to, for example, a cache line. On HD Graphics you will get better performance if the buffer is aligned to a cache line, and the best performance if it is aligned to a page boundary.
You can find out the hard way. If your buffer is not aligned to sizeof(void *) and your application crashes, then you have to make sure this requirement is met. Otherwise it is for performance reasons.
From the Xeon Phi perspective, you will get acceptable performance when buffers are aligned to 64 bytes. To get the best possible performance, please align your buffers to 4K (the standard x86 memory page). The same also holds for sub-buffers and Read/Write/Copy operations: if the offsets are properly aligned, the data transfer bandwidth is much higher.
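As a rough illustration of the 64-byte and 4K advice, the sketch below allocates a host buffer on a chosen boundary using C11 aligned_alloc. It is an assumption-laden example, not the sample app's align_malloc: the names PAGE_ALIGNMENT, round_up, alloc_aligned and is_aligned are hypothetical, and aligned_alloc is not available on every toolchain (some platforms offer posix_memalign or _aligned_malloc instead).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 4K page size, taken from the advice above; 64 would be the
   cache-line alternative. */
#define PAGE_ALIGNMENT 4096u

/* Hypothetical helper: round size up to a multiple of alignment
   (overflow near SIZE_MAX is not handled here). */
static size_t round_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: allocate at least size bytes on the given
   boundary. C11 aligned_alloc expects the size to be a multiple of
   the alignment, so the size is padded first. Release with free(). */
static void *alloc_aligned(size_t size, size_t alignment)
{
    /* The alignment must be a nonzero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return aligned_alloc(alignment, round_up(size, alignment));
}

/* Hypothetical helper: nonzero when ptr sits on the given boundary. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return (uintptr_t)ptr % alignment == 0;
}
```

A buffer obtained with alloc_aligned(14, PAGE_ALIGNMENT) could then be handed to clCreateBuffer as the host pointer, for instance together with CL_MEM_USE_HOST_PTR. Whether that gives the bandwidth described above depends on the device and runtime, so it is worth measuring.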