I am building a library that performs different compute-intensive operations on a vector simultaneously on the CPU and the GPU, using OpenMP and OpenCL. The problem is that when I override the vector's allocator to get the alignment required for zero-copy, OpenMP performance suffers because the code operating on the vector stops being optimized for SSE and AVX instructions. How can I write a custom allocator for an STL vector so that it can be used both by OpenMP with SSE/AVX2 for CPU-side work and by OpenCL with zero-copy for GPU-side work?
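For reference, here is a minimal sketch of the kind of allocator I mean. It is not a confirmed solution. It assumes C++17 and that zero-copy needs a 4096-byte-aligned base pointer and a byte size padded to a multiple of 64, which is my understanding of the usual guidance for `CL_MEM_USE_HOST_PTR` buffers on integrated GPUs; your device may differ. The names `ZeroCopyAllocator`, `ZeroCopyVector`, `is_aligned` and `scale` are hypothetical, made up for this post. The `aligned` clause in `scale` is one possible way to tell the compiler about the alignment, since it may not be able to see it through a custom allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>
#include <vector>

// Hypothetical allocator: page-aligned storage whose byte size is padded
// to a multiple of Granule. 4096 and 64 are assumed values, not taken
// from any specific device documentation quoted in this thread.
template <typename T, std::size_t Alignment = 4096, std::size_t Granule = 64>
struct ZeroCopyAllocator {
    static_assert((Alignment & (Alignment - 1)) == 0, "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T), "Alignment is too small for T");

    using value_type = T;

    // Explicit rebind is needed because of the non-type template parameters.
    template <typename U>
    struct rebind { using other = ZeroCopyAllocator<U, Alignment, Granule>; };

    ZeroCopyAllocator() noexcept = default;
    template <typename U>
    ZeroCopyAllocator(const ZeroCopyAllocator<U, Alignment, Granule>&) noexcept {}

    // Bytes actually requested for n elements, rounded up to Granule.
    static std::size_t padded_bytes(std::size_t n) {
        const std::size_t bytes = n * sizeof(T);
        return (bytes + Granule - 1) / Granule * Granule;
    }

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T) + padding.
        if (n > (std::numeric_limits<std::size_t>::max() - Granule) / sizeof(T))
            throw std::bad_alloc();
        return static_cast<T*>(::operator new(padded_bytes(n), std::align_val_t(Alignment)));
    }

    void deallocate(T* p, std::size_t /*n*/) noexcept {
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U, std::size_t A, std::size_t G>
bool operator==(const ZeroCopyAllocator<T, A, G>&, const ZeroCopyAllocator<U, A, G>&) noexcept { return true; }
template <typename T, typename U, std::size_t A, std::size_t G>
bool operator!=(const ZeroCopyAllocator<T, A, G>&, const ZeroCopyAllocator<U, A, G>&) noexcept { return false; }

template <typename T>
using ZeroCopyVector = std::vector<T, ZeroCopyAllocator<T>>;

// True if p is a multiple of alignment bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// CPU-side example: work on the raw pointer and state the alignment
// explicitly. The pragma is ignored if OpenMP is not enabled.
inline void scale(ZeroCopyVector<float>& v, float factor) {
    float* p = v.data();
    const std::size_t n = v.size();
    assert(is_aligned(p, 64));  // 4096-byte alignment implies 64-byte alignment
    #pragma omp simd aligned(p : 64)
    for (std::size_t i = 0; i < n; ++i)
        p[i] *= factor;
}
```

On the GPU side, the idea would be to pass `v.data()` and `ZeroCopyAllocator<float>::padded_bytes(v.size())` to `clCreateBuffer` with `CL_MEM_USE_HOST_PTR`. I have not verified that this actually gives zero-copy on every device, and I am only guessing that missing alignment information is the reason the CPU loops lose their SSE/AVX code paths.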