Hi,
I'm trying to evaluate some larger workloads on my Intel HD5500 GPU. I was under the impression that the GPU shared the host memory, but I am seeing that the max global memory is reported to be quite a bit smaller than my host memory. That is, when I query CL_DEVICE_GLOBAL_MEM_SIZE, I see these values for my CPU and GPU:
Intel(R) HD Graphics 5500
max global memory size: 8530714624
Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
max global memory size: 21347758080
I'm running Windows 10 with driver version: 20.19.15.4835
Is it possible at all to allow the GPU to use a little more of the host memory? We're right on the cusp of being able to process our dataset on the GPU.
I'm using CL_MEM_USE_HOST_PTR and the clCreateBuffer call is failing with -5 when given an aligned host pointer to a large region of memory.
Many thanks!
Hello,
On Windows, the residency model limits a process to having only 1/2 of memory resident at a given time.
This means that a single kernel cannot use more than that.
The driver also multiplies that by 0.8 to leave space for internal allocations.
Linux has no such limitation because its memory residency model is different.
You should see 0.8 * memory size there.
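As a rough sanity check of the two factors above (1/2 for the Windows residency limit, 0.8 for the driver's internal allocations), the arithmetic can be sketched as follows. The helper name `estimate_gpu_global_mem` is hypothetical and not part of any driver or OpenCL API, and the sketch assumes that the value the CPU device reports is the host memory size the driver starts from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a driver API: estimates the global memory size
 * the GPU device would report from the host memory size in bytes.
 * windows_residency_limit selects whether the 1/2 residency factor applies. */
static uint64_t estimate_gpu_global_mem(uint64_t host_bytes,
                                        bool windows_residency_limit)
{
    uint64_t budget = windows_residency_limit ? host_bytes / 2 : host_bytes;

    /* Multiply before dividing to limit truncation; guard against overflow. */
    assert(budget <= UINT64_MAX / 8);

    /* Factor 0.8: the driver keeps the rest for internal allocations. */
    return budget * 8 / 10;
}
```

With the 21347758080 bytes reported for the CPU in the question, this gives 8539103232 for Windows, which is 8 MiB above the 8530714624 actually reported for the GPU, so treat it as an estimate only. Without the 1/2 factor the same input gives 17078206464.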
The driver doesn't track how much has been allocated at any given point in time, so there is a good chance you can allocate more than the global memory size.
The driver will also try to submit your fat kernel; if that fails, you will get an error during the enqueue/flush/finish calls.
If you use this memory in multiple kernels, the driver will split batch buffers into multiple chunks, each containing a pack limited by resource size.
The driver also has a special mode for users who operate close to the memory limits.
You can search the Neo sources for "isMemoryBudgetExhausted". This mode starts triggering implicit flushes to minimize the number of expensive memory residency operations.
Please let me know if you have any further questions.
"I'm using CL_MEM_USE_HOST_PTR and the clCreateBuffer call is failing with -5 when given an aligned host pointer to a large region of memory."
This possibly means that the driver creates a copy here.
A copy is created when the pointer or the size is not aligned to the cacheline size.
Can you try aligning both to 64 bytes?
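For anyone wanting to try this, here is a minimal sketch of allocating host memory whose address and size are both multiples of 64 bytes before handing it to clCreateBuffer with CL_MEM_USE_HOST_PTR. The helper names `round_up` and `alloc_cacheline_aligned` are made up for illustration, and the sketch assumes a C11 runtime that provides `aligned_alloc`; with MSVC on Windows the counterpart would be `_aligned_malloc` / `_aligned_free`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE_BYTES 64u /* alignment suggested in this thread */

/* Round size up to the next multiple of alignment (must be a power of two).
 * No overflow handling: assumes size is well below SIZE_MAX. */
static size_t round_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: returns a 64-byte-aligned block whose size is also a
 * multiple of 64 bytes. On success, *padded_size receives the rounded size,
 * which is the size to pass to clCreateBuffer. Release with free(). */
static void *alloc_cacheline_aligned(size_t requested, size_t *padded_size)
{
    size_t padded = round_up(requested, CACHELINE_BYTES);
    void *p = aligned_alloc(CACHELINE_BYTES, padded);
    if (p != NULL && padded_size != NULL) {
        *padded_size = padded;
    }
    return p;
}
```

The important detail is that the rounded size, not the originally requested one, goes into the buffer creation call, since a misaligned size can trigger the copy just as a misaligned pointer can.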
Many thanks Michal!
One last related question: I am seeing that the max memory allocation size is ~2.1 GB on my Windows machine. Will that also go up if I switch to Linux?
Yes, this will also scale up, to a current maximum of ~4GB.
There is a separate constraint here: Surface State accesses have a hard limit of 4GB, so by default we cannot have allocations larger than this size.
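If a dataset is larger than the per-allocation limit, one option is to split it across several buffers. The sketch below only does the size bookkeeping; the helper names `chunk_count` and `chunk_size` are hypothetical, and it assumes the limit passed in is whatever the device reports for CL_DEVICE_MAX_MEM_ALLOC_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of buffers needed to hold total_bytes when no
 * single buffer may exceed max_alloc_bytes. */
static uint64_t chunk_count(uint64_t total_bytes, uint64_t max_alloc_bytes)
{
    assert(max_alloc_bytes > 0);
    return total_bytes / max_alloc_bytes
         + (total_bytes % max_alloc_bytes != 0 ? 1 : 0);
}

/* Hypothetical helper: size in bytes of chunk number index (0-based).
 * Every chunk is max_alloc_bytes long except possibly the last one. */
static uint64_t chunk_size(uint64_t total_bytes, uint64_t max_alloc_bytes,
                           uint64_t index)
{
    assert(index < chunk_count(total_bytes, max_alloc_bytes));
    uint64_t remaining = total_bytes - index * max_alloc_bytes;
    return remaining < max_alloc_bytes ? remaining : max_alloc_bytes;
}
```

For example, a 10 GiB dataset with a 4 GiB limit needs three buffers of 4 GiB, 4 GiB and 2 GiB. Whether the kernels can work on the data in independent pieces like this depends on the workload.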
I can confirm that I booted up Linux and now I can access more memory and allocate chunks of 4 GB. This allowed us to scale up pretty significantly.
Thanks very much for the help, Michal.