
Maximum global memory for HD5500

Sorensen__Tyler
Beginner

Hi,

I'm trying to evaluate some larger workloads on my Intel HD5500 GPU. I was under the impression that the GPU shared the host memory, but I am seeing that the max global memory is reported to be quite a bit smaller than my host memory. That is, when I query CL_DEVICE_GLOBAL_MEM_SIZE, I see these values for my CPU and GPU:

Intel(R) HD Graphics 5500
max global memory size: 8530714624

Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
max global memory size: 21347758080
 

I'm running Windows 10 with driver version: 20.19.15.4835

Is it possible at all to allow the GPU to use a little more of the host memory? We're right on the cusp of being able to process our dataset on the GPU.

I'm using CL_MEM_USE_HOST_PTR and the clCreateBuffer call is failing with -5 when given an aligned host pointer to a large region of memory.

Many thanks!

Michal_M_Intel
Employee

Hello,

On Windows, the residency model limits a process to having at most 1/2 of memory resident at any given time.

This means that a single kernel cannot use more than that.

The driver also multiplies that by 0.8 to leave room for internal allocations.

 

Linux has no such limitation, because it uses a different memory residency model.

You should see 0.8 * memory size there.
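
As a rough check of these factors against the numbers in the question, the sketch below applies 1/2 * 0.8 for Windows and 0.8 for Linux. It assumes that the value reported for the CPU device (21347758080) approximates total host memory, which this thread does not confirm, and the function name is made up for illustration.

```python
def expected_gpu_global_mem(host_bytes: int, os_name: str) -> int:
    """Estimate the GPU's reported global memory from host memory.

    Factors come from the reply above: Windows keeps at most 1/2 of memory
    resident per process, and the driver reserves 20% for internal use.
    Treating host_bytes as total system memory is an assumption.
    """
    if os_name == "windows":
        resident = host_bytes // 2
    elif os_name == "linux":
        resident = host_bytes
    else:
        raise ValueError(f"unknown OS: {os_name}")
    return resident * 8 // 10  # integer form of multiplying by 0.8


cpu_reported = 21347758080  # CL_DEVICE_GLOBAL_MEM_SIZE for the CPU device
gpu_reported = 8530714624   # CL_DEVICE_GLOBAL_MEM_SIZE for the HD 5500

windows_estimate = expected_gpu_global_mem(cpu_reported, "windows")
linux_estimate = expected_gpu_global_mem(cpu_reported, "linux")
print(windows_estimate, windows_estimate - gpu_reported)
print(linux_estimate)
```

The Windows estimate lands within 8 MiB of the value reported for the GPU, so the two factors appear to account for most of the gap seen in the question.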

 

The driver does not track how much has been allocated at any given time, so there is a good chance you can allocate more than the reported global memory size.

The driver will also try to submit an oversized kernel; if that fails, you will get an error from the enqueue, flush, or finish call.
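
Because a failure may only surface at enqueue, flush, or finish, it helps to check the status returned by each of those calls and not only by clCreateBuffer. The helper below is a hypothetical sketch that turns a raw status into a readable message. The numeric values are the ones defined in the standard OpenCL header, only a few memory-related codes are listed, and the -5 from the question maps to CL_OUT_OF_RESOURCES.

```python
# Status values as defined in the standard OpenCL header (CL/cl.h).
# Only a few memory-related codes are included in this sketch.
CL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_status(call_name: str, status: int) -> str:
    """Format the status returned by an OpenCL call (hypothetical helper)."""
    name = CL_STATUS_NAMES.get(status, "unlisted status")
    return f"{call_name}: {name} ({status})"


# Example: report the result of each stage where an error can surface.
for call_name, status in [("clEnqueueNDRangeKernel", 0), ("clFinish", -5)]:
    print(describe_cl_status(call_name, status))
```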

If you use this memory across multiple kernels, the driver will split the batch buffers into multiple chunks, each containing a pack limited by resource size.

The driver also has a special mode for users who operate close to the memory limits.

You can search the Neo sources for "isMemoryBudgetExhausted". In this mode the driver starts triggering implicit flushes to minimize the number of expensive memory residency operations.

Please let me know if you have any further questions.

Michal_M_Intel
Employee

"I'm using CL_MEM_USE_HOST_PTR and the clCreateBuffer call is failing with -5 when given an aligned host pointer to a large region of memory."

This possibly means that the driver is creating a copy here.

A copy is created when the pointer or the size is not aligned to the cache line size.

Can you try aligning both of them to 64 bytes?
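
To illustrate what aligning both the pointer and the size to 64 bytes means, here is a minimal sketch in Python. The helper names are made up for illustration, and passing such a buffer to OpenCL is not shown; a C host program would more likely get the same effect from an aligned allocator such as `aligned_alloc` or `_aligned_malloc`.

```python
import ctypes

CACHE_LINE = 64  # bytes, the alignment suggested in the reply above


def round_up(value: int, alignment: int = CACHE_LINE) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def address_of(buf) -> int:
    """Return the memory address of a writable, non-empty buffer object."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def aligned_view(nbytes: int, alignment: int = CACHE_LINE) -> memoryview:
    """Return a writable view whose start address and length are aligned.

    Over-allocates a bytearray, then slices from the first aligned address.
    """
    size = round_up(nbytes, alignment)
    backing = bytearray(size + alignment)
    offset = -address_of(backing) % alignment  # bytes to skip to reach alignment
    return memoryview(backing)[offset:offset + size]


view = aligned_view(1000)
print(len(view), address_of(view) % CACHE_LINE)
```

Note that the requested 1000 bytes are padded to 1024 so that the size, not just the start address, is a multiple of the cache line.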

Sorensen__Tyler
Beginner

Many thanks Michal! 

One last related question: I am seeing that the max memory allocation size is ~2.1 GB on my Windows machine. Will that also go up if I switch to Linux?

 

Michal_M_Intel
Employee

Yes, this will also scale up, to a current maximum of ~4 GB.

There is a separate constraint here: Surface State accesses have a hard limit of 4 GB, so by default we cannot have allocations larger than that.
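
With a per-allocation ceiling like this (~2.1 GB on the Windows setup above, ~4 GB on Linux), a dataset larger than one buffer has to be spread across several buffers. The following is a hypothetical sketch, not something described in this thread: it plans (offset, size) pieces that each fit under a given maximum, which in practice would be read from CL_DEVICE_MAX_MEM_ALLOC_SIZE. The function name and the 64-byte rounding are assumptions.

```python
def plan_chunks(total_bytes: int, max_alloc_bytes: int, alignment: int = 64):
    """Split total_bytes into (offset, size) pieces of at most max_alloc_bytes.

    The chunk size is rounded down to a multiple of alignment so that every
    chunk starts at an aligned offset; only the last chunk may be shorter.
    """
    chunk = max_alloc_bytes // alignment * alignment
    if chunk <= 0:
        raise ValueError("max_alloc_bytes is smaller than the alignment")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(chunk, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


GIB = 2 ** 30
# Example: a 10 GiB dataset with a 4 GiB per-allocation limit.
plan = plan_chunks(10 * GIB, 4 * GIB)
for offset, size in plan:
    print(offset, size)
```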

Sorensen__Tyler
Beginner

I can confirm that I booted up Linux and now I can access more memory and allocate chunks of 4 GB. This allowed us to scale up pretty significantly. 

Thanks very much for the help Michal
