
SYCL device info free_memory wrong on 2-stack PVC1550 GPU

JakubH

Hello,

In short, in SYCL, when querying the amount of free memory on a GPU using the function

 

dev.get_info<sycl::ext::intel::info::device::free_memory>()

 

the reported value is wrong.
 
The function always reports the amount of free memory on the first stack of the 2-stack GPU, regardless of whether the sycl::device corresponds to the first or the second stack.
 
I use

 

export ZE_FLAT_DEVICE_HIERARCHY="FLAT"

 

so the 2-stack GPU corresponds to two sycl::devices available to the user, one for each stack.
 
I am using PVC1550 GPUs on a private instance on the Tiber devcloud, with icpx version 2025.0.1.
I see the issue with older icpx versions too (with those you might need `export ZES_ENABLE_SYSMAN=1` to enable the free memory query).
 
Reproducer code:

 

#include <cstdio>
#include <cstdlib>
#include <stdexcept> // std::runtime_error
#include <vector>
#include <sycl/sycl.hpp>

int main(int argc, char ** argv)
{
    if(argc <= 1) throw std::runtime_error("not enough arguments");
    int gpu_idx = atoi(argv[1]);

    std::vector<sycl::device> gpus_all = sycl::device::get_devices(sycl::info::device_type::gpu);
    std::vector<sycl::device> gpus_levelzero;
    for(sycl::device & gpu : gpus_all)
    {
        if(gpu.get_backend() == sycl::backend::ext_oneapi_level_zero)
        {
            gpus_levelzero.push_back(gpu);
        }
    }

    printf("There are %zu levelzero GPUs\n", gpus_levelzero.size());
    for(size_t i = 0; i < gpus_levelzero.size(); i++)
    {
        sycl::device & d = gpus_levelzero[i];
        size_t mem_capacity = d.get_info<sycl::info::device::global_mem_size>();
        size_t mem_free = d.get_info<sycl::ext::intel::info::device::free_memory>();
        printf("  GPU %2zu: capacity = %12zu B = %6zu MiB, free = %12zu B = %6zu MiB\n", i, mem_capacity, mem_capacity >> 20, mem_free, mem_free >> 20);
    }

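    // Guard against an index outside the list of Level Zero GPUs
    if(gpu_idx < 0 || static_cast<size_t>(gpu_idx) >= gpus_levelzero.size()) throw std::runtime_error("GPU index out of range");
    sycl::queue q(gpus_levelzero[gpu_idx]);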

    size_t allocsize = (size_t{60} << 30);
    void * ptr = sycl::malloc_device(allocsize, q);
    printf("Allocated on GPU %2d: %zu B = %zu MiB, ptr = %p\n", gpu_idx, allocsize, allocsize >> 20, ptr);

    printf("Current free memory:\n");
    for(size_t i = 0; i < gpus_levelzero.size(); i++)
    {
        sycl::device & d = gpus_levelzero[i];
        size_t mem_free = d.get_info<sycl::ext::intel::info::device::free_memory>();
        printf("  GPU %2zu: free = %12zu B = %6zu MiB\n", i, mem_free, mem_free >> 20);
    }

    sycl::free(ptr, q);
    printf("Memory freed\n");

    printf("Current free memory:\n");
    for(size_t i = 0; i < gpus_levelzero.size(); i++)
    {
        sycl::device & d = gpus_levelzero[i];
        size_t mem_free = d.get_info<sycl::ext::intel::info::device::free_memory>();
        printf("  GPU %2zu: free = %12zu B = %6zu MiB\n", i, mem_free, mem_free >> 20);
    }

    return 0;
}

 

It finds all level zero GPU devices and reports their memory capacity and free memory. It then allocates 60 GiB of memory on a given GPU. Then it again prints the amount of free memory on each GPU. At the end it frees the memory and again reports the free memory on each GPU.
 
Compile with

 

icpx -fsycl source.cpp -o program.x

 

and run as

 

./program.x <device_index_where_to_allocate>

 

 

Output with `./program.x 0` (or any other even index lower than the number of GPUs), shortened:

 

There are 16 levelzero GPUs
  GPU  0: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  1: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  2: capacity =  68719476736 B =  65536 MiB, free =  68673970176 B =  65492 MiB
  GPU  3: capacity =  68719476736 B =  65536 MiB, free =  68673970176 B =  65492 MiB
 ...
Allocated on GPU  0: 64424509440 B = 61440 MiB, ptr = 0xff00000000200000
Current free memory:
  GPU  0: free =   4119027712 B =   3928 MiB
  GPU  1: free =   4119027712 B =   3928 MiB
  GPU  2: free =  68547817472 B =  65372 MiB
  GPU  3: free =  68547817472 B =  65372 MiB
 ...
Memory freed
Current free memory:
  GPU  0: free =   4119089152 B =   3928 MiB
  GPU  1: free =   4119142400 B =   3928 MiB
  GPU  2: free =  68548362240 B =  65372 MiB
  GPU  3: free =  68548427776 B =  65372 MiB
 ...

 

As you can see, I allocated 60 GiB only on GPU 0, but GPU 1 also reports the reduced free memory.
 
Output with `./program.x 1` (or any other odd index lower than the number of GPUs), shortened:

 

There are 16 levelzero GPUs
  GPU  0: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  1: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  2: capacity =  68719476736 B =  65536 MiB, free =  68673974272 B =  65492 MiB
  GPU  3: capacity =  68719476736 B =  65536 MiB, free =  68673974272 B =  65492 MiB
 ...
Allocated on GPU  1: 64424509440 B = 61440 MiB, ptr = 0xff00000000200000
Current free memory:
  GPU  0: free =  68543537152 B =  65368 MiB
  GPU  1: free =  68543537152 B =  65368 MiB
  GPU  2: free =  68547821568 B =  65372 MiB
  GPU  3: free =  68547821568 B =  65372 MiB
 ...
Memory freed
Current free memory:
  GPU  0: free =  68543533056 B =  65368 MiB
  GPU  1: free =  68543533056 B =  65368 MiB
  GPU  2: free =  68548452352 B =  65372 MiB
  GPU  3: free =  68548493312 B =  65372 MiB
 ...

 

Here, I allocated 60 GiB on GPU 1, but the free memory function reports that almost all the memory is still free.
 
Furthermore (this is probably a separate issue), after I free the memory using sycl::free, querying the free device memory still treats it as allocated; see the last group of free memory reports in the outputs.
 
 
Am I doing something wrong? Is this expected on the 2-stack GPU? Can this be fixed?
Thanks,
Jakub