<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SYCL device info free_memory wrong on 2-stack PVC1550 GPU in GPU Compute Software</title>
    <link>https://community.intel.com/t5/GPU-Compute-Software/SYCL-device-info-free-memory-wrong-on-2-stack-PVC1550-GPU/m-p/1716759#M2078</link>
    <description>&lt;P&gt;I report that I still observe the same incorrect behavior with version 2025.2.1.&lt;/P&gt;</description>
    <pubDate>Sat, 13 Sep 2025 10:34:56 GMT</pubDate>
    <dc:creator>JakubH</dc:creator>
    <dc:date>2025-09-13T10:34:56Z</dc:date>
    <item>
      <title>SYCL device info free_memory wrong on 2-stack PVC1550 GPU</title>
      <link>https://community.intel.com/t5/GPU-Compute-Software/SYCL-device-info-free-memory-wrong-on-2-stack-PVC1550-GPU/m-p/1645347#M1655</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;In short: in SYCL, when I query the amount of free memory on a GPU using&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="cpp"&gt;dev.get_info&amp;lt;sycl::ext::intel::info::device::free_memory&amp;gt;()&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;SPAN&gt;the reported value is wrong.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;The function always reports the amount of free memory on the first stack of the 2-stack GPU, regardless of whether the sycl::device corresponds to the first or the second stack.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I use&lt;/SPAN&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;export ZE_FLAT_DEVICE_HIERARCHY="FLAT"&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;so the 2-stack GPU corresponds to two sycl::devices available to the user, one for each stack.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I use PVC1550 GPUs on a private instance on the Tiber devcloud.&lt;/DIV&gt;&lt;DIV&gt;icpx version&amp;nbsp;&lt;SPAN&gt;2025.0.1&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I see the issue with older icpx versions too (with those, you might need `export ZES_ENABLE_SYSMAN=1` to enable the free memory query).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Reproducer code:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="cpp"&gt;#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;cstdlib&amp;gt;
#include &amp;lt;stdexcept&amp;gt;
#include &amp;lt;vector&amp;gt;
#include &amp;lt;sycl/sycl.hpp&amp;gt;

int main(int argc, char ** argv)
{
    if(argc &amp;lt;= 1) throw std::runtime_error("not enough arguments");
    int gpu_idx = atoi(argv[1]);

    std::vector&amp;lt;sycl::device&amp;gt; gpus_all = sycl::device::get_devices(sycl::info::device_type::gpu);
    std::vector&amp;lt;sycl::device&amp;gt; gpus_levelzero;
    for(sycl::device &amp;amp; gpu : gpus_all)
    {
        if(gpu.get_backend() == sycl::backend::ext_oneapi_level_zero)
        {
            gpus_levelzero.push_back(gpu);
        }
    }

    printf("There are %zu levelzero GPUs\n", gpus_levelzero.size());
    for(size_t i = 0; i &amp;lt; gpus_levelzero.size(); i++)
    {
        sycl::device &amp;amp; d = gpus_levelzero[i];
        size_t mem_capacity = d.get_info&amp;lt;sycl::info::device::global_mem_size&amp;gt;();
        size_t mem_free = d.get_info&amp;lt;sycl::ext::intel::info::device::free_memory&amp;gt;();
        printf("  GPU %2zu: capacity = %12zu B = %6zu MiB, free = %12zu B = %6zu MiB\n", i, mem_capacity, mem_capacity &amp;gt;&amp;gt; 20, mem_free, mem_free &amp;gt;&amp;gt; 20);
    }

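    // Guard against an out-of-range device index before indexing the vector.
    if(gpu_idx &amp;lt; 0 || static_cast&amp;lt;size_t&amp;gt;(gpu_idx) &amp;gt;= gpus_levelzero.size()) throw std::runtime_error("device index out of range");
    sycl::queue q(gpus_levelzero[gpu_idx]);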

    size_t allocsize = (size_t{60} &amp;lt;&amp;lt; 30);
    void * ptr = sycl::malloc_device(allocsize, q);
    printf("Allocated on GPU %2d: %zu B = %zu MiB, ptr = %p\n", gpu_idx, allocsize, allocsize &amp;gt;&amp;gt; 20, ptr);

    printf("Current free memory:\n");
    for(size_t i = 0; i &amp;lt; gpus_levelzero.size(); i++)
    {
        sycl::device &amp;amp; d = gpus_levelzero[i];
        size_t mem_free = d.get_info&amp;lt;sycl::ext::intel::info::device::free_memory&amp;gt;();
        printf("  GPU %2zu: free = %12zu B = %6zu MiB\n", i, mem_free, mem_free &amp;gt;&amp;gt; 20);
    }

    sycl::free(ptr, q);
    printf("Memory freed\n");

    printf("Current free memory:\n");
    for(size_t i = 0; i &amp;lt; gpus_levelzero.size(); i++)
    {
        sycl::device &amp;amp; d = gpus_levelzero[i];
        size_t mem_free = d.get_info&amp;lt;sycl::ext::intel::info::device::free_memory&amp;gt;();
        printf("  GPU %2zu: free = %12zu B = %6zu MiB\n", i, mem_free, mem_free &amp;gt;&amp;gt; 20);
    }

    return 0;
}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;The program finds all Level Zero GPU devices and reports the memory capacity and free memory of each. It then allocates 60 GiB on the GPU selected by the command-line argument and prints the free memory on each GPU again. Finally, it frees the allocation and reports the free memory on each GPU once more.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Compile with&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;icpx -fsycl source.cpp -o program.x&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;and run as&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;./program.x &amp;lt;device_index_where_to_allocate&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;Output with `./program.x 0` (or any other even index lower than the number of GPUs), shortened:&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="none"&gt;There are 16 levelzero GPUs
  GPU  0: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  1: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  2: capacity =  68719476736 B =  65536 MiB, free =  68673970176 B =  65492 MiB
  GPU  3: capacity =  68719476736 B =  65536 MiB, free =  68673970176 B =  65492 MiB
 ...
Allocated on GPU  0: 64424509440 B = 61440 MiB, ptr = 0xff00000000200000
Current free memory:
  GPU  0: free =   4119027712 B =   3928 MiB
  GPU  1: free =   4119027712 B =   3928 MiB
  GPU  2: free =  68547817472 B =  65372 MiB
  GPU  3: free =  68547817472 B =  65372 MiB
 ...
Memory freed
Current free memory:
  GPU  0: free =   4119089152 B =   3928 MiB
  GPU  1: free =   4119142400 B =   3928 MiB
  GPU  2: free =  68548362240 B =  65372 MiB
  GPU  3: free =  68548427776 B =  65372 MiB
 ...&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;As you can see, I allocated 60 GiB only on GPU 0, but GPU 1 also reports the same reduced amount of free memory.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Output with `./program.x 1` (or any other odd index lower than the number of GPUs), shortened:&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="none"&gt;There are 16 levelzero GPUs
  GPU  0: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  1: capacity =  68719476736 B =  65536 MiB, free =  68673966080 B =  65492 MiB
  GPU  2: capacity =  68719476736 B =  65536 MiB, free =  68673974272 B =  65492 MiB
  GPU  3: capacity =  68719476736 B =  65536 MiB, free =  68673974272 B =  65492 MiB
 ...
Allocated on GPU  1: 64424509440 B = 61440 MiB, ptr = 0xff00000000200000
Current free memory:
  GPU  0: free =  68543537152 B =  65368 MiB
  GPU  1: free =  68543537152 B =  65368 MiB
  GPU  2: free =  68547821568 B =  65372 MiB
  GPU  3: free =  68547821568 B =  65372 MiB
 ...
Memory freed
Current free memory:
  GPU  0: free =  68543533056 B =  65368 MiB
  GPU  1: free =  68543533056 B =  65368 MiB
  GPU  2: free =  68548452352 B =  65372 MiB
  GPU  3: free =  68548493312 B =  65372 MiB
 ...&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;Here, I allocated 60 GiB on GPU 1, but the free memory query reports that almost all the memory is still free on both GPU 0 and GPU 1.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Furthermore (this is probably a separate issue), after I free the memory using sycl::free, the free memory query still reports it as allocated (see the last group of free memory reports in the outputs).&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Link to docs about the device hierarchy:&amp;nbsp;&lt;A href="https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2025-0/exposing-the-device-hierarchy.html" target="_blank" rel="noopener"&gt;https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2025-0/exposing-the-device-hierarchy.html&lt;/A&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Am I doing something wrong? Is this expected on the 2-stack GPU? Can this be fixed?&lt;/DIV&gt;&lt;DIV&gt;Thanks,&lt;/DIV&gt;&lt;DIV&gt;Jakub&lt;/DIV&gt;</description>
      <pubDate>Sun, 24 Nov 2024 15:21:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/GPU-Compute-Software/SYCL-device-info-free-memory-wrong-on-2-stack-PVC1550-GPU/m-p/1645347#M1655</guid>
      <dc:creator>JakubH</dc:creator>
      <dc:date>2024-11-24T15:21:10Z</dc:date>
    </item>
    <item>
      <title>Re: SYCL device info free_memory wrong on 2-stack PVC1550 GPU</title>
      <link>https://community.intel.com/t5/GPU-Compute-Software/SYCL-device-info-free-memory-wrong-on-2-stack-PVC1550-GPU/m-p/1716759#M2078</link>
      <description>&lt;P&gt;I report that I still observe the same incorrect behavior with version 2025.2.1.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Sep 2025 10:34:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/GPU-Compute-Software/SYCL-device-info-free-memory-wrong-on-2-stack-PVC1550-GPU/m-p/1716759#M2078</guid>
      <dc:creator>JakubH</dc:creator>
      <dc:date>2025-09-13T10:34:56Z</dc:date>
    </item>
  </channel>
</rss>

