<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Problem running a simple program using OpenMP and GPU in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734525#M178111</link>
    <description>&lt;P&gt;Thanks for these data. So, GPUs are indeed able to speed up things. Now all I have to do is find a way that it also works on my computer ...&lt;/P&gt;</description>
    <pubDate>Fri, 23 Jan 2026 07:37:50 GMT</pubDate>
    <dc:creator>Arjen_Markus</dc:creator>
    <dc:date>2026-01-23T07:37:50Z</dc:date>
    <item>
      <title>Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734058#M178076</link>
      <description>&lt;P&gt;I have been trying to use OpenMP with offloading to a GPU. The program is quite simple, but I run into a problem that I cannot diagnose. It runs fine if the size of the matrix is 128x128 (n = 128 in the program). If I use a larger value the result is a crash:&lt;/P&gt;&lt;LI-CODE lang="none"&gt;--- failure if n &amp;gt; 128 ---
...&amp;gt;ifx diffu_gpu.f90 -Qopenmp -Qopenmp-targets=spir64
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2025.0.0 Build 20241008
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.44.35217.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:diffu_gpu.exe
-subsystem:console
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
C:\Users\markus\AppData\Local\Temp\17608731760846.obj
C:\Users\markus\AppData\Local\Temp\17608414llc.o
-defaultlib:omptarget.lib

...&amp;gt;diffu_gpu
 Start time loop ...
omptarget error: Executing target region abort target.
omptarget error: Run with
omptarget error: LIBOMPTARGET_DEBUG=1 to display basic debug information.
omptarget error: LIBOMPTARGET_DEBUG=2 to display calls to the compute runtime.
omptarget error: LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.
omptarget error: Source location information not present. Compile with -g or -gline-tables-only.
omptarget fatal error 1: failure of target construct while offloading is mandatory
&lt;/LI-CODE&gt;&lt;P&gt;I have attached the program. I have also tried to use GPU teams, but I am afraid I simply do not understand how to use the directives. In any case, it made no difference.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Jan 2026 20:35:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734058#M178076</guid>
      <dc:creator>Arjen_Markus</dc:creator>
      <dc:date>2026-01-19T20:35:38Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734107#M178077</link>
      <description>&lt;P&gt;This works, I think, on Windows 11 with VS 2022. I had to add the set threads call and turn on OpenMP in the properties page.&lt;/P&gt;&lt;P&gt;For size 2280 it runs in 25 seconds with OpenMP on and 66 seconds with it off; the CPU time is about the same, as you would expect. With six threads it is not much faster, about 23 seconds. As I understand it, there are diminishing returns with more threads.&lt;/P&gt;&lt;P&gt;For size 1180 it runs in 7 seconds, as I have 4 threads.&lt;/P&gt;&lt;P&gt;I have no idea about the GPU part. I thought we needed CUDA for that, as I have an NVIDIA card.&lt;/P&gt;&lt;P&gt;But Jim is the expert.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2026 03:36:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734107#M178077</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2026-01-20T03:36:33Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734111#M178078</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2026-01-19 213948.png" style="width: 952px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/70778i94C9711910ACEEC1/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Screenshot 2026-01-19 213948.png" alt="Screenshot 2026-01-19 213948.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;With one thread the CPU time is a little less, but the clock time is 3 times as long, and I remember something from Jim about the efficiency decreasing as the number of threads increases. But for IFX you need to set an environment variable or call num_threads. I prefer the Fortran way, it is easier. So for a Core i7 Dell the times are: 1000 takes 7 seconds on 4 threads and 2000 takes 23, so your 10,000 will take 11 minutes.&lt;/P&gt;&lt;P&gt;Thanks, I have not done this before.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2026-01-19 214335.png" style="width: 938px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/70779iE7BFF054C46EEC29/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Screenshot 2026-01-19 214335.png" alt="Screenshot 2026-01-19 214335.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2026 03:45:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734111#M178078</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2026-01-20T03:45:39Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734463#M178100</link>
      <description>&lt;P&gt;Thanks for these experiments. Meanwhile, I got a suggestion from Damian Rouson (as a follow-up to his presentation "&lt;STRONG&gt;Please, No More Loops (Than Necessary): New Patterns in Fortran 2023&lt;/STRONG&gt;" yesterday) to use a DO CONCURRENT loop instead. This works, and I can see that the GPU is very busy with my program. The clear advantage is that you do not need all these OpenMP directives, but I am currently a bit puzzled about how to control the data transfer. Anyway, the fact that this version of the program does run is a big step forward :).&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2026 20:51:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734463#M178100</guid>
      <dc:creator>Arjen_Markus</dc:creator>
      <dc:date>2026-01-22T20:51:46Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734472#M178101</link>
      <description>&lt;P&gt;I modified the program with the help of Gemini 3 to run on my T14s ThinkPad with an Intel(R) Iris(R) Xe Graphics 12.0.0.&lt;/P&gt;&lt;P&gt;I had to scale back to single precision real(4) because the GPU cannot do double precision.&lt;/P&gt;&lt;P&gt;Apparently you need to use the Codeplay oneAPI Plugins to use an NVIDIA GPU. I have not done that; it seems complicated. Perhaps someone can put together a simple working example.&lt;/P&gt;&lt;P&gt;On my laptop with the Iris(R) Xe I get this:&lt;/P&gt;&lt;P&gt;Starting Performance Comparison...&lt;BR /&gt;Array Size: 100000&lt;BR /&gt;Math intensity: 20000 operations per element&lt;/P&gt;&lt;P&gt;Running on CPU...&lt;BR /&gt;CPU Time: 3.9638 seconds&lt;BR /&gt;Running on GPU (Iris Xe)...&lt;BR /&gt;GPU Time: 0.0577 seconds&lt;/P&gt;&lt;P&gt;Speedup: 68.66x&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2026 23:13:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734472#M178101</guid>
      <dc:creator>PGC</dc:creator>
      <dc:date>2026-01-22T23:13:48Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734479#M178102</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2026-01-22 192331.png" style="width: 999px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/70815i3583DC141A6C3B44/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Screenshot 2026-01-22 192331.png" alt="Screenshot 2026-01-22 192331.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2026 03:00:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734479#M178102</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2026-01-23T03:00:58Z</dc:date>
    </item>
    <item>
      <title>Re: Problem running a simple program using OpenMP and GPU</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734525#M178111</link>
      <description>&lt;P&gt;Thanks for these data. So, GPUs are indeed able to speed up things. Now all I have to do is find a way that it also works on my computer ...&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2026 07:37:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Problem-running-a-simple-program-using-OpenMP-and-GPU/m-p/1734525#M178111</guid>
      <dc:creator>Arjen_Markus</dc:creator>
      <dc:date>2026-01-23T07:37:50Z</dc:date>
    </item>
  </channel>
</rss>

