Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
28,519 Discussions

Aug 30th (Wednesday) Townhall with the Intel® Fortran Compiler Developers

Ron_Green
Moderator
4,372 Views

Wednesday August 30th

9:00am US PDT

Webinar Overview Page

Registration and Join here

 

Townhall with the Intel® Fortran Compiler Developers

You have heard all about The Next Chapter for the Intel® Fortran Compiler. Now it's your turn to give us your feedback on our compiler in this townhall webinar with the developers behind the Intel® Fortran Compiler. We will have a short preview of what is coming in version 2024.0, and we will show off our new uninitialized memory checking feature in our Linux* compiler version 2023.2.0. But the real focus of this meeting is to give you a chance to get your questions answered. Bring your questions, bring your suggestions, and we look forward to sharing the latest information on our Fortran compiler.
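
If you want to try the uninitialized memory checking ahead of the webinar, here is a rough sketch of the kind of bug it is meant to catch. This is a quick illustration, not the demo from the session; the -check uninit option spelling below is taken from the 2023.2.0 Linux release notes, so please double-check the documentation for your version.

program uninit_demo
  implicit none
  real :: x, y
  y = 2.0 * x        ! x is read before it has ever been assigned
  print *, y
end program uninit_demo

Built and run on Linux along these lines:

ifx -check uninit uninit_demo.f90 -o uninit_demo
./uninit_demo        ! should report a use of uninitialized memory at run time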

 

Click here to watch the replay on demand. Other webinars are also listed there that you may find interesting.

Labels (1)
23 Replies
fortrandave
New Contributor I
3,711 Views

This sounds really interesting!  Do you know if it will be recorded for on-demand viewing?

David_Billinghurst
New Contributor III
3,621 Views

A recording would be great.  This is at 02:00 on the east coast of Australia.

jvo203
New Contributor I
3,610 Views

Yes, the same applies to Japan @ 1am JST, it would be nice to view the recording afterwards.

Arjen_Markus
Honored Contributor I
3,588 Views

Likewise for Europe ;). Less awkward, perhaps, but still not very practical.

But aside from that, could you refer to the time of this and future events also in UTC? Then people around the world will only have to convert from one fixed timezone to their own. Timezones and their abbreviations may be standardised, but that does not mean that they are unequivocal or that people have them memorised.

Barbara_P_Intel
Employee
3,552 Views

Yes, it will be recorded. BUT to watch the recording you need to register for the webinar. You will get an email later, when the recording is available, with instructions on how to watch it.

 

Steve_Lionel
Honored Contributor III
3,310 Views

I registered for this when it was announced, never got confirmation nor day-of notice. I tried again - nothing. Have others received their link?

Barbara_P_Intel
Employee
3,300 Views

Oh, no! Ron says that this is a new webinar platform. Wonder if that's the problem.

There were over 100 people on the live webinar. 

Ron says a link to the recording will be posted here when it's available.

 

Steve_Lionel
Honored Contributor III
3,294 Views

I'm sorry to have missed it - was looking forward to it. The link will be appreciated when available.

Ron_Green
Moderator
3,285 Views

We'll follow up and see if there was some sort of glitch in the registration/notification process. The webinar was part of Tech Decoded, and the recording should show up there soon. We will create a new post with the link to the recording.

Barbara_P_Intel
Employee
3,169 Views

The replay of the Fortran Townhall webinar is now available. Click here. Other webinars are also listed there that you may find interesting.

jvo203
New Contributor I
3,150 Views

Having watched the video, it was good to get a brief historical context on the ifort --> ifx transition. The hands-on demo about sanitizers (uninitialised memory) was informative - I wasn't aware of these new capabilities. Plus the enhanced "do concurrent" (Fortran 2023 standard is coming soon) with OpenMP directives and off-load - learnt some new things. Thank you.

jvo203
New Contributor I
3,130 Views

This raises the question: what is the difference between the enhanced DO CONCURRENT and an OpenMP PARALLEL DO? For the 1D case (i.e. i = 1:N, no other indices) there does not seem to be much difference now. Perhaps things get more complicated in higher dimensions.

Steve_Lionel
Honored Contributor III
3,119 Views

DO CONCURRENT is part of the Fortran standard, OpenMP is not. DO CONCURRENT does not require parallelism and does not have as fine-grained controls as OMP PARALLEL DO. Its features are modeled after OpenMP, however. 

Note that with Intel Fortran DO CONCURRENT doesn't parallelize unless you have asked for parallelization on the command line.
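
As a minimal illustration (my own sketch, not code from the webinar), here is the same vector add written both ways:

program vecadd_compare
  implicit none
  integer, parameter :: n = 100000
  real :: a(n), b(n), c(n)
  integer :: i

  a = 1.0
  b = 2.0

  ! Standard Fortran: only asserts that the iterations are independent; with
  ! the Intel compilers it is threaded only when parallelization is requested.
  do concurrent (i = 1:n)
     c(i) = a(i) + b(i)
  end do

  ! OpenMP: an explicit request for threading, with finer-grained control
  ! (scheduling, number of threads, reductions, ...). Requires -qopenmp.
  !$omp parallel do
  do i = 1, n
     c(i) = a(i) + b(i)
  end do
  !$omp end parallel do

  print *, c(1), c(n)
end program vecadd_compare

The DO CONCURRENT form states a property of the loop; the OpenMP form states how to execute it.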

jvo203
New Contributor I
3,092 Views

"Note that with Intel Fortran DO CONCURRENT doesn't parallelize unless you have asked for parallelization on the command line."

 

Yes, that's true for ifort, which needs the "-parallel" flag. ifx, on the other hand, parallelises DO CONCURRENT by default. It seems the lines are getting blurred between DO CONCURRENT and OMP PARALLEL DO.

Barbara_P_Intel
Employee
2,967 Views

@jvo203, regarding your statement "ifx, on the other hand, parallelises DO CONCURRENT by default." I can't duplicate that. Do you have a reproducer?

On Linux I used ldd to confirm the libraries that are required at runtime. libiomp5.so is the library required for OpenMP. I only see that library when I compile/link with the -qopenmp compiler option.

+ ifx test_doconcurrent.f90
+ a.out
  sumc = 300,000 =   300000.0
+ ldd a.out
        linux-vdso.so.1 (0x00007ffef43d1000)
        libimf.so => /nfs/pdx/disks/cts2/tools/oneapi/2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007f8b8d660000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f8b8d563000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f8b8d385000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f8b8d35f000)
        libintlc.so.5 => /nfs/pdx/disks/cts2/tools/oneapi/2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007f8b8d2e7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8b8da4c000)
+ ifx -qopenmp test_doconcurrent.f90
+ a.out
  sumc = 300,000 =   300000.0
+ ldd a.out
        linux-vdso.so.1 (0x00007ffd7f9d3000)
        libimf.so => /nfs/pdx/disks/cts2/tools/oneapi/2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007f7bce310000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f7bce213000)
        libiomp5.so => /nfs/pdx/disks/cts2/tools/oneapi/2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libiomp5.so (0x00007f7bcdc00000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f7bcda22000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7bce1ed000)
        libintlc.so.5 => /nfs/pdx/disks/cts2/tools/oneapi/2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007f7bce175000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f7bce6fc000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f7bce170000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7bce16b000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f7bce166000)

Note: -parallel is not supported with ifx.
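
For reference, a simplified sketch along the lines of that test (not the exact test_doconcurrent.f90 source) would be:

program test_doconcurrent
  implicit none
  integer, parameter :: n = 100000
  real :: a(n), b(n), c(n), sumc
  integer :: i

  a = 1.0
  b = 2.0
  do concurrent (i = 1:n)
     c(i) = a(i) + b(i)
  end do
  sumc = sum(c)
  print *, 'sumc =', sumc    ! 300000.0 for these inputs
end program test_doconcurrent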

jvo203
New Contributor I
2,960 Views

Hi Barbara, yes the sample reproducer is included in my bug report about OpenMP nested parallelism and DO CONCURRENT. Please see the code and a lively discussion here: community.intel.com/t5/Intel-Fortran-Compiler/ifx-OpenMP-compilation-segmentation-fault/m-p/1512952#M167649

Yes, my Makefile uses "-qopenmp" by default. With "-qopenmp" the DO CONCURRENT seems to get auto-parallelised by ifx (and yes, the "-parallel" flag got axed in ifx).

Barbara_P_Intel
Employee
2,957 Views

Ah! I remember that topic. Ron is on top of that one, nested parallelism. That's a different issue than what you wrote here. 

With a straight DO CONCURRENT, no nesting within !$OMP, -qopenmp is required for parallelism.

 

John_Campbell
New Contributor II
2,997 Views

I have a question about the DO CONCURRENT example from the "Town Hall".

The example is a very simple loop (from memory):

DO CONCURRENT (i = 1:N) shared (a,b,c)
   c(i) = a(i) + b(i)
end do

My question relates to the overhead of implementing this loop and the lack of computation per cycle. Basically this loop could never be practical with !$OMP PARALLEL DO, because:

For small N: it can never overcome the startup overhead of about 20,000 processor cycles.

For large N: memory bandwidth is not sufficient to improve computation speed (memory bandwidth would kill any similar loop!).

 

Would this DO CONCURRENT example of off-loading ever achieve a performance advantage?

What is an estimate of the startup overhead in processor cycles?

If the answer is similar to achieving a practical OpenMP example, how do you scale up the computation per cycle, given possible limitations on what can be included in a DO CONCURRENT loop?

 

Barbara_P_Intel
Employee
2,723 Views

For the small amount of compute in that vector add loop, the overhead to offload the data will likely negate any performance improvement. The more compute you can do on the offloaded data, the better.

With OpenMP directives you have control over what data you move back to the host at the end of the compute kernel. I don't see that with DO CONCURRENT. Depending on the application that can make a performance difference.
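
For example, a minimal sketch of that control (my own illustration; an ifx GPU build would use something like -qopenmp -fopenmp-targets=spir64, per the offload documentation), where only c is brought back to the host:

program offload_map_demo
  implicit none
  integer, parameter :: n = 100000
  real :: a(n), b(n), c(n)
  integer :: i

  a = 1.0
  b = 2.0

  ! map(to:) copies a and b to the device; map(from:) copies only c back.
  !$omp target teams distribute parallel do map(to: a, b) map(from: c)
  do i = 1, n
     c(i) = a(i) + b(i)
  end do
  !$omp end target teams distribute parallel do

  print *, c(1), c(n)
end program offload_map_demo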

 

John_Campbell
New Contributor II
1,606 Views

My impression of DO CONCURRENT is that with its "pure" limitations, it is positioned to do a small amount of compute in each loop iteration.

Not really the type of loop for OpenMP, except for examples?
