Greetings,
I've found an odd performance regression in ifx/ifort output when a subroutine passes a contiguous array slice to a dummy procedure whose interface also declares the array argument as contiguous.
Minimal complete example (with just enough nontrivial work to make sure the code is all executed as intended):
program qux
  implicit none
  integer, allocatable, dimension(:,:) :: foo
  integer ii, cc
  allocate(foo(100,100))
  cc = 0
  do ii=1,10000000
    call test_a(baz)
  end do
  write(0,'(I0)') foo(1,1)
contains
  subroutine baz(bar)
    implicit none
    integer, dimension(:), intent(inout), contiguous :: bar
    cc = cc+1
    bar(1+mod(cc,100)) = cc
  end subroutine
  subroutine test_a(in_proc)
    implicit none
    interface
      subroutine in_proc(bar)
        implicit none
        integer, dimension(:), intent(inout), contiguous :: bar
      end subroutine
    end interface
    call in_proc(foo(1:,1))
  end subroutine
end program
When run on a Xeon Platinum 8380:
$ ifx --version
ifx (IFORT) 2023.0.0 20221201
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.
$ ifx -O3 foo.F90 && time ./a.out
10000000
real 0m3.880s
user 0m3.864s
sys 0m0.003s
But after replacing in_proc(foo(1:,1)) with in_proc(foo(:,1)) at line 27:
$ ifx -O3 foo.F90 && time ./a.out
10000000
real 0m0.017s
user 0m0.014s
sys 0m0.003s
Or keeping in_proc(foo(1:,1)) but dropping the contiguous at line 24:
$ ifx -O3 foo.F90 && time ./a.out
10000000
real 0m0.019s
user 0m0.015s
sys 0m0.004s
A godbolt comparison of cases #1 and #2 shows that #1 does a great deal more work inside test_a, including several tests and a loop.
This behaviour was originally discovered when -qopt-report=5 on ifort 2021.8.0 surprisingly reported "memcopy generated" for the real-world code that this sample was reduced from. The generated memcopy seems to apply to the array descriptor rather than to the array itself (which would create a temporary copy); the running times do not change if the allocation of foo is enlarged.
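Both spellings of the slice designate the same column, so neither should need a copy to satisfy the contiguous attribute. A minimal sketch for confirming that, assuming a compiler that supports the Fortran 2008 is_contiguous intrinsic (the program name check_contiguity is made up for illustration and is not part of the reproducer):

```fortran
program check_contiguity
  implicit none
  integer, allocatable :: foo(:,:)

  allocate(foo(100,100))
  foo = 0

  ! Both spellings designate the first column, which is contiguous
  ! in Fortran's column-major storage order.
  print '(A,L1)', 'foo(1:,1) contiguous: ', is_contiguous(foo(1:,1))
  print '(A,L1)', 'foo(:,1)  contiguous: ', is_contiguous(foo(:,1))

  ! For contrast: a row has a stride of 100 elements between
  ! neighbours, so it is not contiguous.
  print '(A,L1)', 'foo(1,:)  contiguous: ', is_contiguous(foo(1,:))
end program check_contiguity
```

If the first two lines report T, the extra work seen in case #1 is not something the language requires for this slice.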
Here is another issue on the same topic (the compiler generates temporaries for contiguous slices):
If you compile your code with "-check arg_temp_created" or "-check all" you will get a message that tells you when a temporary is created. In that way it is easier to discover than waiting for performance bottlenecks to appear.
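For comparison, here is a sketch of a case where a temporary is genuinely required, which makes a convenient test of whether the check reports anything on a given compiler version. The names temp_demo and set_first are invented for this example, and whether "-check arg_temp_created" prints a message for it is an assumption to verify rather than a documented guarantee:

```fortran
program temp_demo
  implicit none
  integer, allocatable :: foo(:,:)

  allocate(foo(100,100))
  foo = 0

  ! foo(1,:) is a row: its elements are strided in memory, so passing
  ! it to a CONTIGUOUS assumed-shape dummy forces the compiler to
  ! copy in and copy out through a temporary.
  call set_first(foo(1,:))

  ! The copy-out must make the update visible in the original array.
  if (foo(1,1) /= 42) error stop 'copy-out failed'
  print '(I0)', foo(1,1)
contains
  subroutine set_first(bar)
    integer, dimension(:), intent(inout), contiguous :: bar
    bar(1) = 42
  end subroutine set_first
end program temp_demo
```

Building this with and without the check option, next to the original reproducer, shows whether the diagnostic covers copies made for contiguous dummies at all.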
I tried both "-check arg_temp_created" and "-check all" with the reproducer from @csubich. Neither produced any messages.
@csubich, thank you for reporting this performance issue. I like this small reproducer. I filed a bug report, CMPLRLLVM-48930.
@hakostra1, thanks for pointing out the similar issue. I let the compiler engineers know that the 2 bug reports might be related. I'll let them decide.
@Barbara_P_Intel, it would seem you have read my book, "The Wisdom of dealing with Engineers", published by Bantam Press in 1675, rule number 1313: always allow the engineer to decide, otherwise they cry foul to their mothers.
I had this last week when a contractor told me he never made engineering decisions; that was for the engineer. I then asked him if we could do it that way and he said no. I laughed. Actually a true story.
Engineers are like two-year-olds: you cannot say anything negative or tell them how it is done. They know already; they were taught by the great professors, and they will follow that bible till a new bible comes along.
When once asked what I did, I laughed and said, write new bibles.
John
I just checked the performance with an internal build of the next ifx compiler version 2024.2. The performance is MUCH improved.
Look for this release in mid-2024.