Intel® Fortran Compiler

Data alignment issues with /QxSSE2, real_size:64 and polymorphic variables

IanH
Honored Contributor III
955 Views
The attached, with a reasonably specific set of compiler options, shows what I think is an "optimiser" bug, where one of those funky SSE2+ type opcodes is used to move some data but the destination is occasionally a memory address (associated with a temporary) that is not appropriately aligned.

optim-check.f90

optim-check-modules.f90


(No useful output is generated by the example - the debug run is just to show that the code is potentially sane).


[plain]
>ifort /check:all /warn:all optim-check-modules.f90 optim-check.f90 /Feoptim-check-dbg.exe
Intel Visual Fortran Compiler XE for applications running on IA-32, Version 12.0.1.127 Build 20101116
Copyright (C) 1985-2010 Intel Corporation.  All rights reserved.

Microsoft  Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:optim-check-dbg.exe
-subsystem:console
optim-check-modules.obj
optim-check.obj


>optim-check-dbg.exe


>ifort /O2 /Oy- /traceback /QxSSE2 /real_size:64 /warn:all optim-check-modules.f90 optim-check.f90 /Feoptim-check-opt.exe
Intel Visual Fortran Compiler XE for applications running on IA-32, Version 12.0.1.127 Build 20101116
Copyright (C) 1985-2010 Intel Corporation.  All rights reserved.

Microsoft  Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:optim-check-opt.exe
-subsystem:console
-incremental:no
optim-check-modules.obj
optim-check.obj


>optim-check-opt.exe
forrtl: severe (157): Program Exception - access violation
Image              PC        Routine            Line        Source
optim-check-opt.e  00401183  _DATAOBJECTS_mp_A          30  optim-check-modules.f90
optim-check-opt.e  00401FE3  _INTERMEDIATE_mp_          88  optim-check.f90
optim-check-opt.e  0040186E  _INTERMEDIATE_mp_          68  optim-check.f90
optim-check-opt.e  0040120C  _MAIN__                    99  optim-check.f90
optim-check-opt.e  00430E63  Unknown               Unknown  Unknown
optim-check-opt.e  0041B9A9  Unknown               Unknown  Unknown
kernel32.dll       7C817077  Unknown               Unknown  Unknown
[/plain]

Apologies for it not necessarily being the minimum reproducer, but I've had my fill of staring at assembly code today...
6 Replies
jimdempseyatthecove
Honored Contributor III
IanH,

Good work on isolating the bug. Until Steve has a look at the code, it looks like the stack temporary allocations are not abiding by alignment requirements. Unfortunately you did not include ContainerObjects or DataObjects.

It is unknown what ContA::data and ContB::data are. The assumption is that they are arrays of REALs, so this may be an SSE movaps using a memory reference that is not properly aligned. Your error diagnostic did not indicate whether the problem was alignment or an unmapped/protected memory page.
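
One way to tell whether a given array is suitably aligned for an aligned SSE move such as movaps (which needs a 16-byte-aligned memory operand) is to print its address modulo 16. A minimal sketch, assuming a Fortran 2003 compiler with ISO_C_BINDING; the array `a` is a hypothetical stand-in, not something from the attached files:

```fortran
program check_alignment
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t, c_double
  implicit none
  real(c_double), target :: a(8)      ! hypothetical test array
  integer(c_intptr_t) :: addr1, addr2

  a = 0.0_c_double

  ! Reinterpret the C address of an element as an integer of pointer size.
  addr1 = transfer(c_loc(a(1)), addr1)
  addr2 = transfer(c_loc(a(2)), addr2)

  ! An offset of 0 means the element starts on a 16-byte boundary.
  ! a(1) and a(2) are 8 bytes apart, so at most one of them can be aligned.
  print '(a,i0)', 'a(1) offset in 16-byte block: ', modulo(addr1, 16_c_intptr_t)
  print '(a,i0)', 'a(2) offset in 16-byte block: ', modulo(addr2, 16_c_intptr_t)
end program check_alignment
```

This only inspects named variables. It cannot see a compiler-generated temporary, so for the case in this thread the disassembly around the faulting PC is still the place to look.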

In this case, and in any case where the standard requires the compiler to generate a temporary because the operands might be aliased and overlap, it would be nice to have a NOALIAS attribute.

Prototype of suggestion follows:

CLASS(ContB), POINTER, NOALIAS :: s1 => NULL()

CLASS(ContA), POINTER, NOALIAS :: s2 => NULL()

CLASS(ContB), POINTER, NOALIAS :: s3 => NULL()

Then the statement

s3%data = s1%data + s2%data

would not generate the stack temporary
(I do not have IFV 2003, this may be an attribute already)

Jim Dempsey

IanH
Sorry - I think the needed modules are there - they are in the same source file as the main program. I was moving program units around a bit because the bug is sensitive to that (a consequence of whole-source-file optimisation I guess?).

I guess another way of working around the temporary would be just to map the a = b + c statement into a subroutine call with three non-pointer arguments - then aliasing rules would allow the compiler to assume inside the subroutine that the programmer hasn't aliased the arguments?
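
A minimal sketch of that workaround, assuming a simplified container type: `cont`, `add_arrays` and the array sizes are hypothetical stand-ins, and the real ContA/ContB types in the attachment are polymorphic and may differ. Whether the compiler actually drops the temporary is up to the compiler.

```fortran
module add_helper
  implicit none
contains
  ! The dummy arguments are plain assumed-shape arrays (no POINTER or
  ! TARGET), so the argument-aliasing rules let the compiler assume that
  ! "res" does not overlap "x" or "y" while it is being defined here.
  subroutine add_arrays(x, y, res)
    real, intent(in)  :: x(:), y(:)
    real, intent(out) :: res(:)
    res = x + y
  end subroutine add_arrays
end module add_helper

program demo
  use add_helper
  implicit none

  type :: cont                      ! hypothetical stand-in for ContA/ContB
    real, allocatable :: data(:)
  end type cont

  type(cont), pointer :: s1, s2, s3

  allocate(s1, s2, s3)
  allocate(s1%data(4), s2%data(4), s3%data(4))
  s1%data = 1.0
  s2%data = 2.0

  ! Replaces "s3%data = s1%data + s2%data" at the call site.
  call add_arrays(s1%data, s2%data, s3%data)
  print *, s3%data

  deallocate(s1, s2, s3)
end program demo
```

The caller takes on the responsibility here: if the actual arguments really do overlap, the call is non-conforming and the result is undefined.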

Wendy_Doerner__Intel
Valued Contributor I
Thanks for the test case. I will take a look at it and post my analysis here.

------

Wendy

IanH
Hello Wendy - did you have any luck with this? I'm curious as to whether this is really an optimiser problem, or whether it is Fortran programmer error...
Wendy_Doerner__Intel
Ian,

This test case no longer gives an access violation with the 12.0 Update 2 compiler. Can you check on your end whether the alignment problem is fixed for you? This update should be posted by the end of the week.

------

Wendy

IanH
Ok - will do. Thanks for your help.