We verified that the ifort 11.0 and 11.1 option -ipo (also implied by -fast) breaks the MPI_Bcast call in a NAS Parallel Benchmark. Even though the data items in the labeled COMMON that is set up as the Bcast buffer are all 32-bit, -ipo pads some of them to 64-bit boundaries. This breaks the legacy assumption that padding in a COMMON occurs only as needed to move an item to a boundary that is a multiple of its size.
The option -no-ansi-alias suppresses this optimization. It also permits code that violates the standard's rules on aliased subroutine arguments, and so suppresses the optimizations that depend on the absence of argument aliasing.
The method considered correct is to use an array as the Bcast buffer, with EQUIVALENCE or TRANSFER when mixed data types are needed.
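As a minimal sketch of the array-buffer approach, assuming two default integers and one default real need to be broadcast: the values are packed into a single integer array, with TRANSFER copying the bit pattern of the real into an integer slot. The names niter, nx, tol, buf and nbuf are hypothetical and are not taken from the NPB source. The MPI_Bcast call is left as a comment, with root and comm as placeholders, so that the sketch builds without an MPI library.

```fortran
program bcast_buffer_sketch
  implicit none
  integer, parameter :: nbuf = 3   ! hypothetical buffer length
  integer :: niter, nx             ! default integers (hypothetical names)
  real    :: tol                   ! default real, same storage size as a default integer
  integer :: buf(nbuf)             ! contiguous array used as the broadcast buffer

  niter = 10
  nx    = 64
  tol   = 1.0e-6

  ! Pack: integers are copied directly; the real is stored bit-for-bit via TRANSFER.
  buf(1) = niter
  buf(2) = nx
  buf(3) = transfer(tol, buf(1))

  ! In an MPI program the broadcast would go here (root and comm are placeholders):
  ! call MPI_Bcast(buf, nbuf, MPI_INTEGER, root, comm, ierr)

  ! Unpack: reverse the packing step on every rank.
  niter = buf(1)
  nx    = buf(2)
  tol   = transfer(buf(3), tol)

  print *, niter, nx, tol
end program bcast_buffer_sketch
```

Because array elements are contiguous by definition, the element count passed to MPI_Bcast does not depend on how the compiler lays out separate variables in a COMMON.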
2 Replies
Quoting - tim18
We verified that the ifort 11.0 and 11.1 option -ipo (also implied by -fast) breaks the MPI_Bcast call in a NAS Parallel Benchmark. Even though the data items in the labeled COMMON that is set up as the Bcast buffer are all 32-bit, -ipo pads some of them to 64-bit boundaries. This breaks the legacy assumption that padding in a COMMON occurs only as needed to move an item to a boundary that is a multiple of its size.
The option -no-ansi-alias suppresses this optimization. It also permits code that violates the standard's rules on aliased subroutine arguments, and so suppresses the optimizations that depend on the absence of argument aliasing.
The method considered correct is to use an array as the Bcast buffer, with EQUIVALENCE or TRANSFER when mixed data types are needed.
Hi tim18,
Thank you for the investigation.
Do you have access to the internal Intel Tracker? Could you submit a bug report about this issue? If not, I can do it.
Regards!
Quoting - Dmitry Kuzmin (Intel)
Hi tim18,
Thank you for the investigation.
Do you have access to the internal Intel Tracker? Could you submit a bug report about this issue? If not, I can do it.
Regards!
I've also looked up publications by the authors of the offending code (and spoken with one of them), and it appears they no longer recommend using COMMON as a Bcast buffer in the way they did in the original version of NPB.
As there is an ABI requirement for local objects of 16 bytes or more to take 16-byte alignment, it should not be surprising if -ipo extended that alignment to objects in a COMMON. However, it is surprising for padding to be applied to 4-byte objects.
If you have further evidence on this question, it would be useful if you submitted a tracker report.