Hi,
I am looking for examples of parallelizable code that cannot be auto-parallelized by the Intel compiler.
Thank you,
David
http://www.dalsoft.com
1 Reply
In the classic benchmark,
http://www.netlib.org/benchmark/vectors
there are 12 loops (according to my testing) that are usefully parallelizable, as opposed to, or in addition to, being auto-vectorizable. ifort 10.1 could parallelize a majority of them with help, including rewriting them in a form suitable for OpenMP parallelization. In a few cases, ifort 10.1 -Qparallel uncovered loop interchange optimizations which were better exploited without parallelization. Earlier and later versions of ifort don't auto-parallelize this benchmark successfully.
http://sun.systemnews.com/articles/135/1/Performance/21682 pointed out some cases in SPECfp2006 where the Sun compilers parallelized successfully, while the Intel compilers current at that time were less successful. Note also that while auto-parallelization improves SPECfp performance, it reduces SPECfp_rate performance. The Sun compiler optimizes specifically for the Xeon 5500 series, but not for earlier Intel CPUs. SPECfp2006 includes a case (in C) where full performance requires parallelizing both the first-touch initialization and the working loops, with the thread pool persisting across multiple loops.
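The first-touch pattern mentioned above can be sketched roughly as follows. This is not the SPECfp2006 code; first_touch_sum and its arguments are hypothetical, and the "work" is a trivial reduction. The point is that the initialization loop and the working loop sit inside one parallel region with the same static schedule, so the same thread team is reused and each thread works on the elements it initialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: initialize a[] and then reduce over it, both inside
 * a single OpenMP parallel region so the thread team persists across loops. */
double first_touch_sum(double *a, size_t n)
{
    assert(a != NULL);

    double sum = 0.0;
    long len = (long)n;

    #pragma omp parallel
    {
        /* First touch: each thread writes its own chunk. On a NUMA system
         * with a first-touch page placement policy, previously untouched
         * pages end up in memory local to the writing thread. */
        #pragma omp for schedule(static)
        for (long i = 0; i < len; ++i) {
            a[i] = (double)i;
        }

        /* Working loop: same iteration count and same static schedule,
         * so each thread revisits the chunk it initialized. */
        #pragma omp for schedule(static) reduction(+:sum)
        for (long i = 0; i < len; ++i) {
            sum += 2.0 * a[i];
        }
    }
    return sum;
}
```

The placement benefit only applies when the array has not been written before the call (for example, memory fresh from malloc). Built without OpenMP, the pragmas are ignored and the function runs serially with the same result.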