Performance of CVF6.6 and IF7 optimized code

kdkeefer · 02-03-2003

I work on a large Monte Carlo problem (literally hundreds of billions of iterations and tens of hours of CPU time). CVF6.6 code, generated with a variety of optimization options, always runs 15-25% faster than IF7 code, even with the vectorizer and IPO enabled. (I use a 1.8GHz P4 with 512MB of 800MHz Rambus memory on an ASUS PT4 motherboard with the Intel 850 chipset.)
I've rewritten code to avoid pipeline flushes by generating random numbers in a separate loop and storing them in arrays, so they are not mixed with memory accesses (a very unnatural style for old fart programmers used to register arithmetic being faster than memory accesses, especially when cache space is at a premium).
Both CVF and IF7 ran faster after the rewrite (5% and 10%, respectively), but the presumably less processor-specific CVF code was still significantly faster (25% of 100 hours means I get results a day sooner).
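For readers following along, the restructuring described above can be sketched roughly as follows. This is only an illustration of the loop structure, not the actual Fortran code, which is not shown in this thread. The generator (a simple linear congruential generator), the lookup `table`, and all function names are hypothetical stand-ins, and Python will not reproduce the timing effect seen with compiled code.

```python
# Hypothetical sketch of "generate random numbers in a separate loop".
# The LCG constants and the table lookup are placeholders, not the
# poster's actual Monte Carlo kernel.
M = 2**31
A = 1103515245
C = 12345


def lcg(state):
    """Advance a simple linear congruential generator by one step."""
    return (A * state + C) % M


def interleaved(n, seed, table):
    """Original style: draw each random number right where it is used."""
    state, total = seed, 0.0
    for _ in range(n):
        state = lcg(state)                    # register-style arithmetic
        total += table[state % len(table)]    # mixed with a memory access
    return total


def batched(n, seed, table, block=1024):
    """Rewritten style: fill a buffer first, then consume it."""
    state, total = seed, 0.0
    buf = [0] * block
    done = 0
    while done < n:
        m = min(block, n - done)
        for i in range(m):                    # loop 1: generation only
            state = lcg(state)
            buf[i] = state
        for i in range(m):                    # loop 2: memory accesses only
            total += table[buf[i] % len(table)]
        done += m
    return total
```

Both functions consume the same random sequence in the same order, so they return the same result; only the arrangement of the work differs. The block size is an arbitrary assumption, and in compiled code it would presumably be tuned so the buffer stays in cache.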
I have had previous experience with, e.g. Lahey, which used Intel code generation and whose code improved dramatically when they switched to Fujitsu generators.
I do not intend to malign Intel team members, but will the "convergence" of DVF/CVF and IF include whatever techniques and knowledge the former DEC members used to produce a superior product, now that both (former) sides can share their wisdom?
Sincerely,
Keith

Steven_L_Intel1 · 02-03-2003

We sure hope so...

A lot of the performance problems, relative to CVF, of the Intel Fortran compiler come from its front-end, not the code generator. The Intel front-end has not had the attention paid to it that the Compaq FE has regarding important performance optimizations such as eliminating unnecessary array temporaries.

On the code-generator side, a lot of the Compaq "GEM" optimizer/CG team is now working for Intel and applying their experience and expertise to improving the Intel code generators.

It's impossible to make blanket predictions, but we hope you'll be pleased by the end result.

Steve