Intel® Fortran Compiler

Size of binaries

benh
Beginner
697 Views
What's up with the binaries (executables) produced by IVF8.1?

When I compare a compilation of exactly the same source files produced using VC6.0+CVF6.6 with the output from VC7.1+IVF8.1.030, I get the following result for my "sample" Windows EXE file:

Debug build (non-optimized):
CVF - 5033 KB
IVF - 8288 KB

Release build (with optimization):
CVF - 1584 KB
IVF - 6152 KB

Arguably, different levels of optimization capabilities could explain the release build, but that doesn't explain the other differences. Still, I think the difference is suspiciously large even for optimization differences... This makes me a bit concerned about the memory footprint of this particular application, as the application in question already puts high demands on memory.
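To put the numbers above in proportion, here is a minimal sketch that computes the IVF-to-CVF size ratio per build type. The sizes are the ones reported in this post; the dictionary layout and the function name `ivf_to_cvf_ratio` are just illustrative choices, not anything from either compiler.

```python
# Binary sizes in KB, as reported in the post above.
sizes_kb = {
    "debug": {"CVF": 5033, "IVF": 8288},
    "release": {"CVF": 1584, "IVF": 6152},
}


def ivf_to_cvf_ratio(build):
    """Return how many times larger the IVF binary is than the CVF one."""
    entry = sizes_kb[build]
    return entry["IVF"] / entry["CVF"]


for build in sizes_kb:
    print(f"{build}: IVF binary is {ivf_to_cvf_ratio(build):.2f}x the CVF size")
```

The debug build comes out at roughly 1.65x and the release build at roughly 3.88x, which is why the release figure is the one that stands out.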

(Runtime libraries are all multithreaded DLLs).

Regards
-+-Ben-+-
5 Replies
Steven_L_Intel1
Employee
The debug build I can understand - until very recently, bounds checking caused a significant inflation of code size. (Data, not instructions.) Do you by any chance have bounds checking enabled in your release configuration? What options are you using for each compiler on the release build? (Show all the switches shown under Project Options in CVF or Command Line in IVF.)
TimP
Honored Contributor III
Without a specific indication of which optimizations you used, or the nature of the application, my comments may not be useful. I'm not familiar with which options, if any, would be used to optimize for size in CVF. ifort evidently isn't geared to optimization for size. I suppose you could try ifort -Qprec with otherwise default optimizations (no vectorization). That might be fairly comparable with df /optimize:4 (or 5) /fast.
No way a normal vectorized ifort -QxW build could match the size of a CVF build. There isn't much intersection between the two compilers in specific CPU targets.
benh
Beginner
Thanks for the tips! I did some experiments based on the options mentioned above.

First, the full set of options I had in my release build were mostly these for the .for files:

/nologo /Ob2 /Oy- /QaxP /Qparallel /assume:buffered_io
/include:".Release2/" /define:RELEASE_EXTERNAL /define:NDEBUG
/define:DEBUG_CPU /define:COMPILER_IS_IVF /extend_source:132
/fpscomp:ioformat /fpscomp:logicals /warn:errors /stand:f90
/warn:declarations /warn:unused /warn:truncated_source /Qsave
/align:commons /align:sequence /Qzero /fpe:0 /Op /fpconstant
/Qfpstkchk /names:as_is /iface:cvf /module:"$(INTDIR)/"
/object:"$(INTDIR)/" /traceback /check:bounds /check:format
/check:output_conversion /check:arg_temp_created
/libs:dll /threads /c


I found a significant reduction in the resulting binary size when removing either or both of these options:
/QaxP /check:bounds

Other changes that had an effect on the binary size, but not as significant, were adding /Qunroll0 and/or removing /Qparallel.

Resulting binary sizes for this specific example were:
2964 KB by removing /check:bounds
4516 KB by removing /QaxP
1912 KB by removing both
1772 KB, previous + adding /Qunroll0, removing /Qparallel

This is fairly close to the CVF size achieved, and is IMHO a reasonable explanation of the observations.
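The sizes above can be lined up against the original builds with a few lines of arithmetic. This is only a sketch: it assumes the baseline is the 6152 KB IVF release build from the first post (the full option set listed above), and the names `BASELINE_KB`, `CVF_RELEASE_KB` and `summarize` are made up for the example.

```python
# Assumption: the baseline is the IVF release build from the first post,
# built with the full option set (including /QaxP and /check:bounds).
BASELINE_KB = 6152
# CVF release build from the first post, used as the comparison target.
CVF_RELEASE_KB = 1584

# Sizes in KB reported for each experiment.
variants_kb = [
    ("remove /check:bounds", 2964),
    ("remove /QaxP", 4516),
    ("remove both", 1912),
    ("remove both, add /Qunroll0, remove /Qparallel", 1772),
]


def summarize(size_kb):
    """Return (KB saved vs baseline, percent of baseline, ratio to CVF)."""
    saved_kb = BASELINE_KB - size_kb
    percent_of_baseline = 100.0 * size_kb / BASELINE_KB
    ratio_to_cvf = size_kb / CVF_RELEASE_KB
    return saved_kb, percent_of_baseline, ratio_to_cvf


for label, size_kb in variants_kb:
    saved_kb, percent, ratio = summarize(size_kb)
    print(f"{label}: {size_kb} KB, saves {saved_kb} KB "
          f"({percent:.0f}% of baseline, {ratio:.2f}x CVF)")
```

With all four changes applied, the IVF binary ends up at about 1.12x the CVF release size, which matches the "fairly close" impression.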

It also shows that if /check:bounds code size inflation was supposed to have been "fixed" in 8.1.030, then that obviously doesn't seem to work well given the above combination of switches.

(And of course, nothing is conclusive regarding execution times.)

-+-Ben-+-
Steven_L_Intel1
Employee
The fix for /check:bounds was made some time ago - without the fix, things would be MUCH worse. /QaxP does inflate code size as it creates dual code paths for certain sections of code.
benh
Beginner
I'm still a bit puzzled by these observations though. Of course, I did ask for "optimize for speed", not size, so comparing the two compilers' performance based on code size isn't really fair.

I also found out that our CVF project did not apply "/check:bounds" either, so I turned it ON and recompiled. The full set of options for the release build with CVF then becomes:

/alignment:commons /alignment:sequence /assume:buffered_io
/check:bounds /check:noflawed_pentium /compile_only /debug:none
/define:"RELEASE_EXTERNAL" /define:"NDEBUG" /define:"DEBUG_CPU"
/extend_source:132 /fltconsistency /fpconstant /fpe:0
/fpscomp:logicals /fpscomp:ioformat /fpscomp:symbols
/include:".Release/" /libs:dll /math_library:fast /names:as_is
/nologo /optimize:3 /warn:argument_checking /warn:declarations
/warn:errors /warn:unused /module:"Release/" /object:"Release/"

This caused the binary size to increase by a mere 500 KB to 2032 KB, compared to the "cost" of almost 3200 KB (more than 6 times the increase!) that IVF requires to achieve the same thing. The extra "cost" is of course lower without the "code-doubling effect" of IVF's /QaxP ("only" 2600 KB), but still a bit demanding compared to CVF, I think.
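For reference, the bounds-checking "cost" figures can be reproduced from the sizes quoted in this thread. The sketch below assumes that each cost is simply the difference between the build with /check:bounds and the otherwise identical build without it; the variable names are illustrative only.

```python
# Sizes in KB quoted in this thread (with / without /check:bounds).
cvf_with, cvf_without = 2032, 1584
ivf_axp_with, ivf_axp_without = 6152, 2964        # /QaxP enabled
ivf_noaxp_with, ivf_noaxp_without = 4516, 1912    # /QaxP removed

# Assumption: cost of bounds checking = size difference of the two builds.
cvf_cost = cvf_with - cvf_without
ivf_cost_with_axp = ivf_axp_with - ivf_axp_without
ivf_cost_without_axp = ivf_noaxp_with - ivf_noaxp_without

print(f"CVF /check:bounds cost:            {cvf_cost} KB")
print(f"IVF /check:bounds cost with /QaxP: {ivf_cost_with_axp} KB "
      f"({ivf_cost_with_axp / cvf_cost:.1f}x CVF)")
print(f"IVF /check:bounds cost, no /QaxP:  {ivf_cost_without_axp} KB "
      f"({ivf_cost_without_axp / cvf_cost:.1f}x CVF)")
```

The exact differences are 448 KB for CVF, 3188 KB for IVF with /QaxP and 2604 KB for IVF without it, so the rounded figures in the text hold up.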

IVF also compiles very much slower than CVF does. Perhaps there are some other differences between the two sets of options I've listed that can explain why?

I'd like to set up the compilation with IVF to be as fast as it was with CVF, with approximately the same features. So far it seems we don't really gain much in everyday usage by switching from CVF to IVF, except being more or less forced to switch in order to be compatible with Studio .NET at all... But I guess we'll have to consider the potential of /QaxP and /Qparallel too before we conclude.

-+-Ben-+-