Re: making a program run faster...

mdfischer · ‎07-19-2002

Hi

I'm looking for general suggestions on how to make a program run faster. The program I am working on right now takes way too long (~6 days) and I need to figure out how to improve that, but I've never done this before.

I have a 1.5 GHz machine with 512MB RAM and 80GB disk space. I realize that specific solutions will be impossible to provide but any suggestions on where I should start would be appreciated!

Thanks
Michele

njuffa · ‎07-20-2002

The most important thing to do is to check your algorithm(s). That's usually where the biggest gains are. For example, if you can substitute an O(n*log(n)) algorithm for a O(n**2) algorithm it will make a huge difference if n is large.

If you can tell us what your program does, and what the major algorithm(s) used are, someone reading this board might have an idea on how to improve it.

-- Norbert

davinci · ‎07-21-2002

Several things may speed up your program, it depends what you are doing. Unformatted file i/o in large buffers/streams, data compression, nested iterations in the proper order, compilation of a TESTED program in release mode without checks/warnings, memory management, for instance smaller time consuming DLL subroutines running entirely within the first level cache of the computerchip, cpu-parallelism etcetera. There is some literature on efficient Fortran programming and optimisation, try to find it. Anton Kruger, efficient Fortran programming, isbn 0471528943, but more. DaVinci

njuffa · ‎07-21-2002

In terms of books that may be helpful when addressing a performance problem, there are several besides the book by Kruger (which is older and out of print but may be available used; I bought a copy recently via Amazon):

Jon Louis Bentley, "Writing Efficient Programs", Prentice Hall 1982 (also out of print, and quite old)

Kevin Dowd & Charles Severance, "High Performance Computing 2nd ed ", O'Reilly 1998

David Loshin, "Efficient Memory Programming", McGraw-Hill 1998

Stefan Goedecker & Adolfy Hoisie, "Performance Optimization of Numerically Intensive Codes", SIAM 2001

Somewhat platform specific, but still useful for insight into general high-level performance optimization:

Kevin R Wadleigh & Isom L Crawford, "Software Optimization for High Performance Computing", Prentice Hall 2000 (IA64, PA-RISC)

Rajat P. Garg, Ilya A Shaparov, "Techniques for Optimizing Applications: High Performance Computing", Prentice Hall 2001 (Sun SPARC)

On the border between algorithms and performance optimization:

Jon Bentley, "Progamming Pearls 2nd ed", Addison Wesley 2000

Of course there are also optimization guidelines published by various HW vendors (e.g. IBM, Intel, AMD) for their specific platforms. A lot of them deal more with low level optimization issues. These documents are usually available for download from the vendor's website. Example: CRAY T3E Fortran Optimization Guide, SG-2518 3.0 at http://www.cray.com/swpubs/dynaweb/docs/titles.html

-- Norbert

Intel_C_Intel · ‎07-22-2002

Just a short suggestion, sometimes vast improvements can be made by copying all the files used by the program to the local machine, thus reducing file accesses over a network.

isn-removed200637 · ‎07-22-2002

0. Decide by what factor reduction in computing time for a given problem is desired.
1. You must do some tests to find out where your program spends most of its time. Prioritise the work on those sections of code that take the most time.

2. Since execution of every instruction takes time, the only way to improve is to reduce the number of instructions that need to be executed. If you want to reduce the time taken by factors rather than percentages, then large amounts of instructions need to be removed, or large amounts of slow speed-storage access reduced.
3. Consider programming the most numerically intensive sections in assembly code if possible as you can then
usually omit a lot of padding that most compilers
automatically put in. This can pay dividends in speed as
the number of instructions used are reduced.
4.Examine code inside of nested DO or FOR loops and try and
first reduce the number of statements. This should reduce the number of coded instructions and the pay-off is multiplied by the depth of the loop and the number of times it/they are executed.
5. Use local variables to store instances of intermediate computations that are repeated elsewhere and just substitute the local variable in place of later instances.
At the assembly code level, you may be able to keep local variables
in high-speed memory and avoid lots of automatic store-value-at-storage-address and get-value-from-storage-address instructions that just waste time if you do not ever need the value again except in local computations.
6. Try and parallise if possible.
7. Use multiply instead of divide where possible.
8. use integer multiply instead of floating point multiply if possible.
9. use REAL*4 instead of REAL*8 where it will not compromise accuracy..
8. Minimise calls to trig, log and exponential functions as they could take 10-20 times the time taken to do a single multiply of two numbers.
9. All such intrinsic functions are programmed to give machine accuracy. If you find that a particular function or functions use a significant amount of time, consider the accuracy that you need and then consider writing your own version to give the accuracy that you want with a smaller number of statements/instructions. This can give a big payoff.

HTH

Intel_C_Intel · ‎07-22-2002

The first thing to do is find out if there is some
part of your program that is eating up most of the time.
Concentrate you efforts on that part. I was recently
surprised to find, buried in one of the deepest loops,
an internal write operation used to set up a status
variable that was only very occassionally printed.
When I got rid of the internal write, the routine in
question ran about 6 times as fast, and the overall
program sped up by 20%. I would never have suspected
this, and only found it my careful profiling of the
code.

Look for opportunities for parallelization. If your
code is running a large set of independent calculations,
split them up among several machines. The total CPU
time won't go down, but it will get done faster.

mdfischer · ‎07-22-2002

Thanks to everyone for their suggestions. The code seems to take the longest while it is in a set of nested loops. I will check that first, and some other ideas I have after reading all of your suggestions.

Thanks again.
Michele

Intel_C_Intel · ‎07-23-2002

Use array statements to replace loops.

For example, if you can replace:

Do i=1,100
a(i)=b(i)*c(i)
End Do

with

a=b*c

or, if you are only going across a portion of the dimension:

a(1:100)=b(1:100)*c(1:100)