LEN_TRIM performance issue

olof_liungman · ‎12-19-2006

Hi!

I have run the "gprof" profiling tool on a numerical code for calculating particle movement in a fluid flow. The code is standard Fortran 90, and I compiled it first with -pg and then with -pg -O3. To my surprise, the profiling revealed that about 17-18% of execution time is spent in each of LEN_TRIM and CPSTR.

We use LEN_TRIM quite extensively (about 100-200 instances in the code) for extracting control parameters, file names, etc. from character constants. However, I certainly did not expect this to have an impact on performance.

Is this a known performance issue? Could I be mistaken?

Thanks,

Olof

Steven_L_Intel1 · ‎12-19-2006

I have not been aware of LEN_TRIM as a specific performance issue, but by its nature it's not going to be all that fast. I'm puzzled that you'd have calls to LEN_TRIM in the compute-intensive section of the code. Typically, parsing of commands is done just once at the beginning of execution. 100-200 calls on LEN_TRIM is a lot in my view. What kind of program structure leads to that? If you're using character constants, why not use LEN() instead?

olof_liungman · ‎12-20-2006

As I wrote earlier, the code is a particle tracking model. It is used in an operational oil spill forecasting system. Because we wanted to write a general, modular and extensible code, the particles may represent many different substances or objects. Depending on what the particles represent, different processes are activated. Many of these processes need to be calculated within time loops. For example, to decide whether a subroutine describing a particular process should be called, we check the type of substance. Of course, there are ways around this, such as parsing the string input at the beginning of the simulation and replacing the strings with several different logicals.

Since we do not know at compile time what a particular string will contain (e.g. a file path), we define long character variables and then strip out the blanks at run time when we want to, e.g., concatenate a file path and a file name. Thus LEN is not really an option.

It should be said that our code is not optimized for speed, but for clarity. Thus I am sure we can do a lot to increase performance. From answers to my post at comp.lang.fortran (Google group) it seems that LEN_TRIM is indeed slow in certain cases. Nevertheless, it seems strange that it would take up such a large part of the actual CPU time.

Steven_L_Intel1 · ‎12-20-2006

Hard to say. LEN_TRIM is an inherently slow operation compared to typical mathematical computations. It does byte-at-a-time memory access. Depending on how long the variables are, it may spend a lot of time looping over blank characters. It's not the sort of thing I'd expect to see in a compute-intensive section of an application.
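To illustrate why long, mostly blank variables are costly, here is a minimal sketch of what a straightforward LEN_TRIM amounts to. This is not the actual run-time library routine; naive_len_trim, the 4096-character buffer and the path are made up for illustration.

```fortran
! Illustrative only: a naive equivalent of LEN_TRIM, not the
! compiler's run-time library implementation.
program len_trim_sketch
  implicit none
  character(len=4096) :: path          ! hypothetical long buffer
  path = '/data/run1/input.nml'        ! hypothetical file path
  ! Both calls return the same value; the naive one makes the cost visible.
  print *, naive_len_trim(path), len_trim(path)
contains
  function naive_len_trim(s) result(n)
    character(len=*), intent(in) :: s
    integer :: n
    ! Scan backward from the end, one character at a time,
    ! until a non-blank character is found.
    n = len(s)
    do while (n > 0)
      if (s(n:n) /= ' ') exit
      n = n - 1
    end do
  end function naive_len_trim
end program len_trim_sketch
```

With a 20-character path in a 4096-character variable, the loop steps over more than 4000 trailing blanks on every call, so the cost scales with the declared length rather than with the useful content.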

Perhaps what this really says is that the computational part of your application is relatively minor?

jimdempseyatthecove · ‎12-20-2006

I suggest you create a type for strings which contains a length and your string buffer. When you load or modify the string, you perform the LEN_TRIM on the string and store the result in the length member, i.e. perform LEN_TRIM only once per unknown string. You can get fancy and make operator functions for your type such that the LEN_TRIM occurs automatically when you modify the string.
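A rough sketch of such a type might look like the following. All names here (counted_string_mod, counted_string, MAX_LEN, assign_from_char) and the buffer size are assumptions for illustration, not something from the original code.

```fortran
module counted_string_mod
  implicit none
  integer, parameter :: MAX_LEN = 4096        ! assumed buffer size

  type counted_string
    character(len=MAX_LEN) :: buf             ! blank-padded contents
    integer                :: n               ! cached LEN_TRIM(buf)
  end type counted_string

  ! Defined assignment: LEN_TRIM runs once, when the string is set.
  interface assignment(=)
    module procedure assign_from_char
  end interface

contains

  subroutine assign_from_char(lhs, rhs)
    type(counted_string), intent(out) :: lhs
    character(len=*),     intent(in)  :: rhs
    lhs%buf = rhs                             ! pads (or truncates) to MAX_LEN
    lhs%n   = min(len_trim(rhs), MAX_LEN)
  end subroutine assign_from_char

end module counted_string_mod

program counted_string_demo
  use counted_string_mod
  implicit none
  type(counted_string)     :: dir, fname
  character(len=2*MAX_LEN) :: full

  dir   = '/data/run1/'                       ! hypothetical path
  fname = 'input.nml'                         ! hypothetical file name

  ! Concatenate using the cached lengths; no LEN_TRIM or TRIM here.
  full = dir%buf(:dir%n) // fname%buf(:fname%n)
  print *, full(:dir%n + fname%n)
end program counted_string_demo
```

Code inside the time loops then reads the stored length member instead of rescanning the buffer.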

joseph-krahn · ‎12-21-2006

I have found that almost all string operations are surprisingly slow for many compilers. Strings have always been a very low priority in Fortran, so there just has not been a lot of effort to optimize them. I find that working with strings as INTEGER*1 arrays can easily give a 10-fold speed increase to get the same result (maybe that's why Intel has the BYTE type?).

Proper filename support should use PATH_MAX length strings, which can be >4000. LEN_TRIM has to check through all of those blanks, and string copying has to blank-pad all of the destination strings. The best performance approach is to store the length of each string, and only use the leading valid characters (i.e. don't blank-pad strings).

For example, this:
len_string1=len_string2
string1(:len_string1) = string2(:len_string2)

can be MUCH faster than:
string1=string2
len_string1=len_trim(string1)

If you are using strings as tokens, the strings should be converted into a numeric form (i.e. Atom in the computer-science sense). This lets you be more generalized by avoiding hard-coded enums, but get almost the same performance. If you can get by with 4 or 8 characters, a simple Hollerith conversion can be very efficient.
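One possible way to do the token conversion is a small lookup table filled at startup, sketched below. The names atom_mod, atom, NAME_LEN and MAX_ATOMS, the table limits, and the 'oil' token are all hypothetical; tokens longer than NAME_LEN are not handled.

```fortran
module atom_mod
  implicit none
  integer, parameter :: NAME_LEN = 16, MAX_ATOMS = 64   ! assumed limits
  character(len=NAME_LEN) :: names(MAX_ATOMS)
  integer :: n_atoms = 0

contains

  ! Return the integer id for a token, registering it on first use.
  ! Intended to be called during setup, not inside the time loops.
  function atom(token) result(id)
    character(len=*), intent(in) :: token
    integer :: id
    integer :: i
    do i = 1, n_atoms
      if (names(i) == token) then
        id = i
        return
      end if
    end do
    if (n_atoms >= MAX_ATOMS) stop 'atom table full'
    n_atoms = n_atoms + 1
    names(n_atoms) = token
    id = n_atoms
  end function atom

end module atom_mod

program atom_demo
  use atom_mod
  implicit none
  integer :: oil, substance, step

  oil       = atom('oil')        ! known process keys, registered once
  substance = atom('oil')        ! e.g. a value read from the control input

  do step = 1, 3
    ! Inside the time loop only integers are compared, no strings.
    if (substance == oil) print *, 'oil process active at step', step
  end do
end program atom_demo
```

New substance names can be added at run time without touching a hard-coded enum, while the per-step test stays a single integer comparison.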