We have a simulation program that suffers a mysterious slowdown after
running for some period of time, even though the calculations that are
being performed are identical throughout. We do not suspect a memory
leak, since the process memory footprint does not grow, but do suspect
it has something to do with variable (memory) access. Have you seen or
heard of such behavior before, and is there some solution?
Some more technical details:
Our simulation code is written in F90. We ran a memory check with Sun Studio dbx and found no memory leaks or access violations.
In fact, the same code, when compiled with Sun Studio and run on SPARC, shows no slowdown at any point. Even more mysteriously, the same code, when compiled with Intel ifort but with /arch:sse2, also shows no slowdown.
In other words, we have isolated the problem to the compiler option /arch:ia32 (or -mia32 on Linux). We are targeting a real-time system and are having trouble getting SSE2-enabled code to run on the embedded RT system. At least for now, /arch:ia32 or -mia32 is the only way to make the code run.
Does /arch:ia32 do something bad?
/arch:ia32 should not be doing anything bad.
Do you have a profiler available?
Does the simulation consistently slow down at T=n into the simulation?
If yes to both, then make two simulation runs with the profiler.
One run with sampling beginning at T=0 and run for n seconds (n T's).
A second run with sampling beginning at T=n and run to 2n seconds (2n T's).
Comparing the two profile runs should give you an idea what is happening in your program.
Jim Dempsey
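One way to carry out the comparison of the two profile runs is to reduce each flat profile to a table of function name and self time, then rank functions by how much their share of total time changed between the two windows. The sketch below is only an illustration: the function names and timings are hypothetical, and it assumes the profiler output (gprof, Sun Studio analyzer, VTune) has already been exported into a simple name-to-seconds mapping.

```python
def compare_flat_profiles(early, late, top=5):
    """Rank functions by the change in their share of total self time.

    `early` and `late` map a function name to its self time in seconds,
    taken from the first (T=0..n) and second (T=n..2n) sampling windows.
    """
    early_total = sum(early.values()) or 1.0
    late_total = sum(late.values()) or 1.0
    rows = []
    for name in set(early) | set(late):
        early_share = early.get(name, 0.0) / early_total
        late_share = late.get(name, 0.0) / late_total
        rows.append((name, early_share, late_share, late_share - early_share))
    # Largest absolute change in share first.
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    window_a = {"update_state": 4.0, "integrate": 5.0, "write_output": 1.0}
    window_b = {"update_state": 30.0, "integrate": 6.0, "write_output": 1.0}
    for name, a, b, delta in compare_flat_profiles(window_a, window_b):
        print(f"{name:15s} {a:6.1%} -> {b:6.1%}  ({delta:+.1%})")
```

Comparing shares rather than raw seconds keeps the ranking meaningful even when the second window takes much longer overall, which is the situation described in this thread.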
Hi, Jim,
We did use profiling tools to diagnose this problem. Among the tools we used were gprof on Linux, the Sun Studio analyzer, and Intel's own VTune on Windows.
The results and conclusions are as presented in thread 1: with Sun SPARC and Intel SSE2, there is really not much difference between the first several case runs and the rest of the simulation. But with /arch:ia32, the difference between the first 10 cases and the rest of the cases is striking. Basically, the whole flat profile changes, and the whole simulation shows a dramatic slowdown.
Naturally, we looked at the subroutines highlighted in those profiling results. According to our developers, there should NOT be much difference at all. We do use a lot of module data in some of the subroutines, though. This behavior is very disturbing.
What could it be? Could those globals (module data) be suffering from memory fragmentation due to alloc/realloc-type work? Are there any other possibilities? Why is /arch:ia32 the only option that causes this?
I wonder if Steve Lionel can provide some pointers on these observations.
Since you say you are seeing a slowdown on Sun and with SSE2, your original observation no longer applies. The only thing that comes to mind is excessive page faulting as the program's virtual memory usage increases. Have you looked at the virtual memory usage of the application over time?
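For the Linux build, one rough way to watch virtual memory usage over time is to poll the `VmSize` and `VmRSS` fields of `/proc/<pid>/status` while the simulation runs. The following is a minimal sketch under that assumption; the helper names `parse_vm_status` and `sample_vm` are made up for illustration, and on Windows a different tool (for example Performance Monitor) would be needed instead.

```python
import time


def parse_vm_status(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    usage = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            usage[key] = int(rest.split()[0])
    return usage


def sample_vm(pid, interval_s=1.0, samples=10):
    """Poll a process's memory counters at a fixed interval (Linux only)."""
    history = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as status_file:
            history.append((time.time(), parse_vm_status(status_file.read())))
        time.sleep(interval_s)
    return history
```

Calling `sample_vm(pid)` with the simulation's process ID returns a list of timestamped readings; a `VmSize` that keeps climbing while `VmRSS` stays flat, or the reverse, would be worth correlating with the case at which the slowdown begins.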
Hi, Steve,
Sorry, I must have made a mistake in my previous statement and caused the confusion. (Sometimes I miss a 'NOT' while typing, I apologize ;) ).
The fact is that we do NOT see any slowdown with Sun or with SSE2. The slowdown only happens when I use /arch:ia32 on Windows or -mia32 on Linux (essentially, disabling SSE2).
I will pass your comment back to our developers. We actually have a tool to measure the VM usage, so let me see if we can get more information on that.