I'm encountering a repeatable memory error that goes away as I increase the number of processes. My guess is that some static allocation or other per-process memory limit is being hit, and that running more processes shrinks each process's share of the data until it fits under that limit. So I wanted to use GDB to track down where the memory error crops up in order to fix the code. (Overall memory use is only in the single-digit percent range of what's available when the code cracks.)
Without the '-gdb' option, I can run an instance of the code in just over 1 second. If I add the debugger flag, after I type "run" at the (mpigdb) prompt, I wait and wait and wait. Looking at 'top' in another window I see the mpiexec.hydra process pop up with 0.3% of CPU every once in a while. For example,
[clay@XXX src]$ time mpiexec -n 2 graph500_reference_bfs 15
real 0m1.313s
user 0m2.255s
sys 0m0.345s

[clay@XXX src]$ mpiexec -gdb -n 2 graph500_reference_bfs 15
mpigdb: np = 2
mpigdb: attaching to 1988 graph500_reference_bfs qc-2.oda-internal.com
mpigdb: attaching to 1989 graph500_reference_bfs qc-2.oda-internal.com
[0,1] (mpigdb) run
[0,1] Continuing.
^Cmpigdb: ending..
[mpiexec@XXX] Sending Ctrl-C to processes as requested
[mpiexec@XXX] Press Ctrl-C again to force abort
[clay@XXX src]$
Do I just need to be more patient? If the real problem test case takes almost 500 seconds to reach the error point without the debugger, how patient do I need to be? Or is there something else I should be doing differently to get things to execute in a timely manner? (I've tried to attach to one of the running processes, but that didn't work at all.)
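One fallback for attaching by hand, sketched here as an assumption rather than something I have working: have the process announce its PID and host, then hold in a loop until a debugger releases it. The helper name wait_for_debugger and its timeout parameter are hypothetical. They are not part of graph500 or of the MPI library.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from graph500 or any MPI library).
 * Prints this process's PID and host, then sleeps in one-second steps
 * until a debugger sets *release to nonzero or max_seconds have elapsed.
 * Returns the number of seconds spent waiting. */
static int wait_for_debugger(volatile int *release, int max_seconds)
{
    char host[256];
    int waited = 0;

    if (gethostname(host, sizeof host) != 0) {
        host[0] = '?';
        host[1] = '\0';
    }
    host[sizeof host - 1] = '\0';  /* gethostname may not NUL-terminate */

    fprintf(stderr, "PID %ld on %s waiting for debugger attach\n",
            (long)getpid(), host);

    /* volatile forces the flag to be re-read on every pass,
     * so a write made from the debugger is noticed. */
    while (!*release && waited < max_seconds) {
        sleep(1);
        waited++;
    }
    return waited;
}
```

The idea would be to call this early in main (for an MPI code, presumably right after MPI_Init), attach with gdb -p using the printed PID, move up to the wait_for_debugger frame, and run set var *release = 1 before continuing. The timeout is only there so a forgotten call cannot hang a run forever.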
I was hoping not to have to resort to the most common debugger, the 'printf' statement, if I could help it. And using a real debugger would raise my standing in the eyes of management. :-)
Thanks.
--clay