Hi,
I recently upgraded my laptop and now have a Dell Precision M65 with an Intel Core Duo processor. I am using Compaq Visual Fortran 6.6a with MS Visual Studio 6.0 for development. I transferred all my code and projects to the new machine; the project opens and compiles fine, just as on my other machine. The problem started when I tried to run the program: it crashed as soon as I started the run, with error 157 (access violation).
After 2 days of struggle, I found that the problem was caused by the stack size I specified in the linker settings. It seems it was too large for this machine (???). Since my code is basically a simple number-crunching simulation engine that has to deal with a large number of nodes (cells), I specified a reserve stack size of 500MB so that it never complains about stack overflow when used for big problems (this might not be the right thing to do). After reducing the stack size to around 350MB, the program runs fine.
So here are my questions:
1. Why can't I run the executable built with a 500MB stack size on this new laptop with Core Duo? The same executable runs fine on all other machines (P4, P3, Centrino, Pentium D, etc.). If I reduce the reserve stack size to 369MB, it runs fine, but it crashes when the stack size is 370MB or above. None of the other machines in our office have this problem. Other members of our team use .NET with IVF, and even the code compiled on their machines with a 500MB stack does not run on this laptop. What is so special about Core Duo?
2. How does the reserve stack size work? If I specify 500MB, does the OS reserve that much memory to be used only by the stack? Can allocatable arrays not access this memory?
3. Is there a way to find out the maximum stack size used by a program? This might give us insight into the stack size our program requires for different problem sizes.
I tried the same program with a 500MB stack on the other new Core Duo laptops we got in our office, and they exhibit the same behavior.
Thanks,
Ravi
6 Replies
The processor type is not relevant here. More important is the amount of RAM and disk space available for the swap file. When you specify 500MB for stack, the linker reserves that much virtual memory for the stack and it can't be used for any other purpose, including allocatable arrays.
I am not aware of any way to determine programmatically the maximum stack space a program uses.
Steve,
Thanks for the reply. I agree that processor type should not be an issue but I can't understand what is going wrong here.
My laptop has 2 GB of RAM and a 100 GB hard disk (with more than 75GB free), and the paging file size is set to 2046MB. I think memory and disk space are more than enough. I have run this program on other machines with much less RAM (512MB). What else could be the reason?
When run in debug mode, I get these two unusual messages in the IDE debug window:
1. First-chance exception in Program.exe (NTDLL.DLL): 0xC0000005: Access Violation
2. First-chance exception in Program.exe (MPICH2D.DLL): 0xC0000005: Access Violation
and the program aborts with the message forrtl: severe (157): Program Exception - access violation.
Can you provide some direction on how to find the cause of this problem?
Ravi
That's probably not a stack issue. Are you building MPICH yourself? If so, and you build the DLL with debug information, you can have the debugger stop at the point of error and see what is going on. Does the problem happen with serial code? Maybe it is showing up only when two or more threads are in use and revealing a coding error.
I am not building MPICH myself. I used the mpich2-1.0.1-win32 installer to install it. The MPICH libraries that I am using are mpich2d.lib and fmpich2sd.lib, and I think the 'd' here stands for the debug version. I tried using mpich2.lib and fmpich2s.lib (optimized versions??) and it still does not run.
Although the code has MPICH calls in it, I am actually running it in single-processor mode.
Ravi
I have just installed the latest version of mpich2 (1.0.3) and linked the new libraries mpich2.lib and fmpich2s.lib (this version does not provide mpich2d.lib and fmpich2sd.lib). The program still does not run, but I get a more detailed message. Here it is:
job aborted:
process: node: exit code: error message:
0: localhost: 13: Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(225)........: Initialization failed
MPID_Init(81)................: channel initialization failed
MPIDI_CH3_Init(35)...........:
MPIDI_CH3I_Progress_init(305):
MPIDU_Sock_listen(903).......: unable to create a socket
easy_create(140).............: WSASocket failed, A system call that should never
fail has failed. (errno 10107)
Any suggestions on this? Also, the program runs if I specify a smaller stack size (say 300MB).
Ravi