What I mean is that I can use `allocate` to bring the system down.
For example, my machine has 62 GB of memory.
I declare an allocatable array in Fortran. If I try to allocate 62 GB for it, the allocation fails; only when I allocate 60 GB or less does it succeed.
But `allocate` does not actually commit physical memory to the array until I initialize it.
So I declared two allocatable arrays and allocated 60 GB for each. Both allocations succeeded. Then I assigned values to them one by one, and my computer went down.
I think it's quite a bad design. It means anyone can take down the system.
The computer may still be running, very slowly due to page faults. This can occur when your Virtual Memory requirements exceed the Physical Memory capacity. Some programs can run fine in this situation when they tend to process data in proximity of data that is already loaded into RAM. For example, manipulating a large matrix using tiles (small sections of the large array). If you are trying to perform a matrix multiply of these overly large arrays, then you must use a tiling strategy by partitioning the large matrices into smaller pieces (tiles), perform a portion of the matrix multiply with the tiles, then accumulate (reduce) the data. See:
(or Google search: tiled matrix multiplication)
Even if you are not performing matrix multiplication, the tiling strategy used in these examples is what you should consider using when your Virtual Memory requirements exceed your Physical Memory capacity. Tiling is also helpful for improving cache hit ratios even when all of your program + data fits in physical memory.
You might try starting a second terminal window first, which you can use to kill the process running the problem program. If you really need to manipulate this amount of data, figure out how to organize the program so that it works with smaller pieces of the total data.
Jim Dempsey
I get your idea. It must be a useful strategy.
In my situation, I ran this test code on my slave node, and now I can't log in to it. What worries me is that if someone else on the cluster accidentally writes code like mine, they will bring the node down, and no one can fix it without restarting the machine.
And if I can't allocate 62 GB on a 62 GB node, why can I allocate two 60 GB arrays? Doesn't Fortran keep a count of how much memory has already been allocated?
I typically use my Linux system for engineering studies. I do not perform system management functions.
The problem you are describing is a system management problem. Most operating systems have system management settings that you (the system manager) can control to set quotas. One such quota that is typically available is the maximum amount of virtual memory any process can use.
>> I run this test code on my slave node, and now I can't login it.
It may be that the system administration software, or the system administrator, noted that your program (your account) caused problems and suspended your privileges to use the system. Contact the system administrator to work out a resolution.
Your application should not bring down a cluster node. It sounds like your system administrator has not set up the node properly. What should have happened is that your application was terminated. On Linux (Unix) there is typically an out-of-memory handler, the OOM (Out Of Memory) killer, which is designed to kill processes when memory is exhausted or quotas are exceeded. It is likely that it is not running or is not configured properly.
Jim Dempsey
