We have a Platform Computing cluster made up of 2 Dell R710 head nodes: 1 master node for administration and provisioning, and another for handling I/O to the SAN that we have attached via 4Gb Fibre Channel. The compute nodes consist of 16 Dell M610 blades, which are all diskless and provisioned through Platform Cluster Manager. All 18 systems are connected via InfiniBand and GigE. They mount home folders and a model_output folder hosted on the SAN through NFS mounts over IP over IB. There is also an NFS mount of a shared folder on the administration and provisioning node, which is also done through IP over IB. We would like to install ICTCE in such a way that any user logged into the master node can use the compilers and other Toolkit resources. The researchers will mostly be using the linear algebra libraries and the Fortran compiler. The root account already has passwordless ssh access to any node in the cluster from the master node. My questions are:
Should ICTCE be installed in the shared folder on the master node? Most of the compilation will be done directly on the master node and then the execution will be handled by Platform LSF to run on the diskless compute nodes.
Does ICTCE have to write anything to the diskless compute nodes? If so, does anyone know how I can integrate this into the diskless image so that I don't lose these files upon reboot?
Does passwordless ssh need to be set up for all users or only for the user performing the installation? If it is needed for all users, does anyone know how I can do this such that the configuration survives a reboot of the compute nodes?
Are there any other concerns that I need to be aware of for this particular environment?
You may install ICTCE to a shared folder on the master node. Just keep in mind that the shared libraries will then be loaded from the network share at application run time, so the share must be mounted on every compute node that runs a job.
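One way to confirm that a built application finds its shared libraries on a compute node is to run ldd against it there. This is only a minimal sketch, assuming a Linux node with ldd available; the function name missing_libs is made up for illustration and is not part of ICTCE.

```shell
# Hypothetical helper: list the shared libraries that the dynamic
# linker cannot resolve for the given binary on the current node.
missing_libs() {
    ldd "$1" 2>/dev/null | grep "not found"
}
```

Running missing_libs against your own executable on a compute node (for example through an LSF job) should print nothing if every library resolves from the NFS share.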
Users need to have passwordless ssh set up to be able to use the Intel MPI Library, so it is required for every user who runs MPI jobs, not only for the installing user. This should not be a problem if the users' home directories on the compute nodes are mounted from an NFS share. In that case, just configure ssh so that a user can log in from a node to itself without a password. For instance, with OpenSSH the ~/.ssh directory would hold the private key, with the matching public key saved in the authorized_keys file. Because these files live in the shared home directory, they survive a reboot of the diskless nodes.
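The per-user ssh setup described above might look roughly like the following. This is a sketch under assumptions: OpenSSH is in use, the home directory is the NFS-mounted one seen by every node, and an RSA key with an empty passphrase is acceptable at your site. The SSH_DIR and KEY variables are introduced here only for illustration.

```shell
# Assumption: $HOME is the NFS-shared home directory visible on all nodes.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
KEY="$SSH_DIR/id_rsa"

# sshd ignores keys if the directory is writable by others.
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Create a key pair with an empty passphrase, but only if none exists yet.
[ -f "$KEY" ] || ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEY"

# Authorize the public key once; skip it if the entry is already present.
touch "$SSH_DIR/authorized_keys"
grep -qxF "$(cat "$KEY.pub")" "$SSH_DIR/authorized_keys" \
    || cat "$KEY.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

Each user would run this once on the master node. A quick check afterwards is to run ssh with the option -o BatchMode=yes against one of the compute nodes and confirm that it logs in without prompting.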
When building the diskless image, please make sure DAPL is configured properly. Installing the latest dapl package may be a good idea.
Does this answer your questions?
You may choose between two installation modes. For Intel Cluster Studio 2011 (ICS), you may select either the "Current node" installation type or the "All cluster nodes" installation type. The selection is made through the "Change advanced options" installer menu item. The ICS (formerly ICTCE) installer will attempt to write data to the remote nodes only if you select the "All cluster nodes" installation type.
Since you have diskless systems, your choice should be the "Current node" installation type. You need to either install the software to an NFS share or incorporate it into the system image that you use for your diskless systems.