We are setting up an HPC cluster with about 25 Dell PowerEdge 720 servers, each equipped with 172 GB of RAM and 24 Intel cores running at 2.4 GHz. Every node is connected to a Gigabit Ethernet switch and to a 56 Gbps Mellanox InfiniBand switch that provides storage access.
While we are planning for conventional HPC use (CentOS 7.1, Slurm, etc.), several researchers have asked us to reserve some servers so they can set up small testbeds involving several VMs (possibly scattered across more than one physical server), connected as they define by means of virtual networking and the physical Ethernet switch. This raises two questions for us:
- Is there a ready-to-use open source solution that lets us manage how different researchers use the available resources (CPU, RAM, network bandwidth), enforce a strict schedule for the researchers involved, and give researchers some freedom to set up their own environment (VMs, VM connections, etc.)? They need to be able to access the VMs and run the interactive GUI frontends of their preferred tools. This resembles a cloud profile, but we need to allocate a specific timetable of resource usage per user in advance so that the resources can be shared.
- Are there open source solutions that let us dynamically allocate physical servers between conventional HPC use and the profile described in the previous point?
We would very much appreciate any pointers toward solutions for these unusual requirements.