Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

login-2 (Ubuntu 18) vs compute (Ubuntu 20) node OS version differences

User01
New Contributor II

I was surprised to see that the login node (login-2) on DevCloud is running a different Linux distribution than the compute nodes.

 

login-2:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

 

s001-n004:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal

 

The documentation at Queue Management | Intel® DevCloud states:

 

"When you access the Intel DevCloud for oneAPI Projects through SSH, you will be connected to a login node with hostname login-2. On this node you can edit code and compile applications."

 

I suppose compiling on Ubuntu 18.04 will generally produce code that runs on Ubuntu 20.04. For code compiled on Ubuntu 20.04, it would not be expected to run on Ubuntu 18.04. Running two different releases seems like an odd setup.

  • Why is the login node running an older Linux distribution than the compute nodes?
  • Is DevCloud designed with the intent that most editing, compiling, and debugging would be done on a login node or on a compute node in an interactive job?
  • In addition, I would have expected DevCloud to be running Intel's own Clear Linux distribution. Ubuntu is fine, but I am curious why Intel would choose it over Clear Linux for this cluster. If any background is available on that design decision, it would be fun to know.
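
The compatibility concern above can be sketched as a glibc version check. This is only a rough heuristic, not a guarantee: it assumes the stock glibc versions of the two releases (2.27 on Ubuntu 18.04 and 2.31 on Ubuntu 20.04), and it ignores every other shared library a binary may depend on. The helper names `parse_version` and `likely_runs` are hypothetical and used here for illustration only.

```python
import platform

# Assumption: stock glibc versions for the two Ubuntu releases in this thread.
GLIBC_BY_RELEASE = {"18.04": "2.27", "20.04": "2.31"}


def parse_version(text):
    """Turn '2.31' into (2, 31) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def likely_runs(build_glibc, target_glibc):
    """Heuristic: a dynamically linked binary generally needs a target glibc
    that is at least as new as the glibc it was built against."""
    return parse_version(target_glibc) >= parse_version(build_glibc)


if __name__ == "__main__":
    # Report the C library of the node this script runs on.
    lib, version = platform.libc_ver()
    print("This node:", lib or "unknown libc", version or "unknown version")

    old, new = GLIBC_BY_RELEASE["18.04"], GLIBC_BY_RELEASE["20.04"]
    print("Built on 18.04, run on 20.04:", likely_runs(old, new))
    print("Built on 20.04, run on 18.04:", likely_runs(new, old))
```

Running the script on both the login node and a compute node shows which C library each one actually provides, which is more reliable than the assumed table.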
1 Solution
AbhijeetJ_Intel
Moderator

Hi,

Thank you for posting in Intel Communities.


Your observations are correct.


The majority of the compute nodes run Ubuntu 20.04 and cater to HPC and AI users.

Roughly 15 or more compute nodes run Ubuntu 18.04 to support legacy Arria FPGA cards, which are not forward compatible with Ubuntu 20.04.


As we deprecate the old FPGA cards, newer FPGA cards such as Agilex will run on Ubuntu 20.04, and we will retire the legacy OS.


DevCloud was designed with two modes in mind: batch (primary) and interactive (secondary). With 8000 users worldwide sharing compute resources, compute tasks are handled very efficiently in batch mode.


Editing, compiling, and debugging can be done via either Jupyter Notebook or VS Code. After editing, larger compile jobs can be sent to compute nodes via batch submission.


Note that the login node is mainly for logging in and submitting batch jobs, NOT for compiling or debugging. Those who want compute node interactivity can use the interactive mode.


Ubuntu, Red Hat, SUSE, CentOS, and Clear Linux are all just different flavors of Linux. We moved from Red Hat to Ubuntu because it is well understood and supported by users and the community around the world.

At some point in time we might decide to move to SUSE, depending on whether HPC or AI users want it.


Regards

Abhijeet


User01
New Contributor II

Thanks for the clarification and background. That all makes perfect sense. The intended use of the login vs compute nodes is documented at https://devcloud.intel.com/oneapi/documentation/advanced-queue/ with the following:

 

"When you access the Intel DevCloud for oneAPI Projects through SSH, you will be connected to a login node with hostname login-2. On this node you can edit code and compile applications."

 

The "this node" reference in the second sentence refers to "login-2." Perhaps that documentation needs to be cleaned up. I am happy to use an interactive batch job on a compute node for compiling and debugging.

 

The documentation also reports:

 

"However, to run computational applications, you will submit jobs to a queue for execution on compute nodes. In this manner, you are sharing computing resources with your peers, however, other users cannot see your data or applications."

 

I interpret this to indicate that when I am on a compute node within an interactive or non-interactive batch job, the job does not have exclusive use of the node. This has important consequences for performance analysis. Should I expect node performance to vary based on the load from other users' processes that I cannot see? Are there any best practices for doing performance analysis on DevCloud to account for the invisible variable load that might be occurring from other users' jobs on the shared compute nodes?
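
One generic way to check for this kind of variation, independent of how DevCloud schedules jobs, is to repeat the same workload several times and look at the spread of the timings. The following is a minimal sketch; `time_runs`, `summarize`, and the sample workload are hypothetical names for illustration, and a large relative spread only hints at interference (it can also come from caching, frequency scaling, or the workload itself).

```python
import statistics
import time


def time_runs(workload, repeats=10):
    """Run the workload several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report min, median, max, and the (max - min) range relative to the median."""
    median = statistics.median(durations)
    spread = (max(durations) - min(durations)) / median if median > 0 else 0.0
    return {
        "min": min(durations),
        "median": median,
        "max": max(durations),
        "relative_spread": spread,
    }


if __name__ == "__main__":
    # Hypothetical CPU-bound workload; replace with the code under study.
    def sample_workload():
        sum(i * i for i in range(200_000))

    print(summarize(time_runs(sample_workload, repeats=10)))
```

Reporting the median together with the minimum and maximum, rather than a single run, makes it easier to tell a stable node from a noisy one.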

AbhijeetJ_Intel
Moderator

Hi,


Thank you for the documentation feedback.

We have forwarded it to the relevant team.

 

>> Should I expect node performance to vary based on the load from other users' processes that I cannot see?


You should not expect node performance to vary based on other users' load, because the compute node is dedicated: once you submit a job using the qsub command, you have exclusive access to it.


Regards

Abhijeet


AbhijeetJ_Intel
Moderator

Hi,


Have all your queries been answered? Do you need any more help?

Please let me know if we can go ahead and close this case.

Regards


AbhijeetJ_Intel
Moderator

Hi,

 

We assume that your issue is resolved.

If you need any additional information, please post a new question, as this thread will no longer be monitored by Intel.


Regards

Abhijeet

