JupyterHub
by Elena Bianco 16/07/2024
JupyterLab is a web-based interactive interface for notebooks and code (e.g., Python, R, Julia) that can be used to run interactive Python jobs on the OPH cluster.
The Jupyter environment can be used on the OPH cluster by creating a specific container for its installation and configuration. JupyterHub can be launched on a cluster node using the job queue system, and the user can work interactively with Python notebooks using the allocated resources.
1. Create a container with Apptainer
The user can either create a new container from scratch or build it from an existing Docker image, as done here. The predefined Docker image only includes a baseline deployment of JupyterHub, but additional configuration can be specified by the user.
apptainer build --sandbox container_name/ docker://jupyterhub/jupyterhub
The --sandbox option creates an editable directory that mimics the Ubuntu Linux file structure, containing the usual directories (home, tmp, media, etc.). Replace container_name with the desired container name.
Find the directories within the sandbox that are not writable and make them writable:
find container_name -type d -not -writable -exec chmod +w {} +
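Before changing permissions on a real sandbox, it can help to preview which directories the command will touch. The following is a minimal sketch on a throwaway directory tree; the name demo_sandbox and the directories inside it are hypothetical stand-ins for container_name, and the -not and -writable tests assume GNU find, as in the command above.

```shell
# Sketch: preview and fix non-writable directories on a scratch tree.
# "demo_sandbox" is a hypothetical stand-in for container_name.
workdir=$(mktemp -d)
sandbox="$workdir/demo_sandbox"
mkdir -p "$sandbox/usr/lib" "$sandbox/etc"
chmod a-w "$sandbox/usr/lib"   # simulate a read-only directory

# 1) Dry run: only list the directories that would be changed.
find "$sandbox" -type d -not -writable -print

# 2) Apply the same fix as in the guide.
find "$sandbox" -type d -not -writable -exec chmod +w {} +

# 3) Count what is still non-writable; 0 means the fix worked.
remaining=$(find "$sandbox" -type d -not -writable | wc -l | tr -d ' ')
echo "non-writable directories left: $remaining"

# Remove the scratch tree.
rm -rf "$workdir"
```

On the real sandbox, the dry run is the same find command with -print in place of -exec chmod +w {} +.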
Enter the container shell (from the path where the container is located).
apptainer shell --writable container_name
Check the version of the installed packages (e.g., JupyterHub):
apptainer exec --writable container_name jupyterhub --version
Packages can be updated or installed manually within the container, if they are not already available in the baseline configuration.
pip install jupyter notebook jupyterlab
Use 'exit' to go back to the main shell.
2. Run Jupyter from containers
When using Jupyter for data analysis or any other task requiring computational resources, the user must first allocate resources on a compute node before launching the Hub. This can be done with the salloc command from the cluster login node, using the job queue system (the line below can be adjusted to the specific job request):
salloc -N 1 --cpus-per-task=1 --time=02:00:00 --mem=2G --constraint=blade
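Since the request usually needs adjusting per job, one option is to keep the values in variables and assemble the command from them. This is only a sketch: the variable names are illustrative, the values are the ones used above, and the script prints the command for review rather than submitting it.

```shell
# Sketch: assemble the salloc request from variables so it is easy to adjust.
# Variable names are illustrative; the values are those used in the guide.
NODES=1
CPUS_PER_TASK=1
WALLTIME=02:00:00
MEMORY=2G
CONSTRAINT=blade

salloc_cmd="salloc -N $NODES --cpus-per-task=$CPUS_PER_TASK --time=$WALLTIME --mem=$MEMORY --constraint=$CONSTRAINT"

# Print the command for review; run it on the login node once it looks right.
echo "$salloc_cmd"
```

For example, setting CPUS_PER_TASK=4 and MEMORY=8G would request a larger allocation while leaving the rest of the line unchanged.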
Once the resources have been allocated, enter the container shell and launch JupyterLab from inside it:
apptainer shell --writable container_name
jupyter lab --ip=0.0.0.0 --port=8000 --no-browser
In a separate terminal window, log in to the cluster frontend and then connect to the allocated node (the node name is displayed after running the salloc command). Here, 2010 is chosen as the local port.
ssh -N -L 2010:localhost:8000 username@node
In a new terminal window on the local machine, connect to the cluster with:
ssh -N -L 2010:localhost:8000 username@137.204.50.71
Here, 2010 is the chosen local port and 8000 is the remote port specified in the first terminal window.
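The argument to -L has the form local_port:host:remote_port, and -N tells ssh to open the tunnel without running a remote command. As a small sketch, the command for the local machine can be assembled from variables so that the ports are easy to change; the variable names are illustrative, and username remains a placeholder for the actual cluster account.

```shell
# Sketch: build the port-forwarding command for the local machine.
# Variable names are illustrative; "username" is a placeholder account name.
LOCAL_PORT=2010          # port opened on the local machine
REMOTE_PORT=8000         # port passed to "jupyter lab --port"
CLUSTER_USER=username
FRONTEND=137.204.50.71   # cluster frontend address used in the guide

# -L takes local_port:host:remote_port
forward_spec="$LOCAL_PORT:localhost:$REMOTE_PORT"
tunnel_cmd="ssh -N -L $forward_spec $CLUSTER_USER@$FRONTEND"

# Print the command and the matching browser address for review.
echo "$tunnel_cmd"
echo "Then open http://localhost:$LOCAL_PORT in the browser"
```

If port 2010 is already in use locally, changing LOCAL_PORT alone is enough; the browser address changes with it.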
Finally, navigate to localhost:2010 in the browser on the local computer; the Jupyter interface should appear. To authenticate, enter the token printed in the first terminal window.
Use Ctrl+C to end the connection and go back to the Apptainer shell.