This is a multi-container Slurm cluster using docker-compose. The compose file creates named volumes for persistent storage of MySQL and Postgres data files as well as Slurm state and log directories.
The compose file will run the following containers:
- mysql
- postgres
- slurmdbd
- slurmctld
- c1 (slurmd)
- c2 (slurmd)
- rstudio1
- rstudio2
The compose file will create the following named volumes:
- etc_munge ( -> /etc/munge )
- etc_slurm ( -> /etc/slurm )
- slurm_jobdir ( -> /data )
- var_lib_mysql ( -> /var/lib/mysql )
- var_lib_postgres ( -> /var/lib/postgres )
- var_log_slurm ( -> /var/log/slurm )
- home ( -> /home )
- var_lib_rstudio_server ( -> /var/lib/rstudio-server )
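Once the cluster has been created, you can locate these volumes with the Docker CLI. Note that docker-compose prefixes volume names with the compose project name, assumed below to be slurm-docker-cluster:

```bash
# List the named volumes belonging to the compose project
docker volume ls | grep slurm-docker-cluster

# Show the host-side mountpoint of a single volume (the name prefix is an assumption)
docker volume inspect slurm-docker-cluster_etc_slurm
```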
The dependencies between the containers are shown in the image below.
A more detailed view that also includes the mounted volumes follows.
The setup uses a single Docker image named slurm-docker-cluster. You can build it directly via docker-compose:

```bash
docker-compose build
```

This builds slurm-docker-cluster using the default versions of RStudio Workbench (2024.09.0+375.pro3) and SLURM (23.11.3) on Ubuntu 22.04 LTS (Jammy).
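To confirm the build succeeded, you can list the resulting image (the image name is the one given above):

```bash
# The repository column should show slurm-docker-cluster after a successful build
docker images slurm-docker-cluster
```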
If you want to use a different Posit Workbench or SLURM version (or a different Ubuntu LTS release), set the environment variables PWB_VERSION, SLURM_VERSION, DIST and DISTNUM to your desired values, e.g.:
```bash
export PWB_VERSION="2023.09.0-daily-203.pro2"
export SLURM_VERSION="23.02.3-1"
export DIST="jammy"
export DISTNUM="2204"
```
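These are ordinary environment variables, so as an alternative to exporting them you can scope the overrides to a single build command, for example:

```bash
# Same overrides, but limited to this one invocation
PWB_VERSION="2023.09.0-daily-203.pro2" SLURM_VERSION="23.02.3-1" \
DIST="jammy" DISTNUM="2204" docker-compose build
```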
Run docker-compose to instantiate the cluster:

```bash
docker-compose up -d
```

Note: Make sure you have the environment variable RSP_LICENSE set to a valid license key for Posit Workbench.
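For example (the key below is only a placeholder; use your own license):

```bash
# Must be set to a valid Posit Workbench license key in the shell that runs docker-compose up
export RSP_LICENSE="XXXX-XXXX-XXXX-XXXX-XXXX-XXXX"
```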
Once the cluster is up and running, Posit Workbench is available at http://localhost:8788 and http://localhost:8789.
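If the pages are not reachable, a quick way to check the state of the services is:

```bash
# Show the status of all containers in the compose project
docker-compose ps

# Follow the logs of a specific service, e.g. the first Workbench container
docker-compose logs -f rstudio1
```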
Use docker-compose exec to run a bash shell on the controller container:
```bash
docker-compose exec slurmctld bash
```

From the shell, execute slurm commands, for example:
```console
[root@slurmctld /]# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up 5-00:00:00      2   idle c[1-2]
```

The slurm_jobdir named volume is mounted on each Slurm container as /data.
Therefore, to see job output files while on the controller, change to the /data directory
on the slurmctld container and then submit a job:
```console
[root@slurmctld /]# cd /data/
[root@slurmctld data]# sbatch --wrap="uptime"
Submitted batch job 2
[root@slurmctld data]# ls
slurm-2.out
```
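To inspect the job's output from the same directory (the file name matches the job id shown above):

```bash
# Print the captured stdout of job 2 (the uptime output)
cat slurm-2.out
```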
To stop and later start the cluster again:

```bash
docker-compose stop
docker-compose start
```

or, to restart it in a single step:

```bash
docker-compose restart
```

To remove all containers and volumes, run:
```bash
docker-compose down
docker volume ls | grep slurm-docker-cluster | \
  awk '{print $2}' | xargs docker volume rm
```
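Alternatively, docker-compose can remove the project's named volumes in the same step with the -v flag:

```bash
# Stop and remove containers, networks, and the named volumes declared in the compose file
docker-compose down -v
```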
