Tutorials and examples

Read the section on Allocating Resources before reading this page. Jobs can be submitted from any submit node – currently they are lunchbox.stat.wisc.edu and jetstar.stat.wisc.edu.

Interactive Jobs

Run an interactive MATLAB job on a node with 2 gigs of memory

Never run computation on the submit nodes. Instead, run programs on the nodes of the cluster and work interactively in order to test or debug your work. At the command line on the submit node, the command to use is srun.

srun --pty --mem-per-cpu=2000M /workspace/software/bin/matlab

Run an interactive R session on a node and use 2 gigs of memory


srun --pty --mem-per-cpu=2000M /workspace/software/bin/R

All interactive jobs run with only 50 megabytes of memory (the default) unless otherwise specified with the --mem-per-cpu option.

Compile a Program and use 4 CPUs

You can open a shell on a node in order to perform compiling tasks. Never compile on the submit node. Large compiles can also be submitted as regular batch jobs using sbatch. When you open a bash shell on a node you will still have access to your /workspace/[user] directory for files and data. Lots of luck compiling your software.

srun --pty -n 4 /bin/bash
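
Once the shell opens on a compute node, the 4 allocated CPUs can drive a parallel build. A minimal sketch, assuming a make-based project in your workspace (the directory name is a placeholder):

cd /workspace/[user]/my-project
make -j 4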

Batch Jobs (Submit to the queue)

R

Submit a single R job to the default “debug” partition with default time and memory allocations


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
module load R/R-3.5.2
R CMD BATCH --no-save test.R output.Rout

The example above will submit to the “debug” partition since we did not specify #SBATCH -p and will have a hard time limit of 4 days. The one CPU that is allocated will have only 50 megabytes of memory to work with.
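
To submit any of the batch scripts on this page, save the lines to a file and hand the file to sbatch from a submit node; the file name submit.sh below is only a placeholder. You can then watch the job in the queue with squeue.

sbatch submit.sh
squeue -u $USER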

Submit a single R job and specify your own R library for packages


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH --mem-per-cpu=800M
module load R/R-3.5.2
export R_LIBS=/workspace/[user]/[myRlibrary]
R CMD BATCH --no-save test.R output.Rout

The example above sets the environment variable R_LIBS using the export command. You can set any environment variables you wish in your submit scripts. Order matters: the module load line may itself set environment variables, so place your export lines carefully so that you do not overwrite something you expect to already be set. The job is set to run with 800 MB of memory.
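
If you want your own library to take precedence without hiding anything the module may already have put in R_LIBS, one option is to prepend rather than overwrite. A minimal sketch, with the library path as a placeholder:

module load R/R-3.5.2
export R_LIBS=/workspace/[user]/[myRlibrary]${R_LIBS:+:$R_LIBS}   # prepend our library, keep anything already set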

Submit an R job that is programmed to spawn 4 total R processes and run for 5 days


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p long
#SBATCH -t 5-00:00:00
#SBATCH -n 4
#SBATCH --mem-per-cpu=2600M
module load R/R-3.5.2
R CMD BATCH --no-save test.R

The example above assigns 4 CPUs with -n 4 and the time to run is 5 days with -t 5-00:00:00. It also sets the partition to ‘long’ using -p long because we need to run longer than 4 days (‘short’ partition time limit). It reserves 2.6 gigabytes of memory, the current allowed maximum, for each CPU, for a total of 10.4 gigs of memory for the whole job.

Submit a single R job that is programmed to use 8 threads


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p short
#SBATCH -t 24:00:00 # 24 hours
#SBATCH --cpus-per-task=8
#SBATCH -n 1
#SBATCH --mem-per-cpu=1000M
module load R/R-3.5.2
R CMD BATCH --no-save test.R

The example above requests a single task with -n 1 and assigns 8 CPUs to that task with --cpus-per-task=8 so that each thread can run on its own CPU. If we did not set --cpus-per-task=8 then all 8 threads would attempt to run on the single allocated CPU, resulting in 800% usage of that one processor, which would be slow. Always remember to set --mem-per-cpu.

Submit an R job that runs 48 single-threaded processes using the MKL library


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p short
#SBATCH -t 2-00:00:00
#SBATCH --cpus-per-task=48
#SBATCH --mem-per-cpu=900M
module load R/R-3.5.2
R CMD BATCH --no-save test.R

In the example above we ask for 48 CPUs with --cpus-per-task=48, and there is only one task being executed. Remember that there must be something in your R code that actually generates 48 R processes, such as a loop or a parallel package you are using. If your job really only runs one R process, asking for 48 CPUs does nothing except reserve 48 CPUs while using one for processing. You MUST have a need for all 48 based on what you invoke in R. 48 is a special number because it is the maximum number of cores on any node in the cluster. You cannot exceed this number unless you are running some type of MPI or other parallel application.
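
One way to keep the R side in sync with the allocation is to hand the CPU count to R rather than hard-coding 48. A sketch, assuming your R code sizes its worker pool from the parallel package's mc.cores option (which is initialized from the MC_CORES environment variable) and should stay single threaded under MKL:

module load R/R-3.5.2
export MC_CORES=$SLURM_CPUS_PER_TASK   # worker count follows the allocation
export MKL_NUM_THREADS=1               # keep each worker single threaded under MKL
R CMD BATCH --no-save test.R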

Force a multi-threaded R job to use 8 threads per task


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -t 2:00:00
#SBATCH -n 5
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=2500M
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
module load R/R-3.5.2
R CMD BATCH --no-save test.R

In the example above we reserve 5 tasks with -n 5 because our R code generates 5 R processes/tasks, but because we are using a version of R that was built with OpenMP, and because our job will exhibit multi-threaded execution at some point, each process will attempt to expand to the full 48 threads (the node maximum) during multi-threaded portions of the run. We could set --cpus-per-task=48 in order to reserve all the threads/CPUs on that node, but that is usually excessive and bad etiquette. Instead, we can force each R process to use 8 threads (or anything less than 48) by exporting OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK. The total allocation of this submit script is 40 CPUs (-n times --cpus-per-task).

Job Arrays

Job arrays allow you to submit many iterations of the same job. Rather than typing sbatch submit.sh 100 times, a job array lets you run sbatch once and have it launch many jobs into the queue. There are ways to specify unique input files for each job in the array using BASH scripting and environment variables.

Submit an array of Julia jobs to the queue


#!/bin/bash
#SBATCH --mail-type=ALL
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH -o snaq/onesnaq_%a.log
#SBATCH -J onesnaq
#SBATCH --array=0-239
#SBATCH -p long
#SBATCH --mem-per-cpu=500M
# use Julia packages in /workspace/, not defaults in ~/.julia/ (on AFS):
export JULIA_PKGDIR="/workspace/[user]/.julia"
# launch Julia script, using Julia in /workspace/ again, with full paths:
echo "slurm task ID = $SLURM_ARRAY_TASK_ID"
/workspace/software/julia-0.5.0/julia /workspace/[user]/timetest/onesnaq.jl 1 30 $SLURM_ARRAY_TASK_ID

The example above does not load anything with the module load command; instead we specify full directory paths to our programs and files, both to tell SLURM explicitly where things are and for the clarity of this example. First, we set up a separate output file for each job in the array with -o snaq/onesnaq_%a.log. Note that the directory “snaq/” must exist in your working directory, so create it first. Next we name the job with -J onesnaq. Then we specify the jobs in the array with --array=0-239, for a job array size of 240 jobs. We set a special location to look for Julia packages in our working directory with export JULIA_PKGDIR="/workspace/[user]/.julia". The line echo "slurm task ID = $SLURM_ARRAY_TASK_ID" writes the job array index to the log for monitoring/troubleshooting purposes. Finally, the actual execution command takes the $SLURM_ARRAY_TASK_ID environment variable as an input to set an array of parameter values. You can use SLURM’s list of environment variables that exist at runtime to do lots of useful tasks; see a full list by running srun printenv | grep SLURM at the command line. The values these variables take depend on the directives you set in your SLURM batch submit file.

Launching an array of R jobs with unique inputs


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p short
#SBATCH -t 2-00:00:00
#SBATCH --array=1-250
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=900M
module load R/R-3.5.2
R CMD BATCH --no-save test.R myData.$SLURM_ARRAY_TASK_ID

This simple case has test.R looking for a data file in the same directory with names like myData.1, myData.2, myData.3, myData.4, and so on, up to the end of the array range, in this case 250. We can use the $SLURM_ARRAY_TASK_ID environment variable, set by SLURM when using an array, to increment through the file names. This requires you to name your files with sequential numbers. This is the simplest scenario involving multiple data files.
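
If 250 simultaneous tasks would swamp the queue, SLURM’s array syntax also accepts a throttle after a % sign. A sketch, with the limit of 10 chosen only for illustration:

#SBATCH --array=1-250%10   # run at most 10 array tasks at any one time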

Launching an array of R jobs with unique inputs read from a single file


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p short
#SBATCH -t 5:00:00
#SBATCH --array=0-24
#SBATCH --mem-per-cpu=2048M
module load R/R-3.5.2
ROW=`expr $SLURM_ARRAY_TASK_ID + 1`
export TOKENID=`awk -v line=$ROW '{if(NR==line)print $1}' twtdf2.txt`
export RANGESTART=`awk -v line=$ROW '{if(NR==line)print $2}' twtdf2.txt`
export RANGEEND=`awk -v line=$ROW '{if(NR==line)print $3}' twtdf2.txt`
Rscript 3ktweets2.R $TOKENID $RANGESTART $RANGEEND

This example requires a bit more bash scripting to set up some new environment variables. The file ‘twtdf2.txt’ has many rows of three columns of numbers, for example 227 27301 27400. We want our R code to step through the file one row at a time and assign each number to a meaningful variable that the R code takes as input. We use the awk command to read out individual columns, and we make the environment variable ROW increment using $SLURM_ARRAY_TASK_ID. We initially offset ROW to start at 1 since our array started at 0 and awk numbers rows starting at 1. This could be avoided by simply starting the array at 1 instead of 0, or, if you are more clever, by using $SLURM_ARRAY_TASK_ID directly in your awk command so it knows which row to pull. There are many ways to use $SLURM_ARRAY_TASK_ID, so you can get very creative.
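
An equivalent sketch that pulls the whole row in one pass, using sed to print line $ROW and read to split it into the three variables:

ROW=`expr $SLURM_ARRAY_TASK_ID + 1`
read TOKENID RANGESTART RANGEEND <<< "$(sed -n "${ROW}p" twtdf2.txt)"
Rscript 3ktweets2.R $TOKENID $RANGESTART $RANGEEND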

Python Jobs and Environments

Simple Python job using conda


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p long
#SBATCH --mem-per-cpu=1800M
module load python/miniconda
source activate myEnv
python code.py

Python environments can become complicated when using very specific packages and specific versions of those packages. In order to ensure that your python application runs successfully each and every time, and does not interfere with the package and python versions needed by other users of the cluster, we use conda to manage individual user environments. The example above loads the conda path information with module load python/miniconda, and then you activate your pre-built environment with source activate myEnv, where myEnv is the name of the pre-built conda environment containing your specific python and packages/modules. Finally, python code.py executes your code, where code.py is the file containing your python code.

You must pre-build your conda environment. Please see the Software and Paths section of this user’s guide for more information. While on the submit node, create your environment with module load python/miniconda and then run the command conda create -p /workspace/[username]/[some_env_name] python=x.x [some_package], where -p is the path to your working directory plus the name of your environment, followed by a specific python version and a space-delimited list of any packages you’ll need. conda will create this directory for you and use the directory name as your conda environment name. You can learn more about conda create and how to install various versions of python and packages by reading the online documentation from Anaconda: https://conda.io/docs/_downloads/conda-cheatsheet.pdf. You can also use module load python/anaconda and create environments that import all python packages, which is large (conda create -p /workspace/[username]/myEnvironment anaconda).
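
A concrete sketch of the create step, run once on the submit node; the environment name, python version, and package list are all placeholders:

module load python/miniconda
conda create -p /workspace/[username]/myEnv python=3.7 numpy pandas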

MPI Jobs (HPC)

Run a simple MPI job using 96 cores

NOTE: openmpi jobs will run across the 40 Gb/s InfiniBand network instead of the traditional 10 Gb/s TCP stack, greatly increasing communication speed and bandwidth for your HPC job.

#!/bin/bash
#SBATCH -p long
#SBATCH -n 96
#SBATCH --ntasks-per-node=48
#SBATCH -N 2
#SBATCH --qos=unlimitedcpu
#SBATCH --mem-per-cpu=1800M
module load openmpi/mpi-4.0.0
ulimit -l unlimited
mpirun mpi_hello_world

NOTE: You must have your program compiled with the mpicc provided on the cluster. You can use module load openmpi/mpi-4.0.0 at the command line (as opposed to in the SLURM submit script) to set the MPICC=[path to the cluster's mpicc] environment variable. Lots of luck compiling your MPI program first.
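
A sketch of the compile step, run in an interactive shell on a compute node; the source file name is a placeholder:

srun --pty /bin/bash                          # never compile on the submit node
module load openmpi/mpi-4.0.0                 # sets the MPICC environment variable
$MPICC -o mpi_hello_world mpi_hello_world.c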

The above example requests 96 cores with -n 96, which is more cores than are available on any one node (there are 48 “CPUs”, i.e. 24 cores with 2 threads each, on every node; see the section titled Allocating Resources for more information on how SLURM defines cores, CPUs, and threads). It splits the 96 tasks evenly across the two nodes requested with -N 2 by setting --ntasks-per-node=48. We have to set the “Quality of Service” to --qos=unlimitedcpu in order to get more CPUs than the hard limit. You must request access to the unlimitedcpu QoS from the lab. Only use the unlimitedcpu QoS when running MPI jobs and after consulting with the lab; any abuse of the unlimitedcpu QoS will result in your job being canceled. Finally, we set the amount of memory needed for each CPU allocated to the task, in this case 1.8 gigs.

Requesting GPUs when using CUDA tools

Submit a Python job that requires 1 GPU device


#!/bin/bash
#SBATCH --mail-user=user@stat.wisc.edu
#SBATCH --mail-type=ALL
#SBATCH -p gpu
#SBATCH --gres=gpu:prod:1
#SBATCH --mem=2G
#SBATCH -D /workspace/user
#SBATCH -c 4
mycondaenv/bin/python code.py --outpath output --seed 0 --cuda 0 --numworkers 3

In this example, we set the partition with -p to gpu to get the GPU node on the cluster. We request 1 production (as opposed to debug) GPU with the directive --gres=gpu:prod:1. To grab a single debug GPU, specify debug as the GPU type with --gres=gpu:debug:1. Total memory for this job is 2 gigabytes, --mem=2G. We set our working directory with -D so that we can use relative paths in the execution line of the script. Lastly, we request 4 CPUs to go with our GPU job for subprocessing: our Python job’s --numworkers is set to 3, and there is one parent process along with them, for a total of 4 CPU processes. Be mindful of CPU requests and do not request too few, or too many, for your GPU job; if one GPU job uses all the CPUs on the node, no other GPU job can run there. There are currently 8 NVIDIA RTX 2080ti devices in total.

Depending on what tools you are using to work with CUDA and GPUs, your syntax for the actual execution may differ. In this example, as with many other tools, the --cuda option is set to 0, but this is not an absolute number that refers to a physical device. Instead, you are asking for CUDA device ‘0’, which will be the first GPU device available to your job and could be any of GPU 0 through 7.
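
To see which physical device your job was actually handed, one option is an interactive session on the gpu partition. This sketch reuses the partition and gres names above; whether CUDA_VISIBLE_DEVICES is populated depends on the cluster's SLURM GPU configuration:

srun -p gpu --gres=gpu:debug:1 --mem=2G --pty /bin/bash
nvidia-smi                      # lists the GPU(s) visible to this job
echo $CUDA_VISIBLE_DEVICES      # device index/indices SLURM mapped to the allocation, if set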

If you have specific GPU questions please consult with the lab to get your job running efficiently.