This training will introduce you to the key concepts and tools for parallelization. We will discuss how to submit jobs to run in parallel on Savio, the campus high-performance computing cluster, and how to monitor those jobs to check that they are doing what you expect. We will show examples of submitting parallel jobs that use external software and of running many small jobs in parallel within one large job that spans one or more nodes. We will also discuss some of the tools in Python, R, and MATLAB that allow you to run code in parallel.
Note that this is not a training on how to write parallel code (e.g., using MPI or OpenMP). We will assume that you know (or will learn) how to write parallel code in the software of your choice or are using software that can run in parallel (e.g., some bioinformatics software).
Topics will include:
- General principles and terminology related to parallel processing
- Submitting and monitoring parallel jobs on Savio
- Parallelization using existing software (e.g., bioinformatics software or simulation software from the physical sciences/engineering)
- Tools to run many small jobs in parallel on one or more nodes (ht_helper, GNU parallel)
- Parallelization in Python, R, and MATLAB