Dask lets you run parallel computations on one or more machines (or Savio nodes), including computations on large datasets distributed across multiple Savio nodes. We'll cover the different ways to set up and run parallel computations using Dask.
Topics will include:
- Parallelizing loops using delayed evaluation
- Distributed data structures (including parallel I/O)
- Parallelization on one or more machines
- Using Dask in the context of SLURM job submissions
- Random number generation
- Nested parallelization, memory use, and load-balancing
After the training, we'll have an informal get-together with snacks and drinks.