Dask is a flexible library for parallel computing in Python that follows the familiar syntax of the existing PyData ecosystem. In this session, Richard Pelgrim will take you through the first steps so you can start using it.
What You’ll Learn
- Overview of Dask: How it works and when to use it.
- Dask Delayed: How to parallelize existing Python code and your custom algorithms.
- Schedulers: Single Machine vs Distributed, and the Dashboard.
- From pandas to Dask: How to manipulate bigger-than-memory DataFrames using Dask.
- Dask-ML: Scalable machine learning using Dask.
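
As a taste of the Dask Delayed topic above, here is a minimal sketch of how existing Python functions can be wrapped so that Dask runs them in parallel. The functions `inc` and `add` are made up for illustration and are not taken from the session material; the example assumes Dask is installed and uses the local threaded scheduler.

```python
from dask import delayed


def inc(x):
    # Ordinary Python function (hypothetical example)
    return x + 1


def add(a, b):
    # Ordinary Python function (hypothetical example)
    return a + b


# Wrapping a call in delayed() records it in a task graph instead of running it
a = delayed(inc)(1)
b = delayed(inc)(2)
total = delayed(add)(a, b)

# Nothing has run yet; compute() executes the graph.
# The two inc() calls are independent, so the scheduler may run them in parallel.
result = total.compute(scheduler="threads")
print(result)
```

Swapping `scheduler="threads"` for a distributed cluster is the kind of change the Schedulers part of the session covers.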