I’m pleased to announce the release of Dask version 1.1.0. This is a majorrelease with bug fixes and new features. The last release was 1.0.0 on2018-11-29.This blogpost outlines notable changes since the last release.
You can conda install Dask:
conda install dask
or pip install from PyPI:
pip install dask[complete] --upgrade
Full changelogs are available here:
A lot of work has happened over the last couple months, and we encourage peopleto look through the changelog to get a sense of the kinds of incrementalchanges that developers are working on.
There are also a few notable changes in this release that we’ll highlight here:
Both Numpy and Pandas have been evolving quickly over the last few months.We’re excited about the changes to extensibility arriving in both libraries.The Dask array/dataframe submodules have been updated to work well with theserecent changes.
In particular Dask Dataframe supports Pandas Extension arrays,meaning that it’s easier to use third party Pandas packages like CyberPandas orFletcher in parallel with Dask Dataframe.
For more information see Tom Augspurger’s post
For a while Dask array has had some high level graphs for “atop” operations(elementwise, broadcasting, transpose, tensordot, reductions), which allow forreduced overhead and task fusion on computations within this class.
y = da.exp(x + 1).T # These operations get fused to a single task
We’ve renamed atop to blockwise to be a bit more generic, and have alsostarted applying it to Dask Dataframe, which helps to reduce overheadsubstantially when doing computations with many simple operations.
This still needs to be improved to increase the class of cases where it works,but we’re already seeing nice speedups on previously unseen workloads.
The da.atop function has been deprecated in favor of da.blockwise. Thereis now also a dd.blockwise which shares a common code path.
We’re working to make Dask a bit more agnostic to the types of in-memory arrayand dataframe objects that it can manipulate. Rather than having Dask Array bea grid of Numpy arrays and Dask Dataframe be a sequence of Pandas dataframes,we’re relaxing that constraint to a grid of Numpy-like arrays and a sequenceof Pandas-like dataframes.
This is an ongoing effort that has targetted alternate backends likescipy.sparse, pydata/sparse, cupy, cudf and other systems.
There have been several releases since the last time we had a release blogpost.The following people contributed to the dask/dask repository since the 0.19.0release on September 5th:
The following people contributed to the dask/distributed repository since the 0.19.0release on September 5th: