Sep 27, 2018

Refactor Documentation

This work is supported by Anaconda Inc

Summary

We recently changed how we organize and connect Dask’s documentation. Our approach may prove useful for other umbrella projects that spread documentation across many different builds and sites.

Dask splits documentation into many pages

Dask’s documentation is split into several different websites, each managed by a different team for a different sub-project:

  1. dask.pydata.org : Main site
  2. distributed.readthedocs.org : Distributed scheduler
  3. dask-ml.readthedocs.io : Dask for machine learning
  4. dask-kubernetes.readthedocs.io : Dask on Kubernetes
  5. dask-jobqueue.readthedocs.io : Dask on HPC systems
  6. dask-yarn.readthedocs.io : Dask on Hadoop systems
  7. dask-examples.readthedocs.io : Examples that use Dask
  8. matthewrocklin.com/blog, jcrist.github.io, tomaugspurger.github.io, martindurant.github.io/blog : Developers’ personal blogs

This split in documentation matches the split in development teams. Each sub-project’s team manages its own docs in its own way. They release at their own pace and make their own decisions about technology. This makes it much more likely that developers maintain the documentation as they develop and change software libraries.

We make it easy to write documentation. This choice causes many different documentation systems to emerge.

This approach is common. A web search for “Jupyter Documentation” yields the following list:

Different teams developing semi-independently create different web pages. This is inevitable. Asking a large distributed team to coordinate on a single cohesive website adds substantial friction, which results in worse documentation coverage.

Problem

However, while using separate websites results in excellent coverage, it also fragments the documentation. This makes it harder for users to smoothly navigate between sites and discover appropriate content.

Monolithic documentation is good for readers, modular documentation is good for writers.

Our Solutions

Over the last month we took steps to connect our documentation and make it more cohesive, while still enabling independent development. This post outlines the following steps:

  1. Organize under a single domain, dask.org
  2. Develop a sphinx template project for uniform style
  3. Include a cross-project navbar in addition to the within-project table-of-contents

We did some other things along the way that we find useful, but that are probably more specific to Dask.

  1. We moved this blog to blog.dask.org
  2. We improved our example notebooks so that they are hosted both as a static site and as live Binder sessions

1: Organize under a single domain, dask.org

Previously we had some documentation under readthedocs, some under the dask.pydata.org subdomain (thanks NumFOCUS!) and some pages on personal websites, like matthewrocklin.com/blog.

While looking for a new Dask domain to host all of our content we noticed that dask.org redirected to anaconda.org, and were pleased to learn that someone at Anaconda Inc had the foresight to register the domain early on.

Anaconda was happy to transfer ownership of the domain to NumFOCUS, who now helps us to maintain it. All of our documentation is now available under that single domain as subdomains:

This uniformity means that the thing you want is probably at that-thing.dask.org, which is a bit easier to guess than otherwise.
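The naming convention can be sketched in a few lines of Python. The helper name `project_url` is hypothetical and not part of Dask; the only subdomains used below are ones mentioned in this post.

```python
def project_url(name: str, domain: str = "dask.org") -> str:
    """Guess the documentation URL for a sub-project from its short name."""
    # Hypothetical helper, not part of Dask: it only encodes the
    # "that-thing.dask.org" naming convention described above.
    return f"https://{name}.{domain}"


print(project_url("examples"))  # https://examples.dask.org
print(project_url("blog"))      # https://blog.dask.org
```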

Many thanks to Andy Terrel and Tom Augspurger for managing this move, and to Anaconda for generously donating the domain.

2: Cross-project Navigation Bar

We wanted a way for readers to quickly discover the other sites that were available to them. All of our sites have side-navigation-bars to help readers navigate within a particular site, but now they also have a top-navigation-bar to help them navigate between projects.

[Image: adding a navbar to the Dask docs]

This navigation bar is managed independently from all of the documentation projects, in our new Sphinx theme.

3: Dask Sphinx Theme

To give a uniform sense of style we developed our own Sphinx HTML theme. This inherits from ReadTheDocs’ theme, but with changed styling to match Dask’s colors and visual style. We publish this theme as a package on PyPI that all of our projects’ Sphinx builds can import and use if they want. We can change style in this one package and publish to PyPI, and all of the projects will pick up those changes on their next build without having to copy stylesheets around to different repositories.

This allows several different projects to evolve content (which they care about) and build process separately from style (which they typically don’t care as much about). We have a single style sheet that gets used everywhere easily.
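On the project side, opting in to a shared theme is typically a one-line change in the Sphinx configuration. The fragment below is a minimal sketch, not a copy of any Dask project's actual configuration. It assumes the theme package is published as `dask-sphinx-theme`, is installed in the build environment, and registers itself with Sphinx under the name `dask_sphinx_theme`.

```python
# conf.py (fragment) for a sub-project's Sphinx build.
# Assumption: the shared theme is installed in the build environment,
# e.g. via `pip install dask-sphinx-theme`, and registers itself with
# Sphinx under the theme name used below.
html_theme = "dask_sphinx_theme"
```

Because the style lives in the installed package rather than in the repository, a project picks up style changes whenever its next build installs a newer release of the theme.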

4: Move Dask Blogging to blog.dask.org

Previously most announcements about Dask were written and published from one of the maintainers’ personal blogs. This split information about the project and made it hard for people to discover good content. There also wasn’t a good way for a community member to suggest a blog for distribution to the general community, other than by starting their own.

Now we have an official blog at blog.dask.org which serves files submitted to github.com/dask/dask-blog. These posts are simple markdown files that should be easy for people to generate. For example the source for this post is available at github.com/dask/dask-blog/blob/gh-pages/_posts/2018-09-27-docs-refactor.md

We encourage community members to share posts about work they’ve done with Dask by submitting pull requests to that repository.
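The example path above suggests a `_posts/YYYY-MM-DD-slug.md` naming pattern for new posts. As a rough sketch, a contributor could build the file name like this. The helper `post_path` is hypothetical, and the pattern is inferred from that single example rather than from documented repository rules.

```python
from datetime import date


def post_path(published: date, slug: str) -> str:
    """Build the repository path for a new blog post."""
    # Hypothetical helper: mirrors the `_posts/YYYY-MM-DD-slug.md`
    # pattern seen in the example path for this post.
    return f"_posts/{published.isoformat()}-{slug}.md"


print(post_path(date(2018, 9, 27), "docs-refactor"))
# _posts/2018-09-27-docs-refactor.md
```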

5: Host Examples as both static HTML and live Binder sessions

The Dask community maintains a set of example notebooks that show people how to use Dask in a variety of ways. These notebooks live at github.com/dask/dask-examples and are easy for users to download and run.

To get more value from these notebooks we now expose them in two additional ways:

  1. As static HTML at examples.dask.org, rendered with the nbsphinx plugin.
     Seeing them statically rendered and being able to quickly navigate between them really increases the pleasure of exploring them. We hope that this encourages users to explore more broadly.
  2. As live-runnable notebooks on the cloud using mybinder.org. You can play with any of these notebooks by clicking on the Binder button.
     This allows people to explore more deeply. Also, because we’ve connected up the Dask JupyterLab extension to this environment, users get an immediate instinctual experience of what parallel computing feels like (if you haven’t used the Dask dashboard during computation you really should give that link a try).
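A Binder button is just a link to a mybinder.org URL that names the repository and, optionally, a notebook to open. The sketch below builds such a link for a GitHub repository. The helper `binder_url`, the default ref `HEAD`, and the notebook name `array.ipynb` are assumptions for illustration, not taken from the dask-examples configuration.

```python
from typing import Optional
from urllib.parse import quote


def binder_url(org: str, repo: str, ref: str = "HEAD",
               notebook: Optional[str] = None) -> str:
    """Build a mybinder.org launch URL for a GitHub repository."""
    # Hypothetical helper; `ref` may be a branch, tag, or commit.
    base = f"https://mybinder.org/v2/gh/{org}/{repo}/{ref}"
    if notebook is None:
        return base
    # Ask Binder to open the notebook in JupyterLab; the path is
    # percent-encoded because it travels inside a query parameter.
    return f"{base}?urlpath={quote('lab/tree/' + notebook, safe='')}"


print(binder_url("dask", "dask-examples"))
print(binder_url("dask", "dask-examples", notebook="array.ipynb"))
```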

Now that these examples get much more exposure we hope that this encourages community members to submit new examples. We hope that by providing this infrastructure we will attract more content creators as well.

We also encourage other projects to take a look at what we’ve done in github.com/dask/dask-examples. We think that this model might be broadly useful across other projects.

Conclusion

Thank you for reading. We hope that this post pushes readers to re-explore Dask’s documentation, and that it pushes developers to consider some of the approaches above for their own projects.