This work is supported by Anaconda Inc
We recently changed how we organize and connect Dask’s documentation. Our approach may prove useful for other umbrella projects that spread documentation across many different builds and sites.
Dask’s documentation is split into several different websites, each managed by a different team for a different sub-project:
This split in documentation matches the split in development teams. Each sub-project’s team manages its own docs in its own way. They release at their own pace and make their own decisions about technology. This makes it much more likely that developers maintain the documentation as they develop and change software libraries.
We make it easy to write documentation. This choice causes many different documentation systems to emerge.
This approach is common. A web search for Jupyter Documentation yields the following list:
Different teams developing semi-independently create different web pages. This is inevitable. Asking a large distributed team to coordinate on a single cohesive website adds substantial friction, which results in worse documentation coverage.
However, while using separate websites results in excellent coverage, it also fragments the documentation. This makes it harder for users to smoothly navigate between sites and discover appropriate content.
Monolithic documentation is good for readers, modular documentation is good for writers.
Over the last month we took steps to connect our documentation and make it more cohesive, while still enabling independent development. This post outlines the following steps:
We did some other things along the way that we find useful, but are probably more specific to just Dask.
Previously we had some documentation under readthedocs, some under the dask.pydata.org subdomain (thanks NumFOCUS!) and some pages on personal websites, like matthewrocklin.com/blog.
While looking for a new dask domain to host all of our content we noticed that dask.org redirected to anaconda.org, and were pleased to learn that someone at Anaconda Inc had the foresight to register the domain early on.
Anaconda was happy to transfer ownership of the domain to NumFOCUS, who helps us to maintain it now. Now all of our documentation is available under that single domain as subdomains:
This uniformity means that the thing you want is probably at that-thing.dask.org, which is a bit easier to guess than otherwise.
Many thanks to Andy Terrel and Tom Augspurger for managing this move, and to Anaconda for generously donating the domain.
We wanted a way for readers to quickly discover the other sites that were available to them. All of our sites have side-navigation-bars to help readers navigate within a particular site, but now they also have a top-navigation-bar to help them navigate between projects.
This navigation bar is managed independently from all of the documentation projects, in our new Sphinx theme.
To give a uniform sense of style we developed our own Sphinx HTML theme. This inherits from ReadTheDocs’ theme, but with changed styling to match Dask color and visual style. We publish this theme as a package on PyPI that all of our projects’ Sphinx builds can import and use if they want. We can change style in this one package and publish to PyPI and all of the projects will pick up those changes on their next build without having to copy stylesheets around to different repositories.
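As a rough sketch of what opting in to the shared theme might look like, a sub-project's Sphinx `conf.py` could contain something like the excerpt below. It assumes the theme is installed from PyPI under the name `dask_sphinx_theme` and registers itself with Sphinx under that same name; the project name is hypothetical.

```python
# docs/conf.py (excerpt)

# Assumption: the shared theme package is installed from PyPI as
# "dask_sphinx_theme" and registers itself with Sphinx under that name.
html_theme = "dask_sphinx_theme"

# Content settings stay in each repository; only the styling is shared.
project = "my-dask-subproject"  # hypothetical project name
```

Under these assumptions a style change needs no edits in this file: the next documentation build simply installs the newest release of the theme.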
This allows several different projects to evolve content (which they care about) and build process separately from style (which they typically don’t care as much about). We have a single style sheet that gets used everywhere easily.
Previously most announcements about Dask were written and published from one of the maintainers’ personal blogs. This split information about the project and made it hard for people to discover good content. There also wasn’t a good way for a community member to suggest a blog for distribution to the general community, other than by starting their own.
Now we have an official blog at blog.dask.org which serves files submitted to github.com/dask/dask-blog. These posts are simple markdown files that should be easy for people to generate. For example the source for this post is available at github.com/dask/dask-blog/blob/gh-pages/_posts/2018-09-27-docs-refactor.md
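The path above follows the `_posts/YYYY-MM-DD-title.md` naming pattern. As an illustration only, a small helper for starting a new post might look like the sketch below. The function names and the front-matter fields (`layout`, `title`) are assumptions made for this example, not something the dask-blog repository is known to require.

```python
import re
from datetime import date


def post_path(published: date, title: str) -> str:
    """Build a path like _posts/2018-09-27-docs-refactor.md (hypothetical helper)."""
    # Lowercase the title and collapse every run of non-alphanumerics into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"_posts/{published.isoformat()}-{slug}.md"


def post_skeleton(title: str) -> str:
    """Return a minimal Markdown file body; the front-matter fields are assumed."""
    return (
        "---\n"
        "layout: post\n"
        f"title: {title}\n"
        "---\n\n"
        "Write the post body here in Markdown.\n"
    )


print(post_path(date(2018, 9, 27), "Docs refactor"))
```

A contributor would save the skeleton at the generated path in a fork of the repository and open a pull request.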
We encourage community members to share posts about work they’ve done with Dask by submitting pull requests to that repository.
The Dask community maintains a set of example notebooks that show people how to use Dask in a variety of ways. These notebooks live at github.com/dask/dask-examples and are easy for users to download and run.
To get more value from these notebooks we now expose them in two additional ways:
Now that these examples get much more exposure we hope that this encourages community members to submit new examples. We hope that by providing infrastructure more content creators will come as well.
We also encourage other projects to take a look at what we’ve done in github.com/dask/dask-examples. We think that this model might be broadly useful across other projects.
Thank you for reading. We hope that this post pushes readers to re-explore Dask’s documentation, and that it pushes developers to consider some of the approaches above for their own projects.