
Using Sphinx in a Monorepo

Sep 25th, 2022

Sphinx Documentation Voltus Mozilla

Just wanted to type up a couple of notes about working with Sphinx (the Python documentation generator) inside a monorepo, an issue I’ve been struggling with (off and on) at Voltus since I started. I haven’t seen much written about this topic despite (I suspect) it being a reasonably frequent problem.

In general, there’s a lot to like about Sphinx: it’s great at handling deeply nested trees of detailed documentation with cross-references inside a version control system. It has local search that works pretty well and some themes (like readthedocs) scale pretty nicely to hundreds of documents. The directives and roles system is pretty flexible and covers most of the common things one might want to express in technical documentation. And if the built-in set of functionality isn’t enough, there’s a wealth of third-party extension modules. My only major complaint is that it uses the somewhat obscure reStructuredText file format by default, but you can get around that by using the excellent MyST extension.

Unfortunately, it has a deeply baked-in assumption that all documentation for your project lives inside a single subfolder. This is fine for a small repository representing a single Python module, like this:

<root>
    README.md
    setup.cfg
    pyproject.toml
    mymodule/
    docs/

However, this doesn’t work for a large monorepo, where you would typically see something like:

<root>/module-1/submodule-a
<root>/module-1/submodule-b
<root>/module-2/submodule-c
...

In a monorepo, you usually want to include a module’s documentation inside its own directory. This allows you to use your code ownership constraints for documentation, among other things.

The naive solution would be to create a Sphinx site for every single one of these submodules. This is what happened at Voltus and I don’t recommend it. For a large monorepo you’ll end up with dozens, maybe hundreds of documentation “sites”. Under this scenario, discoverability becomes a huge problem. You can no longer rely on tables of contents and the built-in search to find content; you just have to “know” where things live. I’m more than nine months in here and I’m still discovering new documentation.

It would be much better if we could somehow collect documentation from other parts of the repository into a single site. Is this possible? tl;dr: Yes. There are a few solutions, each with their pros and cons.

The obvious solution that doesn’t work

The most obvious solution here is to create a symbolic link inside your documentation directory, say the following:

<root>/docs/
<root>/docs/module-1/submodule-a -> <root>/module-1/submodule-a/docs

Unfortunately, this doesn’t work. ☹️ Sphinx doesn’t follow symbolic links.

Solution 1: Just copy the files in

The most straightforward solution that does work is to just copy the files from various parts of the monorepo into place, as part of the build system. Mozilla did this for Firefox, with the moztreedocs system.

The results look pretty good, but this is a bespoke solution. Aside from general ideas, there’s no way I’m going to be able to apply anything in moztreedocs to Voltus’s monorepo (which is based on a completely different build system). And being honest, I’m not sure if the 40+ hour (estimated) effort to reimplement it would be a good use of time compared to other things I could be doing.
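To make the general idea concrete (this is not how moztreedocs works, just a rough sketch of the copy step), something like the following could run before `sphinx-build`. The function name `collect_docs`, the staging directory, and the assumption that documentation lives at `<module>/<submodule>/docs` are all hypothetical choices for illustration:

```python
import shutil
from pathlib import Path


def collect_docs(repo_root: Path, staging_dir: Path) -> list[Path]:
    """Copy every <module>/<submodule>/docs directory into staging_dir.

    Hypothetical helper: the two-level layout and the staging directory
    are assumptions for illustration, not part of Sphinx or moztreedocs.
    Returns the destination directories that were written.
    """
    copied = []
    for docs_dir in sorted(repo_root.glob("*/*/docs")):
        if not docs_dir.is_dir():
            continue
        # Mirror the module/submodule path under the staging directory.
        destination = staging_dir / docs_dir.parent.relative_to(repo_root)
        # dirs_exist_ok lets repeated builds overwrite an earlier copy.
        shutil.copytree(docs_dir, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied
```

A real implementation would also need to deal with stale files, incremental builds and hooking into whatever build system the monorepo uses, which is where most of the effort would go.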

Solution 2: Use the include directive with MyST

Later versions of MyST include support for directly importing a markdown file from another part of the repository.

This is a limited form of embedding: it won’t let you import an entire directory of markdown files. But if your submodules mostly just include content in the form of a README.md (or similar), it might just be enough. Just create a directory for these files to live (say services) and slot them in:

<root>/docs/services/module-1/submodule-a/index.md:

```{include} ../../../../module-1/submodule-a/README.md
```

I’m currently in the process of implementing this solution inside Voltus. I’m optimistic that this will be a big (if incremental) step up over what we have right now. There are obviously limits, but you can cram a lot of useful information in a README. As a bonus, it’s a pretty nice marker for those spelunking through the source code (much more so than a forest of tiny documentation files).
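Writing these stub files by hand gets tedious with many submodules, so they could plausibly be generated. Here is a minimal sketch, assuming every submodule keeps its README at `<module>/<submodule>/README.md`; the helper name `write_include_stubs` and the `services` default are made up for this example. The include path is computed relative to the stub's own directory, since that is how the directive resolves it:

```python
import os
from pathlib import Path

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example


def write_include_stubs(repo_root: Path, docs_dir: Path, section: str = "services") -> list[Path]:
    """Write one index.md stub per <module>/<submodule>/README.md.

    Hypothetical helper: the directory layout and names are assumptions.
    Returns the stub files that were written.
    """
    written = []
    for readme in sorted(repo_root.glob("*/*/README.md")):
        if docs_dir in readme.parents:
            continue  # skip READMEs that already live in the docs tree
        stub_dir = docs_dir / section / readme.parent.relative_to(repo_root)
        stub_dir.mkdir(parents=True, exist_ok=True)
        # The include path is relative to the directory holding the stub.
        include_path = Path(os.path.relpath(readme, start=stub_dir)).as_posix()
        stub = stub_dir / "index.md"
        stub.write_text(f"{FENCE}{{include}} {include_path}\n{FENCE}\n")
        written.append(stub)
    return written
```

Whether to commit the generated stubs or regenerate them on every build is a judgment call; committing them keeps the documentation tree readable without running anything.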

Solution 3: Sphinx Collections

This one I just found out about today: Sphinx Collections is a small Python module that lets you automatically import entire directories of files into your Sphinx tree, under a _collections directory. You configure it in your top-level conf.py like this:

extensions = [
    ...
    "sphinxcontrib.collections"
]

collections = {
    "submodule-a": {
        "driver": "symlink",
        "source": "/monorepo/module-1/submodule-a/docs",
        "target": "submodule-a"
    },
    ...
}

After setting this up, submodule-a is now available under _collections and you can include it in your table of contents like this:

...

```{toctree}
:caption: submodule-a

_collections/submodule-a/index.md
```

...

At this point, submodule-a’s documentation should be available under http://<my doc domain>/_collections/submodule-a/index.html

Pretty nifty. The main downside I’ve found so far is that this doesn’t play nicely with the Edit on GitHub links that the readthedocs theme automatically inserts (it thinks the files exist under _collections), but there’s probably a way to work around that.

I plan on investigating this approach further in the coming months.
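Since conf.py is ordinary Python, the `collections` dictionary could in principle be generated instead of written out by hand. This is an untested-in-production sketch that only reuses the `driver`, `source` and `target` keys shown above; the function name `build_collections` and the `<module>/<submodule>/docs` layout are assumptions:

```python
from pathlib import Path


def build_collections(repo_root: Path) -> dict[str, dict[str, str]]:
    """Return one symlink-driver entry per <module>/<submodule>/docs directory.

    Hypothetical helper: the layout is an assumption for illustration.
    """
    collections: dict[str, dict[str, str]] = {}
    for docs_dir in sorted(repo_root.glob("*/*/docs")):
        if not docs_dir.is_dir():
            continue
        name = docs_dir.parent.name  # e.g. "submodule-a"
        if name in collections:
            # Two submodules with the same name would collide under _collections.
            raise ValueError(f"duplicate submodule name: {name}")
        collections[name] = {
            "driver": "symlink",
            "source": str(docs_dir),
            "target": name,
        }
    return collections
```

Assuming conf.py lives in `<root>/docs`, it could then set `collections = build_collections(Path(__file__).resolve().parent.parent)`. Each submodule would still need its own entry in a table of contents to show up in the navigation.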


90 days out and in

Apr 16th, 2022

Mozilla Voltus Recurse

The 90-day mark just passed at my new gig at Voltus, which feels like a good time for a bit of self-reflection.

In general, I think it’s been a good change and that it was the right time to leave Mozilla. Since I left, a few people have asked me why I chose to do so: while the full answer is pretty complicated (these things are never simple!), I think it does ultimately come down to wanting to try something new after 10+ years. I’ve accumulated a fair amount of expertise in web development and data engineering and I wanted to see if I could apply it to a new area that I cared about: in this case, climate change and the energy transition.

Voltus is a much younger and different company than Mozilla was, and there’s no shortage of things to learn and do. Energy markets are a rather interesting technical domain to work in: a big intersection of politics, technology, and business. Lots of very old and very new things all at once. As a still-relatively young company, there is definitely more of a feeling that it’s possible to shape Voltus’s culture and practices, which has been interesting. There’s a bit of a balancing act between sharing what you’ve learned in previous roles while having the humility to recognize that there’s much you still don’t understand in a new workplace.

On the downside, I have to admit that I do miss being able to work in the open. Voltus is currently in the process of going public, which has made me extra shy about saying much of anything about what I’ve been working on in a public forum.

To some extent I’ve been scratching this itch by continuing to work on Irydium when I have the chance. I’ve done up a few new releases in the last couple of months, which I think have been fairly well received inside my very small community of people doing like-minded things. I’m planning on attending (at least part of) a pyodide sprint in early May, which I think should be a lot of fun as well as an opportunity to push browser-based data science forward.

I’ve also kept more of a connection with Mozilla than I thought I would have: some video meetings with former colleagues, answering questions on Element (chat.mozilla.org), even some pull requests where I felt like I could make a quick contribution. I’m still using Firefox, which has actually given me more perspective on some problems that people at Mozilla might not experience (e.g. this screensharing bug which you’d only see if you’re using a WebRTC-based video conferencing solution like Google Meet).

That said, I’m not sure to what extent this will continue: even if the source code to Firefox and the tooling that supports it is technically “open source”, outsiders like myself really have very limited visibility into what Mozilla is doing these days. This makes it difficult to really connect with much of what’s going on or know how I might be able to contribute. While it might be theoretically possible to join Mozilla’s Slack (at least last I checked), that feels like a rabbit hole I’d prefer not to go down. While I’m still interested in supporting Mozilla’s mission, I really don’t want more than one workplace chat tool in my life: there’s a lot of content there that is no longer relevant to me as a non-employee and (being honest) I’d rather leave behind. There’s lots more I could say about this, but probably best to leave it there: I understand that there are reasons why things are the way they are, even if they make me a little sad.