Showing posts tagged Recurse
Yesterday (July 11, 2021) was the 10 year anniversary of starting at the Mozilla Corporation. My life has changed a ton in those years: in that time I ended a marriage, changed the city in which I live two times, and took up religion1. Mozilla has also changed pretty drastically in my time here, especially in the last year.
Yet somehow I’m still at it, for more or less for the same reasons that led me to accept my initial offer to join the A-team.2 The Internet has the immense potential to be a force for individual empowerment and yet more than ever, we see this technology used to consolidate unchecked power, spread misinformation, and generally exploit people. Mozilla is not perfect (no organization is: 10 years anywhere will teach you that), but it’s one of the few remaining counter-forces to these accelerating trends. While I’m currently taking a bit of a break to explore some stuff on my own, I am looking forward to getting back to work on the mission when I return in mid-August.
Entering the second week of Recurse. Besides orientation and a few adventures in pair programming (special shout out to Jane Adams for trying out Irydium with me!), I spent most of my time attempting to get document saving & loading working with Irydium.
I learned from Iodide that not having a good document sharing story really inhibits collaboration and sharing, which is something I explicitly want to do here at the Recurse centre (and in general for this project). That said, this isn’t actually an area I want to spend a lot of time on right now: it’s the shape of problem I’ve solved many times before (and that has been solved by many others). I’d rather spend my time over the next few weeks on things I haven’t had much of a chance to look at or pursue in my day-to-day.
So, to try to keep the complexity down, I decided to take the same approach as the svelte repl, which aims only to allow the reproduction of simple examples. It allows you to save anything you type in it and also browse anything that you had previously saved. That’s not going to replace GitHub, but it’s more than enough to get started.
So with that goal in mind, how to do go about it? If I wanted to completely fall back on my previous knowledge, I could have gone for the tried + true approach of Django / Heroku to add a persistence layer (what I did for Iodide). That would have had the benefit of being familiar but would also have increased the overall implementation complexity of Irydium considerably. In the past year, I’ve become convinced that serverless approaches to building web applications are the wave of the future, at least for applications like this one. They’re easier to set up, easier to develop, and (generally speaking) cheaper to deploy. Just before I launched, I set up irydium.dev as a static site on Netlify and it’s been a great experience: deploys are super fast and it’s easy to reason about what’s going on “under the hood” (since there’s not a much of a hood to look under).
The naive model for integrating with Supabase is pretty simple:
- Set up a Supabase application, which provides you with a unique API endpoint to make web requests (this endpoint can be exposed publicly).
- Have your client authenticate with an OAuth provider (e.g. GitHub, GitLab), then store an authentication token in localStorage.
- You can then make requests to the above endpoint with the authentication token, which lets Supabase use row-level security to restrict modifications to the database: in this case, we can restrict users to updating their own documents.
I’d say it probably took me 20–30 hours to get the feature working end-to-end (including documentation), which wasn’t too bad. My impressions were pretty positive: the aforementioned tutorial is pretty decent, the supabase-js library provides a nice ORM-like abstraction over SQL and integrates nicely with Svelte. In general working with Supabase felt pretty familiar to me from previous experiences writing database-backed applications, which I take as a very good sign.
The part that felt the weirdest was writing raw SQL to set up the “documents” table that Irydium uses: SQL is something I’m fairly used to writing because of my experiences at Mozilla, but I imagine this might be off-putting to someone newer to writing these types of things. Also, I have some concerns of how maintainable a Supabase database is over the long term: while it was easy enough to document the currently-simple setup instructions in the README, I do somewhat fear the prospect of managing my database via their SQL console. Something like Django’s schema migrations and management commands would be a welcome addition to Supabase’s SDK.
The above approach isn’t what most people would consider to be “best practice”1. In particular, storing credentials in localStorage is probably not the best idea for an application presenting interactive content like Irydium: it wouldn’t be particularly difficult for a malicious document to steal someone’s secret and send it somewhere it shouldn’t be.
I’m not so worried about it at this stage of the project, but one intriguing possibility here (that’s compatible with our current deploy set up) would be to write some simple Netlify Functions to do the actual interaction with Supabase, while delegating to Netlify for the authentication itself (using Netlify Identity).
I experimented writing a simple function to prove out this approach and it seems to work quite well (source, example). This particular function is making an anonymous query to the database, but I see no obstacle to handling authenticated ones as well. Having an API under a
.netlify namespace seems kinda weird on first blush, but I can probably get used to it.
I want to move on to other things now (parsers! document state visualizations!) but might poke at this more later. In the mean time, if you write/build something cool at irydium.dev/repl, let me know!
So it’s my first day at the Recurse centre, which I blogged briefly about last week. I thought I’d start out by going into a bit more detail about what I’m trying to do with Irydium. This post might be a bit discursive and some of my thoughts are only half-formed: my intent here is towards trying to express some of these ideas at all rather than to come up with the perfect formulation for them, which is going to take time. It is based partly on a presentation I gave at Mozilla last Friday (just before going on my 6-week leave, which starts today).
The premise of Irydium is that despite obvious advances in terms of the ability of computers to crunch numbers and analyze data, our ability to share whatever we learn from these understandings is still far too difficult, especially for people new to the field. Even for domain experts (those with the job title “Data Engineer” or “Data Scientist” or similar) this is still more difficult than one would like.
I’ve made a few observations over the past couple years of trying to explain and document Mozilla’s data platform that I think form a good starting point for trying to close the gap:
- Text is pretty great. Writing, just plain text, is (in my opinion) the single best medium for giving context to data. In terms of raw information density and ability to communicate complex ideas, nothing beats it. If you haven’t read it before, the essay always bet on text (by Graydon Hoare, creator of Rust) is well worth reading.
- Markdown is pretty great too. Essentially an easy-to-write superset of HTML, it’s become the medium of choice for many desktop publishing workflows and has become the basis for many efforts in the “interactive presentation” space that I’m most interested in.
- Reactive Systems make Data Exposition Exposition Easier. A reactive abstraction in front of your computational model reduces development times, makes your work more reproducible and is often easier for less-experienced people to understand. I’d cite the long-standing success of Excel and the recent interest in projects like Observable as evidence for this.
Ok, so what is Irydium?
Irydium is, at heart, a way to translate markdown documents into an interactive, compelling visual presentation.
My view is that publishing markdown text on the web is very close to a solved problem, and that we should build on that success rather than invent something new. This is not necessarily a new point of view (e.g. Rmarkdown and JupyterBook have similar premises) but I think some aspects of Irydium’s approach are mildly novel (or at least within the space of “not generally accepted ideas”).
If you want to get a bit of a flavor for how it works, visit the demonstration site (irydium.dev) and play with some of the examples.
What makes Irydium different from <X>?
While there are a bunch of related projects in this space, there’s a few design principles about Irydium that make it a little different from most of what’s already out there1:
- Reactive: Irydium is reactive in the same way that a spreadsheet is — that is, any individual change you make will immediately flow to the rest of the system. This provides a more intuitive model for the creator of the document and also makes it easier to create truly interactive visualizations.
- Idempotent: in Irydium, a source document will yield the same presentation every time it’s run. There’s no need to reason about what the state of the “kernel” is. This is a highly valuable property when thinking about how to make your analyses reproducible.
- Familiar: Irydium uses as few novel concepts and technologies as possible: it builds on some of the best ideas and technologies produced by the open source community: Python, pyodide, Svelte, mdsvex, MyST and a few others — chosen for having a reasonably shallow learning curve.
- Hackable: While I’m working on an online environment to build and share irydium documents, it’s also fully possible to do so using the tools you know and love like Visual Studio Code.
With the above caveats, there are still a number of projects that overlap with Irydium’s ideas and/or design goals. A few that seem worth mentioning here:
- Iodide: This is the obvious one, at least for those who have been following my work for a while. Iodide was an experiment in making a “web native” version of a scientific notebook: it uses the cell-based computational model that will be familiar to anyone who’s used Jupyter, but all the computation happens on the client. It is probably most famous for launching pyodide, a port of Python to WebAssembly (that Irydium now uses to support Python). I feel like it has a number of design issues (some of which I’ve blogged about previously) and is not currently in active development.
- Observable: Client-side reactive notebooks, commercial backing, broadly used in the D3 community. Shares Irydium’s reactive approach, departs from it in terms of using a custom file format and emphasizing their interactive editing and collaboration environment (which is indeed quite impressive). I’ve used Observable for a few small work things (example) and while there’s a lot I like about it, I am a bit non-plussed by how many wheels it reinvents and the implicit lock-in to a single vendor.2
- Starboard: Similar in some ways to Iodide, but in active development. I’ve started chatting a bit with the core developers on whether there might be areas we could collaborate.
- Ellx: I found out a bit about this relatively recently, via the Svelte discord. Actually very close in some ways to Irydium in terms of choices of technology (e.g. Svelte). Again, in initial chats with the core developers on possible collaborations.
My intent with Irydium, at this point in its development, is to prove out some concepts and see where they lead. While I’d welcome it if Irydium became a successful, widely adopted environment for building interactive data visualizations, I’d also be totally happy with other outcomes, such as:
- Providing a source of ideas and/or code for other people.
- Working on (or with) Irydium being a good learning experience both for myself and others
Approaching my 10-year moz-iversary in July, I’ve decided it’s time to take a bit of a mini-sabbatical: I’ll be out (and trying as hard as possible not to check bugmail) from Friday, June 25th until August 9th. During this time, I’ll be doing a batch at the Recurse Centre (something like a writer’s retreat for programmers), exploring some of my interests around data visualization and analysis that don’t quite fit into my role as a Data Engineer here at Mozilla.
In particular, I’m planning to work a bunch on a project tentatively called “Irydium”, which pursues some of the ideas I sketched out last year in my Iodide retrospective and a few more besides. I’ve been steadily working on it in my off hours, but it’s become clear that some of the things I want to pursue would benefit from more dedicated attention and the broader perspective that I’m hoping the Recurse community will be able to provide.
I had meant to write up a proper blog post to announce the project before I left, but it looks like I’m pretty much out of time. Instead, I’ll just offer up the examples on the newly-minted irydium.dev and invite people to contact me if any of the ideas on the site sounds interesting. I’m hoping to blog a whole bunch while I’m there, but probably not under the Mozilla tag. Feel free to add wrla.ch to your RSS feed if you want to follow what I’m up to!