Showing posts tagged Irydium
One of my main goals with Irydium is to allow it to be a part of as many data science and engineering workflows as possible (including ones I haven’t thought of). Yes, like Iodide and other products, I am (slowly) building a web-based interface for building and sharing dashboards, reports, and similar things. However, I also want to fully support local and command-line based workflows. Beyond the obvious utility of being able to use your favorite text-editor to create documents, this also opens up the possibility of combining Irydium with other tools and workflows. For a slightly longer exposition on why this is desirable, I would highly recommend reading Ryan Harter’s post on the subject: Don’t make me code in your text box.
Using the irydium template
To make getting started easier, I just created an irydium-template: a simple GitHub repository which contains a minimal markdown document (a big mac index visualization) which you can use as a base, as well as a bit of npm scaffolding to get you up and running quickly. To check it out via the console, I recommend using degit (the tool of choice for such things in the Svelte community):
npx degit firstname.lastname@example.org:irydium/irydium-template.git my-notebook
npm run dev
This will create a webserver which renders the document (index.md) at port 3000, along with some debugging options. As you edit and save the document, the site should update automatically.
Publishing your work
When you’re happy with the results, you can create a static version of the site (an
index.html file) by running
npm run build. You can publish this via whatever you like: GitHub pages, Netlify / Vercel or… my new favorite service, surge.sh. Surge provides a really simple hosting service for hosting static sites and works great with Irydium. Installing and running it locally is two commands:
npm install -g surge
Surge will prompt you for an email and a password, then will automatically publish your site at a unique URL. As an example, I published a site for the above template: few-blade.surge.sh
Interested in chatting more about this? Feel free to reach out on the Irydium Gitter chat.
Some quick updates on where Irydium is at, roughly a week-and-a-half before my mini-sabbatical at the Recurse Centre ends.
JupyterBook and MyST
I’d been admiring JupyterBook from afar for some time: their project philosophy appealed to me greatly. In particular, the MyST extensions to markdown seemed like a natural fit for this project and a natural point of collaboration and cross-pollination. A couple of weeks ago, I finally got in touch with some people working on that project, which prompted a few small efforts:
I’ve become convinced that building on top of MyST is right for both Irydium and the larger community. Increasing Irydium’s support for MyST is tracked in irydium/irydium#123.
Using Irydium to build Irydium
I’ve been spending a fair bit of time thinking of how to ma ke it easier for people to build Irydium documents through composition of existing documents. Landed the first pieces of this. The first is the ability to “import” a code chunk from another irydium document. There’s a few examples of this in the new components section of irydium.dev:
In a sense this allows you to define a reusable piece of code along with both documentation and usage examples. I think this concept will be particularly useful for supporting language plugins (which I will write about in an upcoming post).
It’s a real project now
I spent a bit of time last week doing some community gardening. I still consider Irydium an “experiment” but I’d like to at least open up the possibility of it being something larger. To help make that happen, I started working on some basic project governance pieces, namely:
- We have a code of conduct and contributing guidelines. I opted to go for the Contributor Covenant, which seems to be a good minimal viable social contract. I considered something proposing something more comprehensive (like the Rust Code of Conduct), but I felt that’s something for a group of people to discuss and debate, should the time come where Irydium is more than a one-person show. For now, I’ll do my best to make sure that everyone in Irydium’s orbit has a good experience.
- There’s a proper issues list, including some “good first bugs” for people to look at (shout out to @m-clare for submitting the first PR to Irydium!)
- We have a channel on gitter, also accessible via Matrix. Come say hi!
There’s not a ton of time left at RC, so some of these things may have to be done in my spare time after the batch ends. That said, here’s my near-term roadmap:
- Add support for code chunks to output content directly to the DOM (currently the only way to output to an Irydium document is through a Svelte component). This will be particularly important for Python support, where people expect the output of a cell running altair or matplotlib to display directly in the document (as they do in Jupyter). Tracked in irydium/irydium#122.
- Finish and document support for language plugins (and make another blog post especially about them, they’re cool!). Tracked in irydium/irydium#144.
Entering the second week of Recurse. Besides orientation and a few adventures in pair programming (special shout out to Jane Adams for trying out Irydium with me!), I spent most of my time attempting to get document saving & loading working with Irydium.
I learned from Iodide that not having a good document sharing story really inhibits collaboration and sharing, which is something I explicitly want to do here at the Recurse centre (and in general for this project). That said, this isn’t actually an area I want to spend a lot of time on right now: it’s the shape of problem I’ve solved many times before (and that has been solved by many others). I’d rather spend my time over the next few weeks on things I haven’t had much of a chance to look at or pursue in my day-to-day.
So, to try to keep the complexity down, I decided to take the same approach as the svelte repl, which aims only to allow the reproduction of simple examples. It allows you to save anything you type in it and also browse anything that you had previously saved. That’s not going to replace GitHub, but it’s more than enough to get started.
So with that goal in mind, how to do go about it? If I wanted to completely fall back on my previous knowledge, I could have gone for the tried + true approach of Django / Heroku to add a persistence layer (what I did for Iodide). That would have had the benefit of being familiar but would also have increased the overall implementation complexity of Irydium considerably. In the past year, I’ve become convinced that serverless approaches to building web applications are the wave of the future, at least for applications like this one. They’re easier to set up, easier to develop, and (generally speaking) cheaper to deploy. Just before I launched, I set up irydium.dev as a static site on Netlify and it’s been a great experience: deploys are super fast and it’s easy to reason about what’s going on “under the hood” (since there’s not a much of a hood to look under).
The naive model for integrating with Supabase is pretty simple:
- Set up a Supabase application, which provides you with a unique API endpoint to make web requests (this endpoint can be exposed publicly).
- Have your client authenticate with an OAuth provider (e.g. GitHub, GitLab), then store an authentication token in localStorage.
- You can then make requests to the above endpoint with the authentication token, which lets Supabase use row-level security to restrict modifications to the database: in this case, we can restrict users to updating their own documents.
I’d say it probably took me 20–30 hours to get the feature working end-to-end (including documentation), which wasn’t too bad. My impressions were pretty positive: the aforementioned tutorial is pretty decent, the supabase-js library provides a nice ORM-like abstraction over SQL and integrates nicely with Svelte. In general working with Supabase felt pretty familiar to me from previous experiences writing database-backed applications, which I take as a very good sign.
The part that felt the weirdest was writing raw SQL to set up the “documents” table that Irydium uses: SQL is something I’m fairly used to writing because of my experiences at Mozilla, but I imagine this might be off-putting to someone newer to writing these types of things. Also, I have some concerns of how maintainable a Supabase database is over the long term: while it was easy enough to document the currently-simple setup instructions in the README, I do somewhat fear the prospect of managing my database via their SQL console. Something like Django’s schema migrations and management commands would be a welcome addition to Supabase’s SDK.
The above approach isn’t what most people would consider to be “best practice”1. In particular, storing credentials in localStorage is probably not the best idea for an application presenting interactive content like Irydium: it wouldn’t be particularly difficult for a malicious document to steal someone’s secret and send it somewhere it shouldn’t be.
I’m not so worried about it at this stage of the project, but one intriguing possibility here (that’s compatible with our current deploy set up) would be to write some simple Netlify Functions to do the actual interaction with Supabase, while delegating to Netlify for the authentication itself (using Netlify Identity).
I experimented writing a simple function to prove out this approach and it seems to work quite well (source, example). This particular function is making an anonymous query to the database, but I see no obstacle to handling authenticated ones as well. Having an API under a
.netlify namespace seems kinda weird on first blush, but I can probably get used to it.
I want to move on to other things now (parsers! document state visualizations!) but might poke at this more later. In the mean time, if you write/build something cool at irydium.dev/repl, let me know!
So it’s my first day at the Recurse centre, which I blogged briefly about last week. I thought I’d start out by going into a bit more detail about what I’m trying to do with Irydium. This post might be a bit discursive and some of my thoughts are only half-formed: my intent here is towards trying to express some of these ideas at all rather than to come up with the perfect formulation for them, which is going to take time. It is based partly on a presentation I gave at Mozilla last Friday (just before going on my 6-week leave, which starts today).
The premise of Irydium is that despite obvious advances in terms of the ability of computers to crunch numbers and analyze data, our ability to share whatever we learn from these understandings is still far too difficult, especially for people new to the field. Even for domain experts (those with the job title “Data Engineer” or “Data Scientist” or similar) this is still more difficult than one would like.
I’ve made a few observations over the past couple years of trying to explain and document Mozilla’s data platform that I think form a good starting point for trying to close the gap:
- Text is pretty great. Writing, just plain text, is (in my opinion) the single best medium for giving context to data. In terms of raw information density and ability to communicate complex ideas, nothing beats it. If you haven’t read it before, the essay always bet on text (by Graydon Hoare, creator of Rust) is well worth reading.
- Markdown is pretty great too. Essentially an easy-to-write superset of HTML, it’s become the medium of choice for many desktop publishing workflows and has become the basis for many efforts in the “interactive presentation” space that I’m most interested in.
- Reactive Systems make Data Exposition Exposition Easier. A reactive abstraction in front of your computational model reduces development times, makes your work more reproducible and is often easier for less-experienced people to understand. I’d cite the long-standing success of Excel and the recent interest in projects like Observable as evidence for this.
Ok, so what is Irydium?
Irydium is, at heart, a way to translate markdown documents into an interactive, compelling visual presentation.
My view is that publishing markdown text on the web is very close to a solved problem, and that we should build on that success rather than invent something new. This is not necessarily a new point of view (e.g. Rmarkdown and JupyterBook have similar premises) but I think some aspects of Irydium’s approach are mildly novel (or at least within the space of “not generally accepted ideas”).
If you want to get a bit of a flavor for how it works, visit the demonstration site (irydium.dev) and play with some of the examples.
What makes Irydium different from <X>?
While there are a bunch of related projects in this space, there’s a few design principles about Irydium that make it a little different from most of what’s already out there1:
- Reactive: Irydium is reactive in the same way that a spreadsheet is — that is, any individual change you make will immediately flow to the rest of the system. This provides a more intuitive model for the creator of the document and also makes it easier to create truly interactive visualizations.
- Idempotent: in Irydium, a source document will yield the same presentation every time it’s run. There’s no need to reason about what the state of the “kernel” is. This is a highly valuable property when thinking about how to make your analyses reproducible.
- Familiar: Irydium uses as few novel concepts and technologies as possible: it builds on some of the best ideas and technologies produced by the open source community: Python, pyodide, Svelte, mdsvex, MyST and a few others — chosen for having a reasonably shallow learning curve.
- Hackable: While I’m working on an online environment to build and share irydium documents, it’s also fully possible to do so using the tools you know and love like Visual Studio Code.
With the above caveats, there are still a number of projects that overlap with Irydium’s ideas and/or design goals. A few that seem worth mentioning here:
- Iodide: This is the obvious one, at least for those who have been following my work for a while. Iodide was an experiment in making a “web native” version of a scientific notebook: it uses the cell-based computational model that will be familiar to anyone who’s used Jupyter, but all the computation happens on the client. It is probably most famous for launching pyodide, a port of Python to WebAssembly (that Irydium now uses to support Python). I feel like it has a number of design issues (some of which I’ve blogged about previously) and is not currently in active development.
- Observable: Client-side reactive notebooks, commercial backing, broadly used in the D3 community. Shares Irydium’s reactive approach, departs from it in terms of using a custom file format and emphasizing their interactive editing and collaboration environment (which is indeed quite impressive). I’ve used Observable for a few small work things (example) and while there’s a lot I like about it, I am a bit non-plussed by how many wheels it reinvents and the implicit lock-in to a single vendor.2
- Starboard: Similar in some ways to Iodide, but in active development. I’ve started chatting a bit with the core developers on whether there might be areas we could collaborate.
- Ellx: I found out a bit about this relatively recently, via the Svelte discord. Actually very close in some ways to Irydium in terms of choices of technology (e.g. Svelte). Again, in initial chats with the core developers on possible collaborations.
My intent with Irydium, at this point in its development, is to prove out some concepts and see where they lead. While I’d welcome it if Irydium became a successful, widely adopted environment for building interactive data visualizations, I’d also be totally happy with other outcomes, such as:
- Providing a source of ideas and/or code for other people.
- Working on (or with) Irydium being a good learning experience both for myself and others
Approaching my 10-year moz-iversary in July, I’ve decided it’s time to take a bit of a mini-sabbatical: I’ll be out (and trying as hard as possible not to check bugmail) from Friday, June 25th until August 9th. During this time, I’ll be doing a batch at the Recurse Centre (something like a writer’s retreat for programmers), exploring some of my interests around data visualization and analysis that don’t quite fit into my role as a Data Engineer here at Mozilla.
In particular, I’m planning to work a bunch on a project tentatively called “Irydium”, which pursues some of the ideas I sketched out last year in my Iodide retrospective and a few more besides. I’ve been steadily working on it in my off hours, but it’s become clear that some of the things I want to pursue would benefit from more dedicated attention and the broader perspective that I’m hoping the Recurse community will be able to provide.
I had meant to write up a proper blog post to announce the project before I left, but it looks like I’m pretty much out of time. Instead, I’ll just offer up the examples on the newly-minted irydium.dev and invite people to contact me if any of the ideas on the site sounds interesting. I’m hoping to blog a whole bunch while I’m there, but probably not under the Mozilla tag. Feel free to add wrla.ch to your RSS feed if you want to follow what I’m up to!