Category Archives: Technical Entries

PyCon 2014 impressions: ipython notebook is the future & more

This year’s PyCon US (Python Conference) was in my city of residence (Montréal) so I took the opportunity to go and see what was up in the world of the language I use the most at Mozilla. It was pretty great!

ipython

The highlight for me was learning about the possibilities of the ipython notebook, an absolutely fantastic interactive tool for debugging python in a live browser-based environment. I’d heard about it before, but it wasn’t immediately apparent how it would really improve things — it seemed to be just a less convenient interface to the python console that required me to futz around with my web browser. Watching a few presentations on the topic made me realize how wrong I was. It’s already changed the way I work with Eideticker data, for the better.

Using ipython to analyze some eideticker data

I think the basic premise is really quite simple: a better interface for typing in, experimenting with, and running python code. If you stop and think about it, the modern web interface supports a much richer vocabulary of interactive concepts than the console (or even a text editor like emacs): there’s no reason we shouldn’t take advantage of it.

Here are the (IMO) killer features that make it worth using:

  • The ability to immediately re-execute a block of code after editing it when you see an error (essentially merging the immediacy of the python console with the permanency / cut & pastability of an actual script)
  • Live-printing graphs of numerical results using matplotlib. ZOMG this is so handy. Especially in conjunction with the live editing outlined above, there’s no better tool for fine-tuning mathematical/statistical analysis (a small example follows this list).
  • The shareability of the results. Any ipython notebook can be saved and then published to a public website. Many presentations at PyCon 2014, in fact, were done entirely with ipython notebooks. So handy for answering questions like “how did you get that?”
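To give a flavour of the matplotlib point, here is roughly what a notebook cell might look like (a minimal sketch with made-up data; it assumes numpy and matplotlib are installed and has nothing Eideticker-specific in it). Tweak the smoothing window, re-run the cell, and the plot redraws inline right below it:

    # Enable inline plotting in the notebook, then plot some made-up
    # per-frame measurements next to a smoothed version of them.
    %matplotlib inline
    import numpy as np
    import matplotlib.pyplot as plt

    framediffs = np.abs(np.random.randn(300)) * 100  # hypothetical data

    window = 10  # re-run the cell with different values to fine-tune
    smoothed = np.convolve(framediffs, np.ones(window) / window, mode='same')

    plt.plot(framediffs, alpha=0.3, label='raw')
    plt.plot(smoothed, label='smoothed (window=%d)' % window)
    plt.legend()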

To learn more about how to use ipython notebooks for data analysis, I highly recommend Julia Evans’s talk Diving into Open Data with IPython Notebook & Pandas, which you can find on pyvideo.org.

Other Good Talks

I saw some other good talks at the conference; here are a few of them:

  • All Your Ducks In A Row: Data Structures in the Standard Library and Beyond – A useful talk by Brandon Rhodes on the implementation of basic data structures in Python, and how to select the right ones for optimal performance. It turns out that lists aren’t the best thing to use for long sequences of numerical data (who knew?).
  • Fast Python, Slow Python – An interesting talk by Alex Gaynor about how to write decently performing pure-python code in a single-threaded context. Lots of intelligent stuff about producing robust code that matches your intentions and your data structures, and caution against doing fancy things in the name of being “pythonic” or “general”.
  • Analyzing Rap Lyrics with Python – Another data analysis talk, this one about a subject I knew almost nothing about. The best part of it (for me anyway) was learning how the speaker (Julie Lavoie) narrowed the focus of her research to exactly the aspects of the problem that would let her answer the question she was interested in (“Can we automatically find out which rap lyrics are the most sexist?”) as opposed to interesting-but-irrelevant problems (“How can I design the most general scraping library possible?”) that don’t answer the question. In my opinion, this ability to focus is one of the key things that separates successful projects from unsuccessful ones.

First Eideticker Responsiveness Tests

[ For more information on the Eideticker software I'm referring to, see this entry ]

Time for another update on Eideticker. In the last quarter, I’ve been working on two main items:

  1. Responsiveness tests (Android / FirefoxOS)
  2. Eideticker for FirefoxOS

The focus of this post is the responsiveness work. I’ll talk about Eideticker for FirefoxOS soon. :)

So what do I mean by responsiveness? At a high level, I mean how quickly one sees a response after performing an action on the device. For example, if I perform a swipe gesture to scroll the content down while browsing CNN.com, how long does it take after I start the gesture for the content to visibly scroll down? If you break it down, there’s a multi-step process that happens behind the scenes after a user action like this:

[Diagram: input-events, the chain of steps between a user action and the rendered response]

If there is a significant delay anywhere in the steps above, the user experience is likely to be bad. Usability research suggests that any lag consistently above 100 milliseconds will lead the user to perceive things as laggy. To keep our users happy, we need to do our bit to make sure that we respond quickly at all the levels we control (just the application layer on Android, but pretty much everything on FirefoxOS). Even if we can’t complete the work required on our end to fully respond to the user’s action, we should at least display something to acknowledge that things have changed.

But you can’t improve what you can’t measure. Fortunately, we have the means to calculate the time delta between most of the steps above. I learned from Taras Glek this weekend that it should be possible to simulate the actual capacitive touch event on a modern touch screen. We can recognize when the hardware event is available to be consumed by userspace by monitoring the `/dev/input` subsystem. And once the event reaches the application (the Android or FirefoxOS application), there’s no reason we can’t add instrumentation in all sorts of places to track the processing of both the event and the rendering of the response.
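As a rough illustration of the /dev/input part (a sketch of my own, not Orangutan or Eideticker code; the device path is hypothetical), reading timestamped events out of the kernel’s input subsystem from Python looks something like this:

    import struct

    # struct input_event on 64-bit Linux: a struct timeval (two longs)
    # followed by type and code (unsigned shorts) and value (signed int).
    EVENT_FORMAT = 'llHHi'
    EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

    with open('/dev/input/event0', 'rb') as device:  # hypothetical touch device
        while True:
            data = device.read(EVENT_SIZE)
            if not data:
                break
            sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, data)
            # sec + usec/1e6 is when the event became visible to userspace
            print('%.6f type=%d code=%d value=%d' % (sec + usec / 1e6,
                                                     ev_type, code, value))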

My working hypothesis is that it’s application-level latency (i.e. the time between the application receiving the event and being able to act on it) that dominates, so that’s what I decided to measure. This is purely based on intuition and by no means proven, so we should test this (it would certainly be an interesting exercise!). However, even if it turns out that there are significant problems here, we still care about the other bits of the stack — there’s lots of potentially-latency-introducing churn there and the risk of regression in our own code is probably higher than it is elsewhere since it changes so much.

Last year, I wrote a tool called Orangutan that can directly inject input events into an input device on Android or FirefoxOS. Extending it to output timestamps when these events are registered seemed fairly straightforward, and it was. Then, by synchronizing the time between the device and the machine doing the capturing, we can correlate the input timestamps with what shows up in the capture. To help visualize what’s going on, I generated this view:

[Screenshot: frame-difference view of a task.js capture]

[Link to original]

The X axis in that graph represents time. The Y axis represents the difference between the frame at that time and the previous one, in number of pixels. The red areas represent periods in the capture when input events are ongoing (we use different colours only to distinguish distinct events). [1]

For a first pass at measuring responsiveness, I decided to measure the time between the first event being initiated and there being a significant frame difference (i.e. an observable response to the action). You can see some preliminary results on the eideticker dashboard:

[Screenshot: task.js responsiveness results on the Eideticker dashboard]

[Link to original]
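To make the measurement concrete, here is a sketch of what “time to first observable response” boils down to once the event timestamps have been converted to capture time (my own illustration, not the dashboard code; the threshold value is made up):

    def time_to_response(frame_diffs, frame_times, event_start, threshold=250):
        """Seconds between the start of an input event and the first frame
        that differs 'significantly' from its predecessor.

        frame_diffs -- per-frame pixel difference vs. the previous frame
        frame_times -- capture timestamp of each frame, on the same clock
                       as event_start (i.e. after clock synchronization)
        threshold   -- minimum difference counted as an observable response
        """
        for diff, timestamp in zip(frame_diffs, frame_times):
            if timestamp >= event_start and diff > threshold:
                return timestamp - event_start
        return None  # no observable response in this capture

    # e.g. time_to_response(diffs, times, swipe_start_timestamp)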

The results seemed pretty variable at first because I was synchronizing time between the device and an external ntp server, rather than the host machine. I believe this is now fixed, hopefully giving us results that will indicate when regressions occur. As time goes by, we may want to craft some eideticker tests specifically for responsiveness (e.g. a site with heavy javascript background processing).

[1] Incidentally, these “frame difference” graphs are also quite useful for understanding where and how application startup has regressed in Fennec — try opening these two startup views side-by-side (before/after a large regression) and spot the difference: [1] and [2]

Eideticker for FirefoxOS

[ For more information on the Eideticker software I'm referring to, see this entry ]

Here’s a long overdue update on where we’re at with Eideticker for FirefoxOS. While we’ve had a good amount of success getting useful, actionable data out of Eideticker for Android, so far we haven’t been able to replicate that success for FirefoxOS. This is not for lack of trying: first Malini Das and then I have been working at it since summer 2012.

When it comes right down to it, instrumenting Eideticker for B2G is just a whole lot more complex. On Android, we could take the operating system (including support for all the things we needed, like HDMI capture) as a given. The only tricky part was instrumenting the capture so the right things happened at the right moment. With FirefoxOS, we need to run these tests on full builds of an entire operating system that is constantly changing. Not nearly as simple. That said, I’m starting to see light at the end of the tunnel.

Platforms

We initially selected the pandaboard as the main device to use for eideticker testing, for two reasons. First, it’s the same hardware platform we’re targeting for other b2g testing in tbpl (mochitest, reftest, etc.), and is the platform we’re using for running Gaia UI tests. Second, unlike every other device that we’re prototyping FirefoxOS on (to my knowledge), it has HDMI-out capability, so we can directly interface it with the Eideticker video capture setup.

However, the panda also has some serious shortcomings. First, it’s obviously not a platform we’re shipping, so the performance we’re seeing from it is subject to different factors that we might not see with a phone actually shipped to users. For the same reason, we’ve had many problems getting B2G running reliably on it, as it’s not something most developers have been hacking on a day to day basis. Thanks to the heroic efforts of Thomas Zimmerman, we’ve mostly got things working ok now, but it was a fairly long road to get here (several months last fall).

More recently, we became aware of something called an Elmo, which might let us combine the best of both worlds. An Elmo is really just a small mounted video camera with a bunch of outputs; as I understand it, it is most commonly used to project documents in a classroom/presentation setting. However, it seems to do a great job of capturing mobile phones in action as well. :)

The nice thing about using an external camera for the video capture part of eideticker is that we are no longer limited to devices with HDMI out — we can run the standard set of automated tests on ANYTHING. We’ve already used this to some success in getting some videos of FirefoxOS startup times versus Android on the Unagi (a development phone that we’re using internally) for manual analysis. Automating this process may be trickier because of the fact that the video capture is no longer “perfect”, but we may be able to work around that (more discussion about this later).

FirefoxOS web page tests

These are the same tests we run on Android. They should give us a rough idea of where our performance is when browsing / panning web sites like CNN. So far, I’ve only run these tests on the pandaboard, and they are INCREDIBLY slow (1-3 frames per second when scrolling). So much so that I have to think something is broken in our hardware acceleration on this platform.

FirefoxOS application tests

These are some new tests written in a framework that allows you to script arbitrary interactions in FirefoxOS, like launching applications or opening the task switcher.

I’m pretty happy with this framework; it seems to work well. The only problems I’m seeing are with the platform we’re running these tests on. On the pandaboard, applications look weird (since the screen resolution doesn’t remotely resemble the 320×480 resolution of our current devices) and performance is abysmal. Take, for example, this capture of application switching performance, which runs at only roughly 3-4 fps:

So what now?

I’m not 100% sure yet (partly it will depend on what others say, as well as my own investigation), but I have a feeling that capturing video of real devices running FirefoxOS using the Elmo is the way forward. First, the hardware and driver situation will be much more representative of what we’ll actually be shipping to users. Second, we can flash new builds of FirefoxOS onto them automatically, unlike the pandaboards, where you currently either need to manually flash and reset (a time-consuming and error-prone process) or set up an instance of mozpool (which I understand is quite complicated).

The main use case I see with eideticker-on-panda would be where we wanted to run a suite of tests on checkin (in tbpl-like fashion) and we’d need to scale to many devices. While cool, this sounds like an expensive project (both in terms of time and hardware) and I think we’d do better with getting something slightly smaller-scale running first.

So, the real question is whether or not the capture produced by the Elmo is amenable to the same analysis that we do on the raw HDMI output. At the very least, some of eideticker’s image analysis code will have to be adapted to handle a much “noisier” capture. Instead of capturing the raw HDMI signal, we now have to deal with the real world and its irritating fluctuations in ambient light levels and all the rest. I have no doubt it is *possible* to compensate for this (after all, this is what the human eye/brain does all the time), but the question is how much work it will be. I can’t speak for anyone else at Mozilla, but I’m not sure I really have the time to start a Ph.D-level research project in computational vision. ;) I’m optimistic that won’t be necessary, but we’ll just have to wait and see.
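To give a sense of what that compensation might involve, here is a toy sketch (my own, nothing Eideticker-specific; frames are assumed to be numpy RGB arrays) that normalizes each frame by its mean intensity before differencing, so that a uniform change in ambient light contributes roughly nothing to the difference:

    import numpy as np

    def normalized_diff(frame1, frame2):
        """Pixel difference between two frames after dividing each by its
        mean intensity, so uniform lighting changes mostly cancel out."""
        f1 = frame1.astype(np.float64) / frame1.mean()
        f2 = frame2.astype(np.float64) / frame2.mean()
        return np.abs(f1 - f2).sum()

Of course, real-world lighting noise is rarely that uniform, which is exactly why this could turn into a much bigger project.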

Mass code relicensing with facebook’s codemod

The Firefox source repository (mozilla-central) was recently converted to a new license with a lovely short boilerplate. This is great, but here in automation and tools we have a fairly large number of projects that live outside of the main tree but to which the new license still applies. A few weeks ago, I wanted to move one of our projects over (mozbase), but didn’t want to spend hours manually editing text files. I understand that a script was used to convert mozilla-central, but a quick google didn’t turn it up. [ edit: thanks to Ed Morley for pointing out to me that it lives here: http://hg.mozilla.org/users/gerv_mozilla.org/relic/ ]

I surely could have asked around for the script, but this problem gave me an excuse to try something I’d been meaning to for a while: Facebook’s codemod. Codemod is a neat little command-line utility that aims to help with mass refactorings of code. All you have to do is provide a few regular expressions to replace, and off it goes. I tried it out with mozbase, and the results were great: 5 minutes spent coming up with a regular expression and jiggering with command-line options, and the job was done.

I had the desire to do this again today for Eideticker, and decided to document the (extremely simple) process for posterity. I just used this simple command line…

../codemod/src/codemod.py --extensions py -m '# \*\*\*\*\* BEGIN LICENSE.*END LICENSE BLOCK \*\*\*\*\*' '# This Source Code Form is subject to the terms of the Mozilla Public\n# License, v. 2.0. If a copy of the MPL was not distributed with this file,\n# You can obtain one at http://mozilla.org/MPL/2.0/.'

… et voila, out popped shiny new boilerplate.
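For reference, the same transformation could be scripted directly with Python’s re module; this is a rough standalone equivalent of the command above (not how codemod works internally), with re.DOTALL playing the role of codemod’s -m flag:

    import re

    OLD_LICENSE = re.compile(
        r'# \*\*\*\*\* BEGIN LICENSE.*END LICENSE BLOCK \*\*\*\*\*', re.DOTALL)
    NEW_LICENSE = (
        '# This Source Code Form is subject to the terms of the Mozilla Public\n'
        '# License, v. 2.0. If a copy of the MPL was not distributed with this file,\n'
        '# You can obtain one at http://mozilla.org/MPL/2.0/.')

    def relicense(path):
        # Replace the old tri-license boilerplate with the MPL 2.0 block.
        with open(path) as f:
            source = f.read()
        with open(path, 'w') as f:
            f.write(OLD_LICENSE.sub(NEW_LICENSE, source))

The advantage of codemod over a one-off script like this is that it shows you each proposed change interactively before applying it.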

Playing with pandas

For the last few days I’ve been experimenting with getting a Pandaboard running Android 4.0, continuing the work that Clint Talbert started in the fall to get these boards ready for use as a replacement for the Tegra in Mozilla’s Android automation. The first objective is to get a reproducible build going; after that, we’ll try to get some of our custom tools (SUTAgent & friends) installed by default.

So far this has been… interesting. Much as Clint did before, I thought I’d document some of the notes on what I did in the hopes that they’ll be helpful to other people trying to do similar things.

Getting things up and running is a two-step process: first you build the beast, then you install it onto the board. At least the build part is more or less straightforward. Just follow the instructions here:

Note that you almost certainly want to build in the “eng” configuration, which is rooted and (apparently) has some extra tools installed.

Installing it is a little trickier. The way they want you to do this is to put the pandaboard into a special mode and copy the stuff you built onto an sdcard. Seem a little funny to you? Yeah, it does to me too. Why not just build an sdcard image directly? Nonetheless, this is the officially supported way of imaging a pandaboard, so let’s just follow it until we can think of a better way of doing things. :) The instructions for doing this on the pandaboard are located in the source tree here:

device/ti/panda/README

These are mostly correct as far as I can tell, but there are a few gotchas. First, you need to run the commands mentioned as root unless you’ve configured the USB device to be accessible by your user. Second, most of those commands are not in the path by default, so you’ll need to specify the full path to e.g. the fastboot utility. The instructions here cover these exception cases; I recommend following them instead.

One thing that neither document mentions is that you really need to make sure your sdcard is wiped completely clean before using fastboot. The “oem format” step only recreates the partition table; it doesn’t delete any corrupted partitions. If you reboot while those are still in place, the board will try to bring up your corrupted version of Android, not the fastboot console. I spent quite some time debugging why I couldn’t properly flash the operating system before realizing this. The easiest way to get around this is to dd /dev/zero onto the sdcard before beginning the flashing process.

Also, while not strictly necessary to get something up and running, I highly recommend getting an HDMI monitor as well as a serial<->USB adapter. The former is useful for seeing whether your Android device actually booted up successfully; the latter is useful for debugging boot issues where you don’t get that far (the serial console is always available from boot).

So, after painfully learning about the above caveats, I have managed to get things mostly working. I can see the ICS homescreen on my attached HDMI monitor and interact with it if I attach a USB mouse. The one gotcha is that both ethernet and WIFI networking are totally broken. Plugging in an ethernet cable or connecting to a WIFI network seems to result in the machine randomly rebooting, with the logs saying nothing useful. Both of these things are ostensibly supposed to be working according to the latest I’ve read from Google so I’m not exactly sure what’s going on. Investigations will continue.

Eideticker areas to explore

So I got some nice feedback on my Eideticker post yesterday on various channels. It seems like some people are interested in hacking on the analysis portion, so I thought I’d give some quick pointers and suggestions of things to look at.

  1. As I mentioned yesterday, the frame analysis is rather stupid. We need to come up with a better algorithm for disambiguating input noise (small fluctuations in the HDMI signal?) from actual changes in the page. Unfortunately, the breadth of things that Eideticker is meant to analyze makes this a bit difficult. For example, edge detection probably wouldn’t work for something like Microsoft’s psychedelic browsing demo. I suspect the best route here is to put some work into better understanding the nature of this “noise” and finding a way to filter it out explicitly.
  2. Our analysis code is still rather slow, and is crying out to be parallelized (either across multiple cores of the same CPU or on a GPU). Burak Yiğit Kaya recommended I look into PyCUDA, which looks interesting. It looks like there are other possibilities as well, though.
  3. Clipping captures by green screen/red screen. This should be doable by writing some relatively simple code to detect frames that are mostly green or red and then ignoring previous/current/subsequent frames as appropriate (see the sketch after this list).
  4. Moar test cases! It was initially suggested that we use some of the classic benchmarks, but these only seem to barely work on Fennec (at least with the setup I have). I don’t know if this is fixable or not, but until it is, we might be better off coming up with more reasonable/realistic measures of visual performance.
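For item 3, here is the sort of detection I have in mind (a sketch only, with a guessed dominance threshold; frames are assumed to be numpy RGB arrays):

    import numpy as np

    def dominant_channel(frame, channel, fraction=0.8):
        """True if `channel` (0=R, 1=G, 2=B) accounts for at least `fraction`
        of the frame's total intensity, i.e. a mostly-red or mostly-green screen."""
        totals = frame.reshape(-1, 3).astype(np.float64).sum(axis=0)
        return totals[channel] / totals.sum() >= fraction

    def clip_capture(frames):
        """Drop everything up to and including the green start marker and
        everything from the red end marker onwards."""
        start = 0
        while start < len(frames) and not dominant_channel(frames[start], 1):
            start += 1                      # skip any leading junk
        while start < len(frames) and dominant_channel(frames[start], 1):
            start += 1                      # skip the green marker frames
        end = start
        while end < len(frames) and not dominant_channel(frames[end], 0):
            end += 1                        # stop at the first red frame
        return frames[start:end]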

You might be able to find other inspiration on the Eideticker project page (note that some of this is out of date).

You obviously need the decklink card to perform captures, but the analysis portion of Eideticker can be used/modified on any machine running Linux (Mac should also work, but is untested). To get up and running, just follow the instructions in README.md, dump a pregenerated capture into the captures/ directory (here’s one of a clock), and off you go! The actual analysis code (such as it is) is currently located in src/videocapture/videocapture/capture.py while the web interface is in https://github.com/mozilla/eideticker/blob/master/src/webapp.

I’m going to be out later today (Friday), but I’m mostly around on IRC M-F 9ish-5ish EST on irc.mozilla.org #ateam as `wlach`. Feel free to pester me with questions!

P.S. I didn’t really cover infrastructure/automation portions above as I suspect people will find that less interesting (especially without a video capture card to test with), but you can look at my newsgroup post from yesterday if you want to see what I’ll likely be up to over the next few weeks.

Eideticker update

Since I last blogged about Eideticker, I’ve made some good progress. Here’s some highlights:

  1. Eideticker has a new, much simpler harness, and tests are much easier to write. Initially, I was using Talos for this task, with the idea that it’s better not to have duplicate code where it’s not really required. That seemed like a fine idea in principle, but in practice Talos’s architecture (which is really oriented around running a large sequence of tests and uploading the results to a central server) was difficult to extend to do what we need. At heart, eideticker really only needs to do a few things right now (start up Firefox, start video capture, load a webpage, stop video capture), so it’s best to keep things simple.
  2. I’ve reworked the capture analysis API to use numpy behind the scenes. It’s still not quite as fast as I would like (doing a framediff analysis on a 30-second animation still takes a minute or so on my fast machine), but we’re doing an order of magnitude better than before. numpy also seems to have quite the library of routines for doing the types of matrix algebra useful in image analysis, which should be helpful as the project progresses.
  3. I added the beginnings of a fancy-pants web interface for browsing captures and doing visualizations on them! I’m pretty happy with how this is turning out so far; it’s already been an incredibly useful tool for debugging Eideticker’s analysis system, and I think it will be equally useful for understanding Firefox’s behaviour in general.

Here’s an example analysis session, where I examine a ~60 second capture of the fishtank demo from Microsoft, borrowed from Mark Cote’s speedtest library. You might want to view this fullscreen:

A few interesting things to note about this capture:

1. Our frame comparison algorithm is still comparatively dumb: it just computes the norm of the difference in RGB values between two frames. Since there’s a (very tiny) amount of noise in the capture, we have to use a threshold to determine whether two frames are the same or not (a sketch of this kind of comparison follows this list). For all that, the FPS estimate it comes up with for the fishtank demo seems about right (and unfortunately, at 2 fps, it’s not particularly good).
2. I added a green screen / red screen at the start / end of every capture to eliminate race conditions with starting the capture, but haven’t yet actually taken those frames out of the analysis.
3. If you look carefully at the animation, not all of the fish that should be displayed in the demo are. I think this has to do with the new native version of Fennec that I’m using to test (old versions don’t exhibit this problem). I filed a bug for this.
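For the curious, the comparison described in point 1 boils down to something like this (a simplified sketch, not the exact Eideticker code; the noise threshold is made up):

    import numpy as np

    NOISE_THRESHOLD = 1000.0  # tuned by hand against the capture noise

    def framediffs(frames):
        """Norm of the RGB difference between each frame and its predecessor."""
        return [np.linalg.norm(curr.astype(np.float64) - prev.astype(np.float64))
                for prev, curr in zip(frames, frames[1:])]

    def estimate_fps(frames, capture_fps):
        """Estimate the animation's frame rate: count frames that differ
        'significantly' from their predecessor, divided by capture length."""
        changed = sum(1 for d in framediffs(frames) if d > NOISE_THRESHOLD)
        return changed * capture_fps / float(len(frames))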

What’s next? Well, as I mentioned last time, the real goal is to create a tool that developers will find useful. To that end, we have plans to set up an Eideticker machine in Mozilla’s Mountain View office that more people can use (either locally or remotely over the VPN). For this to be workable, I need to figure out how to get the full setup working on demand. Most of the setup already allows this, with one big exception: the actual Android device that we want to capture video from. The LG G2X that I’m currently using works fine when I have physical access to it, but as far as I can tell it’s not possible to get it outputting proper video of an application unless it’s in an unlocked state (which it obviously isn’t most of the time).

My current thinking is that a Pandaboard running a vanilla version of Android might be a good candidate for a permanently connected device. It is capable of HDMI output, doesn’t have the unwanted bells and whistles of a physical phone (e.g. a lock screen), and should be much more reliable thanks to its wired networking. So far I haven’t had much luck getting its video output working with the Decklink capture card, but I’ve only just started trying. Work will continue.

If I can somehow figure that out, and smooth out some of the rough edges with the web interface and capture API, I think the stage will be set for us all to do some pretty interesting stuff! Looking forward to it.

A better BIXI web site

There’s much to like about the BIXI bike-sharing system in Montréal: it’s affordable ($78 for a year of biking), accessible, and fun to use. There’s absolutely no doubt in my mind that it’s made cycling more of a mainstream activity here in Montreal, which benefits everyone (even drivers indirectly gain from less congested streets).

With the arrival of the first BIXI stations in NDG, I decided to subscribe to it this year even though I have a bike of my own. So far, it looks like I’m going to easily use it enough to justify the cost. I still use my regular bike for my commute from NDG to the Plateau, but on the edges there’s a ton of cases where it just makes sense to use something that I don’t have to worry about locking up and returning home. Sometimes I only want to go one way (for weather or whatever other reason). Other times I want to take public transit for one leg of my trip (or day), but need/want to take a quick jaunt elsewhere once I’m downtown.

I do have to say, though, that their new web site drives me crazy. I’ve thought pretty deeply about the domain of creating user-friendly transit-focused web sites, so I think I can speak with some authority here.

Leaving aside its value as a promotional tool for the service itself (not my area of expertise), the experience of trying to find a nearby station is complicated by a slow, multi-layered UI that requires repeated clicking and searching to find the nearest station that has bikes available. Why bother with this step when we can just display this information outright on the map? iPhone applications like Bixou have been doing this for years. It’s time we brought the same experience to the desktop.

Thus, I present nixi.ca: a clean, usable interface to BIXI’s bike share system that presents the information you care about as effectively as possible, without all the clutter. I’ve already found it useful, and I hope you do too. Think you can do better? Fork the source on github and submit your changes back to me! Minus some glue code to fetch station info server-side, it’s entirely a client-side application written in HTML/JavaScript.

Note that the site uses a bunch of modern HTML5 features, so currently requires a modern browser like Firefox, Chrome, or Safari to display properly. I may or may not fix this. Other notable omissions include support for other cities with the BIXI system (Toronto, Ottawa, …) and French localization. Patches welcome!

Back to WordPerfect: libwpd 0.9.0

Those who’ve known me for a while have probably heard about my first major open source project, libwpd. In a nutshell, it’s a parser for WordPerfect documents, with the primary aim of converting them into something usable by the major open source office programs out there. It’s used by LibreOffice, OpenOffice.org, AbiWord, and KOffice. WordPerfect isn’t the most popular word processor out there, but there are still quite a number of legacy documents in that format, especially in the legal community (which was almost exclusively using WordPerfect until very recently).

This project goes way back: I started work on it with Marc Maurer way back in 2002 (just after I graduated from University). I put a rather ridiculous amount of unpaid work into it for a few years. WordPerfect’s streaming document format is a bit esoteric to say the least, and figuring out how to map into the document model used by more modern software was a pretty interesting problem. I still remember spending sleepless nights trying to reliably convert WordPerfect’s outlining into structured lists (I mostly succeeded).

Since then, I’ve mostly moved on to other things, leaving the project in the capable hands of Fridrich Strba, who’s been steadily working on adding a number of important features to the library that massively improve import fidelity. I did have time this summer to add page numbering support (thanks to Yam Software for sponsoring that work) and move the project over to git from cvs, but for the most part it’s been his show since late 2004.

Even if I’m not as actively involved as I once was, when there are major developments I still get excited (perhaps in the way that a parent might about a child who’s left the household). And yesterday brought something pretty big: libwpd 0.9.0. With this release, we finally support graphics (thanks to the work of Fridrich and Ariya Hidayat on libwpg), notes, the page numbering that I mentioned above, and encrypted documents. It’s a big deal. Here are some before and after screenshots:


[Before/after screenshots comparing libwpd 0.8 and 0.9 output: graphics, notes, and page numbering]

All this goodness should be available transparently whenever you import a WordPerfect file in an upcoming release of LibreOffice. AbiWord and KOffice filters should come soon enough as well (the updates needed to support libwpd 0.9 are fairly minimal).

Integration with OpenOffice.org is another story. Without going into a great amount of detail on the situation (see this article on Ars Technica for the gory details if you’re really interested), it’s quite unlikely that OpenOffice.org’s WordPerfect support will advance unless (1) someone volunteers to do it and (2) Oracle drops their copyright assignment policy. The chances of these things happening seem rather low to me. My personal recommendation would be to switch to LibreOffice as soon as the first production version is released. I expect it to rapidly overtake OpenOffice.org in functionality due to its more open participation model.

Oh, the lines! And some help for Edmonton.

Map with new link nodes

A productive day on the transit development front. Finished up a few big features related to hbus.ca and Transit to Go:

  • Sped up the graph and database generation by an order of magnitude. Not too exciting from a user perspective, but I should now be able to iterate on the codebase much faster.
  • Better transit stop / street graph linking: no longer does libroutez simply try to find the closest street-level vertex to link to when merging transit stops with street information – we now actually create _new_ street-level vertices as needed and link to those. Upshot? Slightly better directions and prettier polylines. When I first thought up the algorithm a month ago, I thought I was totally brilliant, only to later find out that Andrew Byrd had done something almost identical a few months earlier for graphserver. Ah well, at least it’s implemented.
  • I coded up a script to automatically generate synthetic headsigns for GTFS feeds that don’t have them (a sketch of one possible approach follows this list). This was needed to provide a sensible view for the Edmonton version of Transit to Go. All the props in the world for opening up your data, guys, but can’t you do better than saying that all your buses go in the “1” direction? There’s a reason why it’s a required field, you know. Not only would it help me, but Google Transit would give better results for your city as well.
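For the curious, synthetic headsigns can be derived from standard GTFS data. A sketch of one possible approach (my own illustration, not the actual script) is to use the name of each trip’s final stop as a stand-in for the missing trip_headsign:

    import csv

    def synthesize_headsigns(gtfs_dir):
        """Map trip_id -> synthetic headsign ("To <name of the trip's last stop>")."""
        stop_names = {}
        with open(gtfs_dir + '/stops.txt') as f:
            for row in csv.DictReader(f):
                stop_names[row['stop_id']] = row['stop_name']

        last_stop = {}  # trip_id -> (highest stop_sequence seen, its stop_id)
        with open(gtfs_dir + '/stop_times.txt') as f:
            for row in csv.DictReader(f):
                seq = int(row['stop_sequence'])
                if row['trip_id'] not in last_stop or seq > last_stop[row['trip_id']][0]:
                    last_stop[row['trip_id']] = (seq, row['stop_id'])

        return dict((trip_id, 'To ' + stop_names[stop_id])
                    for trip_id, (_, stop_id) in last_stop.items())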