Category Archives: Technical Entries

Oh, the lines! And some help for Edmonton.

Map with new link nodes

A productive day on the transit development front. Finished up a few big features related to hbus.ca and Transit to Go:

  • Sped up the graph and database generation by an order of magnitude. Not too exciting from a user perspective, but I should now be able to iterate the codebase much faster.
  • Better transit stop / street graph linking: No more does libroutez simply try to find the closest street-level vertex to link to when merging transit stops with street information: we now actually create _new_ street-level vertices as needed and link to those. Upshot? Slightly better directions and prettier polylines. When I first thought up the algorithm a month ago, I thought I was totally brilliant, only to later find out that Andrew Byrd had done something almost identical a few months earlier for graphserver. Ah well, at least it’s implemented.
  • I coded up a script to automatically generate synthetic headsigns for GTFS feeds which don’t have them. This was needed to provide a sensible view for the Edmonton version of Transit to Go. All the props in the world for opening up your data, guys, but can’t you do better than saying that all your buses go in the “1” direction? There’s a reason why it’s a required field, you know. Not only would it help me, but Google Transit would give better results for your city as well.
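
For the curious, the stop/street linking idea from the second bullet can be sketched roughly as follows. This is not the libroutez code, just a minimal illustration under some loud assumptions: coordinates are treated as planar, the graph is a plain dict of vertices plus a set of edges, and the names `project_onto_segment` and `link_stop` are made up for this example.

```python
import math


def project_onto_segment(p, a, b):
    """Return (point, t): the point on segment a-b closest to p, and its
    fractional position t in [0, 1] along the segment."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a, 0.0  # degenerate edge: both ends coincide
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp so the point stays on the segment
    return (ax + t * dx, ay + t * dy), t


def link_stop(stop, vertices, edges):
    """Link a stop to the street graph and return the vertex id it links to.

    vertices: dict of id -> (x, y); edges: set of (id_a, id_b) tuples.
    If the closest point falls inside an edge, that edge is split in two
    and a new vertex is created at the split point.
    """
    best = None
    for (u, v) in edges:
        point, t = project_onto_segment(stop, vertices[u], vertices[v])
        dist = math.hypot(point[0] - stop[0], point[1] - stop[1])
        if best is None or dist < best[0]:
            best = (dist, u, v, point, t)
    _, u, v, point, t = best
    if t == 0.0:
        return u  # closest point is an existing vertex
    if t == 1.0:
        return v
    new_id = "link-%d" % len(vertices)  # hypothetical naming scheme
    vertices[new_id] = point
    edges.remove((u, v))
    edges.add((u, new_id))
    edges.add((new_id, v))
    return new_id
```

A real implementation would want a spatial index rather than a scan over every edge, and would have to carry edge lengths and other attributes across the split.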

Nexus One

Oh glorious day, the Nexus One is now available for purchase in Canada!

I’ve been feeling less and less enthusiastic about the iPhone lately, in particular after the ridiculous lawsuit against HTC. It’s no secret that we at Navarra have been doing quite a bit of work on that platform, as have my associates at Mindsea. As long as there’s demand from our clients, that won’t change, but as an individual I’m feeling less and less enthusiastic about supporting a company that (through its actions) demonstrates hostility to the ideals of autonomy and innovation that I hold dear. Now that an attractive alternative is available on reasonable terms, I’m seriously considering switching horses in the not-too-distant future.

Adventures in processing with prender

First, I’m overdue in announcing Transit to Go a.k.a. “the iPhone transit map that’s demonstrably more useful than a paper schedule” a.k.a. “your bus departure in 15 seconds or less, no matter where you are”. I wrote up a blog post about it for Mindsea’s site, if you’re interested in finding out more.

Second, all this transit excitement has made me start thinking about better routing and geometry algorithms again. I’ve been experimenting a bit with Brandon Martin-Anderson’s prender framework, used by the infamous Graphserver, and have been pretty happy with the results. It basically lets you do Processing visualizations in python (i.e. no Java coding required). Here’s a quick picture of it in action, rendering the Nova Scotian road network, as distributed by GeoBase.

Nova Scotia as rendered by prender

The neat thing about this framework is that you can render quickly to an arbitrary level of detail, which should prove very useful when troubleshooting the behavior of some of the code I’m working on. If anyone is interested in running the framework on MacOS X (like I was), my fork of the project has the appropriate patches.

It’s alive!


>>> import neocoder
>>> g = neocoder.GeoCoder("greater-hrm2.db")
>>> g.get_latlng("14 Johnson, North Preston")
SQL: select firstHouseNumber, lastHouseNumber, length, coords from road where name like 'Johnson' and firstHouseNumber <= '14' and lastHouseNumber >= '14' and placeName like 'North Preston' limit 1
(44.73895263671875, -63.464725494384766)

Projects for 2010

Hurrah, 2010 is upon us!

One New Year’s resolution I have set for myself is to blog more about what I’m working on. I’ve learned over the last year that the audience of people who care about your projects in development is vanishingly small. Thus, there is little need for secrecy in order to make a “PR splash”: by all means announce far and wide when you have something that people can use, but don’t worry too much about talking about what you’re working on with the internets.

In this spirit, some projects I’m 99% certain I’ll be releasing publicly in 2010:

  • neocoder: A lightweight geocoding library, with wrapper libraries for your language of choice. Written in C++ using SQLite and boost regular expressions. Will support both OpenStreetMap and GeoBase GML as input. Currently in development on github, hoping to release with routez (as its geocoding component).
  • routez: A generic travel planning web service, written in python using the django framework and the libroutez libraries. This is basically the software behind hbus.ca… the goal for 2010 is to clean it up and make it generic by clearing out the Halifax-specific stuff (mostly just the geocoding and site theme stuff at this point), then release it to the public under the Affero GPL License (I was originally going to go with the GPL, but Simon Law convinced me otherwise… more on that in a future post).
  • Transit To Go: A dedicated iPhone client for the routez software, developed in collaboration with Dmitri Dolguikh and Bill Wilson, two talented developers from Halifax. Has some innovative (in my opinion, anyway) details on how things will be viewed. This one’s going to be proprietary, but will be affordable and awesome.

Besides this, I have a few more irons in the fire… however, I’m hesitant to talk about them just yet. Just getting the above done in the midst of my work with Navarra (to say nothing of having a life in there somewhere) is going to be challenging.

Thoughts? I would be particularly interested in hearing from people working on similar projects to neocoder and routez. Despite how it may sometimes appear, I don’t have a NotInventedHere mentality: I’ve done as exhaustive a survey of the field as I could before deciding to work on my own projects, and what I’ve found just hasn’t been the right fit for what I’m trying to accomplish. However, the world’s gotten so damn big that I’m not sure if I’m missing something…

iPhone hackathon 4 charity: Halifax edition

I think it’s fair to say that Halifax’s first iPhone hackathon for charity was a big success. The idea was pretty simple: get a group of people (developers, marketers, artists) together and try to produce as many iPhone apps as possible over the course of a weekend. Sell the apps on the app store (or otherwise monetize them), then donate the proceeds to charity.

I think we managed to get a group of about 15 together. After the weekend was over, we had three apps in various stages of completion. They are:

  • PostCard: Send post cards, with a local twist.
  • Meet me here: A streamlined way to tell your friends where you are.
  • Civic Snitch: Report on problems in your neighbourhood using your phone. A front end to the amazing fixmystreet.ca (this is the one I worked on).

As usual for a hackfest, the energy level was amazing. In addition to seeing the familiar faces of MindSea, Applied Logic, Hand Puppet and Say Hi There, it was fantastic to meet the new faces at North Knight and an amazing group of unaffiliated (yet crazy competent) developers. A weekend is a bit too short a time to do anything but a trivial iPhone application, but we got a good start on all of them. Rumor has it that the postcard application is quite close to completion. Another few hacking sessions and we should have some apps that are good for release.

It’s hard to do justice to the overwhelming feeling of WIN that came out of this. Since co-founding Navarra a year ago, I’ve been at a ton of conferences, hack weekends, and other networking events and this has by far been the one I’ve felt the best about. What made it so great?

  • First and foremost, the feeling that the work you’re doing will be used for good.
  • The opportunity to take part in something untested and different. In these difficult times, charities are looking for new ways to fill gaps in fundraising– can software developers help?
  • The Halifax Hub’s open concept space which did so much to facilitate collaboration and communication (as it always does).
  • The amazing catered food from Local Source Organic (Splice Training also provided some tasty home-baked cookies).
  • The awesome high-quality t-shirts, featuring an amazing design by Nick Brunt (printing courtesy of Mindsea).
  • The free massages from Be Wellness.
  • DJ Rich.Ness spinning tunes for us to enjoy all of Saturday night.

So what’s next? Well, that’s something we’re working out with a lawyer. :) The idea is to create some kind of legal structure that allows us to safely collect any app store proceeds and get them sent to charity, though we haven’t yet finalized what that will look like. The hope is that we can create a model that can be reused in other cities (iHackMTL anyone?).

Likewise, the final decision on which local charities will be receiving the proceeds has not yet been made. Something like ten organizations submitted proposals before the hackfest. It’s great to see so much interest, but it’s obviously not possible to accommodate everyone this round. It’s fair to say that at least one app will be going directly to an organization which helps in some way to address poverty in the HRM. I think there’s a collective understanding among the participants at the hackfest that we’ve been quite blessed by circumstance and good fortune and that there’s a responsibility to help those who haven’t been so lucky.

As for the apps themselves, the plan is to put the source up on github ASAP under the MIT License. I’ll be sure to post an announcement when this happens (though this is of course only of interest to the hardcore geeks).

Thanks again to the participants and the sponsors (The Hub, Local Source Organic, Be Wellness, Splice Training, Say Hi There!, Mindsea, innovacorp, Nova Scotia Rural BroadBand and Development, Humina Huminah) for the amazing weekend. Most especially, Dale Zak, the event organizer (and happy hacker) deserves huge kudos for the amazing idea and the perseverance to make it happen.

Creating a google transit feed for fun and profit

People frequently ask me how I manage to collect and input the data that is used by hbus.ca, given Metro Transit’s intransigence. The “bike and GPS” angle is well known by now, but what about the rest of the process? How do I get the data into a format that hbus.ca can consume?

The de facto standard for the interchange of transit information is the Google Transit Feed Specification (GTFS). This exceedingly simple comma-separated value format is now supported by a plethora of software, including Google Transit, graphserver, and my very own libroutez (used by hbus.ca). It was obvious to me right from the beginning that the first step to building hbus.ca would be to create one of these feeds.

Manipulating a GTFS feed by hand is probably not a great idea. It’s basically a dump of a relational database, and is pretty inscrutable from the point of view of a human being. What I really want is to be able to manipulate things on the level of stops, service periods, and routes, and let some kind of abstraction layer take care of the low-level details. Fortunately, the awesome engineers at google created a python library called Google Transit Data Feed, which can help with creating one of these things by providing abstractions of the key elements of a google transit feed (stops, service periods, etc.). You can then write a program which uses these abstractions to create and save a GTFS feed.

Of course, providing the library appropriate information is easier said than done. Metro Transit’s PDF schedules are not readily computer parsable (being designed to be printed out, after all). I needed some kind of semi-automated way of converting a Metro Transit schedule into GTFS, or this whole project was going nowhere fast.

As an initial step, it turns out that it’s quite possible to extract textual information from a PDF using the open source poppler library. From there, it’s possible to extract the stopping times for an individual bus route. For example, let’s take the case of adding the 60 (Portland Hills route), something I’m currently working on. All I had to do was download the PDF file from Metro Transit’s site and then run the following on the command line:

pdftotext -raw route60.pdf

The -raw option basically makes sure the text is dumped to disk in the order it appears in the PDF’s content stream, and that no attempt is made to preserve formatting. The result is a text file with content like this in it:

842a 847a 855a 858a 903a 906a 912a -
857a 902a 910a 913a 918a 921a - 925a
910a 915a 923a 926a 931a 934a 940a -
940a 945a 953a - 1000a 1003a 1009a -
...and every 30 minutes until
210p 215p 223p - 230p 233p 239p -

This type of format can be parsed easily enough. To create a proper transit feed though, schedule information isn’t enough: you also need to know the locations of the stops, names of routes, etc. After some deliberation, I came to the determination that I needed some kind of intermediate format to store the above schedule information and this additional information. It would be readable both by humans (to ease its creation) and machines.
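
To give a flavour of what “parsed easily enough” means, here is a minimal sketch of a parser for rows like the ones above. It is an illustration rather than the code used for hbus.ca; the function names `parse_time` and `parse_schedule` are made up for this example, and it assumes every time token looks like `842a` or `210p`, with `-` marking a timepoint the bus skips.

```python
import re

# One or two hour digits, two minute digits, then 'a' or 'p'.
TIME_RE = re.compile(r"^(\d{1,2})(\d{2})([ap])$")


def parse_time(token):
    """Convert '842a' or '210p' to 'HH:MM:SS'; '-' (timepoint skipped) gives None."""
    if token == "-":
        return None
    match = TIME_RE.match(token)
    if match is None:
        raise ValueError("unrecognised time token: %r" % token)
    hour, minute, meridiem = int(match.group(1)), int(match.group(2)), match.group(3)
    if meridiem == "a" and hour == 12:
        hour = 0  # 12:xx am is just after midnight
    elif meridiem == "p" and hour != 12:
        hour += 12
    return "%02d:%02d:00" % (hour, minute)


def parse_schedule(text):
    """Parse each row into a list of times; rows that are not times
    (e.g. '...and every 30 minutes until') are kept as plain strings."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        try:
            rows.append([parse_time(token) for token in tokens])
        except ValueError:
            rows.append(line.strip())
    return rows
```

Expanding the “every 30 minutes until” rows into actual trips, and handling trips that run past midnight, are left out of this sketch.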

The obvious markup for something like this is YAML (if you’re still using XML to store structured information, run, don’t walk, and look at YAML: you can thank me later). Simple, clean, effective. GTFS is still the better choice for using the information in another application as its representation is much more amenable to being stored in a graph. Here are a few examples of my YAML format in action:

7 (Robie to Gottingen)
10 (Westphal)

Besides the scheduling information, the other main interesting component of a GTFS is the location of the stops. As anyone who’s used a Metro Transit schedule has noticed, only major timepoints are covered in the PDF schedules. What of all the stops in between? This is where the bike and GPS come in.

What I did was take a standard GPS from Mountain Equipment Co-op (the Garmin GPSMap 60x), get on my bike, and record the gotime number and position of each individual stop between the major timepoints. I then took this device back to my computer and, using a utility called GPSBabel, dumped out the stop information in comma-separated value format. It looks like this:

44.65825, -63.59252, 6785-21-31-33-34-35-3-7
44.65982, -63.59452, 6768-21-31-33-35-86-3-7
44.66113, -63.59659, 6782-21-31-33-34-35-3-7

The first two items are latitude and longitude, providing the positioning of the stop. The last item is a gotime number, followed by the set of buses which pass by the stop. Turning this into YAML is a matter of applying the following regular expression to the input:

\([0-9]+.[0-9]+\), \(-63.[0-9]+\), \([0-9]+\)- -> - { name: xxx, stop_code: \3, lat: \1, lng: \2 }
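
For anyone who would rather not do this in their editor, roughly the same substitution can be sketched in python. This is an equivalent written for illustration, not part of the original tool chain: the function names are made up, the pattern is slightly stricter than the one above (the dots are escaped and any longitude is accepted), and the `name: xxx` placeholder is kept as-is to be filled in later.

```python
import re

# latitude, longitude, gotime number, then zero or more "-route" suffixes
LINE_RE = re.compile(r"^\s*(-?\d+\.\d+),\s*(-?\d+\.\d+),\s*(\d+)((?:-\w+)*)\s*$")


def parse_waypoint(line):
    """Split one GPSBabel CSV line into its parts, all kept as strings."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised waypoint line: %r" % line)
    lat, lng, stop_code, routes = match.groups()
    return {
        "lat": lat,
        "lng": lng,
        "stop_code": stop_code,
        "routes": routes.strip("-").split("-") if routes else [],
    }


def waypoint_to_yaml(line):
    """Produce the same YAML entry as the regular expression above."""
    stop = parse_waypoint(line)
    return "- { name: xxx, stop_code: %s, lat: %s, lng: %s }" % (
        stop["stop_code"], stop["lat"], stop["lng"])


print(waypoint_to_yaml("44.65825, -63.59252, 6785-21-31-33-34-35-3-7"))
```

Unlike the search-and-replace, this version also pulls the trailing route numbers into a list instead of leaving them on the line.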

To get an actual name for the stop (e.g. “Gottingen and Young”), I wrote a simple script which finds the intersection nearest to the stop in the GeoBase dataset. I then (at my discretion) corrected it based on my on-the-street knowledge of the layout of Halifax, and added certain details to help the user (e.g. bus stops on the way to the south end of Halifax are marked “south bound”).
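
The nearest-intersection lookup is simple enough to sketch. The version below is an illustration only: it assumes the intersections have already been extracted from GeoBase into `(name, lat, lng)` tuples (that extraction is not shown), it does a brute-force scan, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_intersection(lat, lng, intersections):
    """Return the name of the intersection closest to (lat, lng).

    intersections: iterable of (name, lat, lng) tuples.
    """
    closest = min(intersections, key=lambda i: haversine_m(lat, lng, i[1], i[2]))
    return closest[0]


# Made-up intersections, purely for illustration.
example = [
    ("First and Main", 44.6583, -63.5926),
    ("Second and Main", 44.6540, -63.6000),
]
print(nearest_intersection(44.65825, -63.59252, example))
```

With a few thousand intersections and a few hundred stops a linear scan like this is plenty fast; anything bigger would want a spatial index.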

With these two elements in place (a format for creating human-readable transit information and a library for creating GTFS), the only thing left to do is create a program which bridges the gap. Behold, the magic of createfeed.py. With all of this in place, creating a google transit feed for Halifax is a simple matter of typing “make”.
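
To make the “bridging” step a bit more concrete, here is a deliberately tiny sketch of one piece of it: turning stop entries (as they might look once the YAML has been loaded into python dicts) into a GTFS `stops.txt` table. It is not createfeed.py and does not use Google’s library; it uses only the standard `csv` module, the function name is made up, and reusing the gotime number as the `stop_id` is an assumption made for this example.

```python
import csv
import io


def write_stops_txt(stops, out):
    """Write stops as a GTFS stops.txt table to the file-like object `out`.

    stops: list of dicts with 'name', 'stop_code', 'lat' and 'lng' keys.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["stop_id", "stop_code", "stop_name", "stop_lat", "stop_lon"])
    for stop in stops:
        writer.writerow([
            stop["stop_code"],  # assumption: gotime number doubles as stop_id
            stop["stop_code"],
            stop["name"],
            stop["lat"],
            stop["lng"],
        ])


# Placeholder name, as in the YAML entries before they are named.
buf = io.StringIO()
write_stops_txt([{"name": "xxx", "stop_code": 6785, "lat": 44.65825, "lng": -63.59252}], buf)
print(buf.getvalue())
```

A complete feed also needs routes, trips, stop times and a calendar, which is exactly the bookkeeping that a dedicated library is good at.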

Is this a ridiculous amount of work? I wouldn’t say so. The vast, vast majority of my work on hbus.ca has been in creating the pathfinding code and geocoding functionality. This is work that can be translated to many different municipalities, and can easily be extended and made more useful in a myriad of ways.

What does seem a little intimidating to me is completing what I started. Capturing bus stop information for the Halifax peninsula is one thing, but covering the outlying areas (Bayer’s Lake, Sackville, etc.) is quite another. There’s a lot of biking involved there, more perhaps than what one person can reasonably be expected to do. It was my hope that the initial release of hbus would validate the model of community-developed transit software to Metro Transit and they would see the benefit of releasing their internal copy of this data to the public, but unfortunately that doesn’t seem to have happened.

Getting that problem solved seems to be more a political problem than a technical one, and it’s not my specialty. It really does make me wonder if I shouldn’t reconsider the option of crowd sourcing, which I had rejected earlier.

Maps URLs on mobile Safari

I’ve been experimenting a little bit with maps URLs on the iPhone. If you’ve read Apple’s web developer guidelines, you’ll know that URLs of this form will automatically redirect to the maps application:

Halifax, Nova Scotia
<a href="http://maps.google.com/maps?geocode=&q=Halifax,Nova Scotia">Halifax, Nova Scotia</a>
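
Note that the query string above contains a raw space, which strictly speaking should be encoded. If you are generating these links from a script, a small helper can take care of that. The sketch below uses only the python standard library; the function names are made up, and I have not checked how the maps application treats every encoded character, so treat it as a starting point.

```python
from html import escape
from urllib.parse import urlencode


def maps_href(query, base="http://maps.google.com/maps"):
    """Build a maps URL with the query string properly percent-encoded."""
    return base + "?" + urlencode([("geocode", ""), ("q", query)])


def maps_anchor(query, label):
    """Wrap the URL in an HTML anchor, escaping '&' and quotes for the attribute."""
    return '<a href="%s">%s</a>' % (escape(maps_href(query), quote=True), escape(label))


print(maps_anchor("Halifax, Nova Scotia", "Halifax, Nova Scotia"))
```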

This is fine if you just want to highlight one particular location (with no custom metadata), but what if you want to do something more interesting, like display a KML file? You can load these easily from the maps application, so why can’t you link to them from a web browser? The URL guidelines explicitly say that the KML part of a query string will be discarded, and indeed it is. What is a web developer to do? Resort to undocumented behaviour, of course! At least in version 2.2 of the iPhone software, URLs which request a “maps” resource with the appropriate parameters will automatically load the appropriate KML file in the maps application:

Map link
<a href="maps://?geocode=&q=http://code.google.com/apis/kml/documentation/KML_Samples.kml">Map link</a>