Those who’ve known me for a while have probably heard about my first major open source project, libwpd. In a nutshell, it’s a parser for WordPerfect documents with the primary aim of converting them into something usable by the major open source office programs out there. It’s used by LibreOffice, OpenOffice.org, AbiWord, and KOffice. WordPerfect isn’t the most popular word processor out there, but there are still quite a number of legacy documents in that format, especially in the legal community (which was almost exclusively using WordPerfect until very recently).
This project goes way back: I started work on it with Marc Maurer in 2002 (just after I graduated from university). I put a rather ridiculous amount of unpaid work into it for a few years. WordPerfect’s streaming document format is a bit esoteric to say the least, and figuring out how to map it into the document model used by more modern software was a pretty interesting problem. I still remember spending sleepless nights trying to reliably convert WordPerfect’s outlining into structured lists (I mostly succeeded).
Since then, I’ve mostly moved on to other things, leaving the project in the capable hands of Fridrich Strba, who’s been steadily working on adding a number of important features to the library that massively improve import fidelity. I did have time this summer to add page numbering support (thanks to Yam Software for sponsoring that work) and move the project over to git from cvs, but for the most part it’s been his show since late 2004.
Even if I’m not as actively involved as I once was, when there are major developments, I still get excited (perhaps in the way that a parent might about a child who’s left the household). And yesterday brought something pretty big: libwpd 0.9.0. With this release, we finally support graphics (thanks to the work of Fridrich and Ariya Hidayat on libwpg), notes, the page numbering that I mentioned above, and encrypted documents. It’s a big deal. Here are some before and after screenshots:
All this goodness should be available transparently whenever you import a WordPerfect file in an upcoming release of LibreOffice. AbiWord and KOffice filters should come soon enough as well (the updates needed to support libwpd 0.9 are fairly minimal).
Integration with OpenOffice.org is another story. Without going into a great amount of detail on the situation (see this article on Ars Technica for the gory details if you’re really interested), it’s quite unlikely that OpenOffice.org WordPerfect support will advance unless (1) someone volunteers to do it and (2) Oracle drops their copyright assignment policy. The chances of these things happening seem rather low to me. My personal recommendation would be to switch to LibreOffice as soon as the first production version is released. I expect it to rapidly overtake OpenOffice.org in functionality due to its more open participation model.
Due to some schedule adjustments, it appears as if Metro Transit’s Google Transit is temporarily out of service while they wait for Google to process their new transit feed.
However, fear not! Unlike Google, obsessive-compulsive computer programmer William Lachance works on weekends for free (or the low, low price of $1.99 in the case of iPhone applications) so your thirst for updated transit information can be satisfied. I link to these things often enough on this blog, but here they are again, for the benefit of first-time viewers:
- Transit to Go

In other news, it appears as if Metro Transit has finally seen the light and made their transit information available to anyone who wants it. I would like to think that I might have had something to do with this, but I honestly have no idea. Anyway, if you want to check it out, it’s available here.
After a month-long hiatus, I finally had a chance to get hbus.ca back into a working state. The big difference is that we’re now using the official HRM transit data. Lots more remains to be done to make this site competitive for 2010 (most of the recent work has been back-end infrastructure stuff), but at least it’s usable again.
With yesterday’s work out of the way, there weren’t too many extra steps required before I got a basic version of Transit to Go running for Edmonton. There are definitely bugs and rough edges (the bus names could be better described/formatted, and there are some serious geocoder issues), but I think the heavy lifting is out of the way. I guess now would be a good time to open up an invitation to anyone in Edmonton with an iPhone to become part of our free private beta. We’d love to hear what you think! Just send an email to email@example.com.
A productive day on the transit development front. Finished up a few big features related to hbus.ca and Transit to Go:
- Sped up the graph and database generation by an order of magnitude. Not too exciting from a user perspective, but I should now be able to iterate the codebase much faster.
- Better transit stop / street graph linking: No more does libroutez simply try to find the closest street level vertex to link to when merging transit stops with street information. We now actually create _new_ street level vertices as needed and link to those. Upshot? Slightly better directions and prettier polylines. When I first thought up the algorithm a month ago, I thought I was totally brilliant, only to later find out that Andrew Byrd had done something almost identical a few months earlier for graphserver. Ah well, at least it’s implemented.
- I coded up a script to automatically generate synthetic headsigns for GTFS feeds which don’t have them. This was needed to provide a sensible view for the Edmonton version of Transit to Go. All the props in the world for opening up your data, guys, but can’t you do better than saying that all your buses go in the “1” direction? There’s a reason why it’s a required field, you know. Not only would it help me, but Google Transit would give better results for your city as well.
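For the curious, the stop / street linking idea from the second item can be sketched in a few lines of Python. This is only an illustration and not the actual libroutez code: the function names, the dict-and-set graph representation and the `split-N` vertex ids are all made up for the example, and it assumes planar (already projected) coordinates rather than raw latitude/longitude.

```python
import math


def project_onto_segment(p, a, b):
    """Return (closest point to p on segment a-b, fraction t in [0, 1])."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both ends are the same point.
        return a, 0.0
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp so we stay on the segment
    return (ax + t * dx, ay + t * dy), t


def link_stop(vertices, edges, stop):
    """Return the id of the street vertex a stop should link to.

    vertices: dict mapping vertex id -> (x, y)    (hypothetical layout)
    edges:    set of (from_id, to_id) tuples      (hypothetical layout)

    If the closest point on the street network falls in the middle of an
    edge, that edge is split and a new vertex is created there.
    """
    best = None
    for u, v in edges:
        point, t = project_onto_segment(stop, vertices[u], vertices[v])
        dist = math.hypot(point[0] - stop[0], point[1] - stop[1])
        if best is None or dist < best[0]:
            best = (dist, u, v, point, t)
    if best is None:
        raise ValueError("street graph has no edges")

    _, u, v, point, t = best
    if t == 0.0:
        return u  # closest point is an existing vertex, reuse it
    if t == 1.0:
        return v

    # Closest point is mid-edge: create a new vertex and split the edge.
    new_id = "split-%d" % len(vertices)
    vertices[new_id] = point
    edges.remove((u, v))
    edges.add((u, new_id))
    edges.add((new_id, v))
    return new_id
```

A real implementation would want a spatial index instead of scanning every edge, and would need to carry edge attributes (street name, length) over to the two halves.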
Because I don’t have enough spare time projects (this is a joke), I decided to take on the task of adding a much-requested special iPhone/iPod-touch friendly view for the WifiDog authentication server (used by the infamous Île Sans Fil) after being inspired by the WifiDog camp held a few weeks ago at Station C. I finally finished up a prototype today. It’s a bit of a hack, but a relatively clean one. Hopefully some version of it can be integrated into trunk, and users with mobile devices will have a better experience when they’re on the go in Montréal (or any other area with a community-oriented wifidog deployment: I hear there are lots).
For those interested in grubby details, you can track the progress of this work on the WifiDog ticket tracker.
Oh glorious day, the Nexus One is now available for purchase in Canada!
I’ve been feeling less and less enthusiastic about the iPhone lately, in particular after the ridiculous lawsuit against HTC. It’s no secret that we at Navarra have been doing quite a bit of work on that platform, as have my associates at Mindsea. As long as there’s demand from our clients, that won’t change, but as an individual I’m feeling less and less enthusiastic about supporting a company that (through its actions) demonstrates hostility to the ideals of autonomy and innovation that I hold dear. Now that an attractive alternative is available on reasonable terms, I’m seriously considering switching horses in the not-too-distant future.
First, I’m overdue in announcing Transit to Go a.k.a. “the iPhone transit map that’s demonstrably more useful than a paper schedule” a.k.a. “your bus departure in 15 seconds or less, no matter where you are”. I wrote up a blog post about it for Mindsea’s site, if you’re interested in finding out more.
Second, all this transit excitement has made me start thinking about better routing and geometry algorithms again. I’ve been experimenting a bit with Brandon Martin-Anderson’s prender framework, used by the infamous Graphserver, and have been pretty happy with the results. It basically lets you do Processing visualizations in Python (i.e. no Java coding required). Here’s a quick picture of it in action, rendering the Nova Scotian road network, as distributed by GeoBase.
The neat thing about this framework is that you can render quickly to an arbitrary level of detail, which should prove very useful when troubleshooting the behavior of some of the code I’m working on. If anyone is interested in running the framework on Mac OS X (like I was), my fork of the project has the appropriate patches.
>>> import neocoder
>>> g = neocoder.GeoCoder("greater-hrm2.db")
>>> g.get_latlng("14 Johnson, North Preston")
SQL: select firstHouseNumber, lastHouseNumber, length, coords from road where name like 'Johnson' and firstHouseNumber <= '14' and lastHouseNumber >= '14' and placeName like 'North Preston' limit 1
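The query above pulls back a house number range, a length and a set of coordinates for a single road segment. One common way to turn a row like that into a position is to interpolate linearly along the segment; here is a rough sketch of that idea. To be clear, this is an assumption about the general technique and not a copy of what neocoder does internally: `interpolate_address` is a hypothetical helper, the coordinates are assumed to be a non-empty list of (x, y) pairs, and distances are treated as planar.

```python
import math


def interpolate_address(house_number, first_number, last_number, coords):
    """Estimate a position for house_number along a road polyline.

    coords is assumed to be a non-empty list of (x, y) pairs running from
    the end of the road with first_number to the end with last_number.
    """
    if first_number == last_number:
        fraction = 0.0
    else:
        fraction = (house_number - first_number) / float(last_number - first_number)
    fraction = max(0.0, min(1.0, fraction))  # stay on the segment

    pairs = list(zip(coords, coords[1:]))
    seg_lengths = [math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in pairs]
    target = fraction * sum(seg_lengths)  # distance to travel along the line

    for (a, b), seg_len in zip(pairs, seg_lengths):
        if seg_len > 0 and target <= seg_len:
            t = target / seg_len
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= seg_len
    return coords[-1]
```

For example, with a road running from (0, 0) to (0, 10) to (10, 10) and numbered 1 through 21, `interpolate_address(16, 1, 21, [(0, 0), (0, 10), (10, 10)])` lands three quarters of the way along the polyline. A real geocoder would also have to deal with odd/even sides of the street and with lat/lng distances.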
Hurrah, 2010 is upon us!
One New Year’s resolution I have set for myself is to blog more about what I’m working on. I’ve learned over the last year that the audience of people who care about your projects in development is vanishingly small. Thus, the need for secrecy in order to make a “PR splash” is rather small. Announce far and wide when you have something that people can use by all means, but don’t worry too much about talking about what you’re working on with the internets.
In this spirit, some projects I’m 99% certain I’ll be releasing publicly in 2010:
- neocoder: A lightweight geocoding library, with wrapper libraries for your language of choice. Written in C++ using SQLite and Boost regular expressions. Will support both OpenStreetMap and GeoBase GML as input. Currently in development on GitHub, hoping to release with routez (as its geocoding component).
- routez: A generic travel planning web service, written in Python using the Django framework and the libroutez libraries. This is basically the software behind hbus.ca… the goal for 2010 is to clean it up and make it generic by clearing out the Halifax-specific stuff (mostly just the geocoding and site theme stuff at this point), then release it to the public under the Affero GPL License (was originally going to go with the GPL, but Simon Law convinced me otherwise… more on that in a future post).
- Transit To Go: A dedicated iPhone client for the routez software, developed in collaboration with Dmitri Dolguikh and Bill Wilson, two talented developers from Halifax. Has some innovative (in my opinion, anyway) details on how things will be viewed. This one’s going to be proprietary, but will be affordable and awesome.
Besides this, I have a few more irons in the fire… however, I’m hesitant to talk about them just yet. Just getting the above done in the midst of my work with Navarra (to say nothing of having a life in there somewhere) is going to be challenging.
Thoughts? Would be particularly interested in hearing from people working on similar projects to neocoder and routez. Despite how it may sometimes appear, I don’t have a NotInventedHere mentality: I’ve done as exhaustive a survey of the field as I could before deciding to work on my own projects, and what I’ve found just hasn’t been the right fit for what I’m trying to accomplish. However, the world’s gotten so damn big that I’m not sure if I’m missing something…