Category Archives: Python

mozregression: New maintainer, issues tracked in bugzilla

Just wanted to give some quick updates on mozregression, your favorite regression-finding tool for Firefox:

  1. I moved all issue tracking in mozregression to bugzilla from github issues. Github unfortunately doesn’t really scale to handle notifications sensibly when you’re part of a large organization like Mozilla, which meant many problems were flying past me unseen. File your new bugs in bugzilla, they’re now much more likely to be acted upon.
  2. Sam Garrett has stepped up to be co-maintainer of the project with me. He’s been doing a great job whacking out a bunch of bugs and keeping things running reliably, and it was time to give him some recognition and power to keep things moving forward. :)
  3. On that note, I just released mozregression 0.17, which now shows the revision number when running a build (a request from the graphics team, bug 1007238) and handles respins of nightly builds correctly (bug 1000422). Both of these were fixed by Sam.

If you’re interested in contributing to Mozilla and are somewhat familiar with python, mozregression is a great place to start. The codebase is quite approachable and the impact will be high — as I’ve found out over the last few months, people all over the Mozilla organization (managers, developers, QA …) use it in the course of their work and it saves tons of their time. A list of currently open bugs is here.

PyCon 2014 impressions: ipython notebook is the future & more

This year’s PyCon US (Python Conference) was in my city of residence (Montréal) so I took the opportunity to go and see what was up in the world of the language I use the most at Mozilla. It was pretty great!

ipython

The highlight for me was learning about the possibilities of ipython notebooks, an absolutely fantastic interactive tool for debugging python in a live browser-based environment. I’d heard about it before, but it wasn’t immediately apparent how it would really improve things — it seemed to be just a less convenient interface to the python console that required me to futz around with my web browser. Watching a few presentations on the topic made me realize how wrong I was. It’s already changed the way I do work with Eideticker data, for the better.

Using ipython to analyze some eideticker data
Using ipython to analyze some eideticker data

I think the basic premise is really quite simple: a better interface for typing in, experimenting with, and running python code. If you stop and think about it, the modern web interface supports a much richer vocabulary of interactive concepts that the console (or even text editors like emacs): there’s no reason we shouldn’t take advantage of it.

Here are the (IMO) killer features that make it worth using:

  • The ability to immediately re-execute a block of code after editing and seeing an error (essentially merging the immediacy of the python console with the permanency / cut & pastability of an actual script)
  • Live-printing out graphs of numerical results using matplotlib. ZOMG this is so handy. Especially in conjunction with the live-editing outlined above, there’s no better tool for fine-tuning mathematical/statistical analysis.
  • The shareability of the results. Any ipython notebook can be saved and then saved to a public website. Many presentations at PyCon 2014, in fact, were done entirely with ipython notebooks. So handy for answering questions like “how did you get that”?

To learn more about how to use ipython notebooks for data analysis, I highly recommend Julie Evan’s talk Diving into Open Data with IPython Notebook & Pandas, which you can find on pyvideo.org.

Other Good Talks

I saw some other good talks at the conference, here are some of them:

  • All Your Ducks In A Row: Data Structures in the Standard Library and Beyond – A useful talk by Brandon Rhoades on the implementation of basic data structures in Python, and how to select the ones to use for optimal performance. It turns out that lists aren’t the best thing to use for long sequences of numerical data (who knew?)
  • Fast Python, Slow Python – An interesting talk by Alex Gaynor about how to write decent performing pure-python code in a single-threaded context. Lots of intelligent stuff about producing robust code that matches your intention and data structures, and caution against doing fancy things in the name of being “pythonic” or “general”.
  • Analyzing Rap Lyrics with Python – Another data analysis talk, this one about a subject I knew almost nothing about. The best part of it (for me anyway) was learning how the speaker (Julie Lavoie) narrowed her focus in her research to the exact aspects of the problem that would let her answer the question she was interested in (“Can we automatically find out which rap lyrics are the most sexist?”) as opposed to interesting problems (“how can I design the most general scraping library possible?”) that don’t answer the question. In my opinion, this ability to focus is one of the key things that seperates successful projects from unsuccessful ones.

Catching problems early with python

Just a few quick notes on how to avoid a class of errors I’ve been seeing in Mozilla’s automation over the last year. Since python interprets code dynamically, it’s pretty easy for problems like undefined variables to slip through, especially if they’re in a codepath that isn’t frequently tested. The most recent example I found was in some cleanup-after-error code for remote mochitest/reftest, which tried to call “self.cleanup” from a standalone method.

def main():
      ...
      try:
        dm.recordLogcat()
        retVal = mochitest.runTests(options)
        logcat = dm.getLogcat()
      except:
        print "TEST-UNEXPECTED-FAIL | %s | Exception caught while running tests." % sys.exc_info()[1]
        mochitest.stopWebServer(options)
        mochitest.stopWebSocketServer(options)
        try:
            self.cleanup(None, options)
        except:
            pass

testing/mochitest/runtestsremote.py

We’re calling cleanup as if it were a class variable, but we’re not inside any class! It’s easy to see what will happen if you try to run some similar code from the python console:

>>> self.cleanup()
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'self' is not defined

However, because we’re in a blanket try…except, we will never see an error. The cleanup code will never be called, instead the exception is immediately caught and subsequently ignored. Probably not the end of the world in this case (there are other parts of our mobile automation which will perform the same cleanup later), but it’s easy to imagine where this would be a more serious problem.

There’s two very easy ways to help stop errors like this before they hit our code:

  1. Try to avoid using a blanket try…except. In addition to catching legitimate problems which we want to ignore (in the remote case for example, devicemanager exceptions), it also catches (and thus obscures) things like syntax, name, or type errors. Instead, try just catching the specific exception you’re looking for. For example, we might rewrite the case above as:

    try:
      mochitest.cleanup(None, options)
    except devicemanager.DMError:
      print "WARNING: Device error while cleaning up"
    
  2. pyflakes, pyflakes, pyflakes. Pyflakes is a fantastic tool for analyzing your python code for common problems. It’s kind of analagous to jslint, for those of you familiar with that. Here’s what happens when we run pyflakes against the offending code:

    wlach@eideticker:~/src/mozilla-central$ pyflakes testing/mochitest/runtestsremote.py 
    testing/mochitest/runtestsremote.py:7: 'time' imported but unused
    testing/mochitest/runtestsremote.py:481: undefined name 'self'
    testing/mochitest/runtestsremote.py:500: undefined name 'self'
    

    I’ve found pyflakes to be an indispensable part of my workflow. I generally run it after making any substantial change to a python file, and certainly before pushing anything to be consumed by others.

Ultimately there’s no substitute for actually thoroughly testing your code, no matter what language you’re using. But using the right techniques and tools can certainly make your life easier. :)

[ for those wondering, a fix for the issue mentioned in this post is part of bug 801652 ]