Showing posts tagged Python

Python dependency gotchas: always go to the source

Aug 16th, 2021

Mozilla Python

Getting back into the swing of things at Mozilla after my extended break. I’m currently working on enhancing and extending Looker support for Glean-based applications, which eventually led me back to working on bigquery-etl, our framework for creating derived datasets in our data lake.

I spent some time working on improving the initial developer experience of bigquery-etl early this year, so I figured it would be no problem to get going again despite an extended hiatus from it (I think it’s probably been ~2–3 months since I last touched it). Unfortunately the first thing I got after creating a fresh virtual environment (to pick up the new dependency updates) was this exciting looking error:

wlach@antwerp bigquery-etl % ./bqetl --help
Traceback (most recent call last):
  ...
  File "/Users/wlach/src/bigquery-etl/venv/lib/python3.9/site-packages/google/cloud/bigquery_v2/types/__init__.py", line 16, in <module>
    from .encryption_config import EncryptionConfiguration
  File "/Users/wlach/src/bigquery-etl/venv/lib/python3.9/site-packages/google/cloud/bigquery_v2/types/encryption_config.py", line 26, in <module>
    class EncryptionConfiguration(proto.Message):
  File "/Users/wlach/src/bigquery-etl/venv/lib/python3.9/site-packages/proto/message.py", line 200, in __new__
    file_info = _file_info._FileInfo.maybe_add_descriptor(filename, package)
  File "/Users/wlach/src/bigquery-etl/venv/lib/python3.9/site-packages/proto/_file_info.py", line 42, in maybe_add_descriptor
    descriptor=descriptor_pb2.FileDescriptorProto(
TypeError: descriptor to field 'google.protobuf.FileDescriptorProto.name' doesn't apply to 'FileDescriptorProto' object

What I did

Since we have pretty decent continuous integration at Mozilla, when I see an error like this I am usually pretty sure it’s some kind of strange interaction between my local development environment and whatever dependencies we’ve specified for the repository in question. Usually these problems are pretty easy to solve.

First thing I tried was to type the error into Google, to see if this had come up for anyone else before. I tried several variations of TypeError: descriptor to field and FileDescriptorProto and nothing really turned up. This strategy almost always turns up something. When it doesn’t it usually indicates that something pretty strange is happening.

To see if this was a strange problem particular to us, I asked on our internal channel but no one had offhand seen or heard of this error either. One of my colleagues (who had a working setup on a Mac, the same environment I was using) suggested I set up pyenv to isolate my development environment, which was a good idea but did not seem to solve the problem: both Python 3.8 and 3.9 installed via pyenv ran into the exact same issue.

After flailing around trying a number of other failed approaches (maybe I need to upgrade the version of virtualenv that we’re using?), I broke down and looked harder at the error itself. It seemed to be some kind of typing error in Google’s protobuf library, which google-cloud-bigquery is calling. If this sort of thing was happening to everyone, we probably would have seen it happening more broadly. So my guess, again, was that it was happening due to an obscure interaction between some variable on my machine and this particular combination of dependencies.

At this point, I systematically went through our set of python dependencies to see what might be the matter. For the most part, I found nothing surprising or suspicious. google-api-core was at the latest version, as was google-cloud-bigquery. However, I did notice that the version of protobuf we were using was a little older (3.15.8 when the latest “official” version on pypi was 3.17.3).
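A quick way to do this kind of audit from inside the virtual environment is to ask Python itself which versions are installed. The following is only a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the `installed_versions` helper and the `PACKAGES` list are names made up for illustration and are not part of bigquery-etl.

```python
from importlib import metadata

# Distribution names mentioned in this post; extend the list as needed.
PACKAGES = ["protobuf", "google-api-core", "google-cloud-bigquery"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output against the pinned requirements and against what is published on PyPI makes a stale pin easier to spot.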

It seemed like a long shot that the problem was there, but upgrading the dependency was worth a try just in case. So I bumped the version of protobuf to the latest version in my local checkout (pip install protobuf==3.17.3)…

… and sure enough, after doing so, the problem was fixed and ./bqetl --help started working again:

wlach@antwerp bigquery-etl % ./bqetl --help
Usage: bqetl [OPTIONS] COMMAND [ARGS]...

  CLI tools for working with bigquery-etl.

...

After doing so, I did up a quick pull request and the problem is now fixed, at least for me.

It’s a bit unfortunate that dependabot (which we have configured for this repository) didn’t send an update for protobuf, which would have fixed this problem earlier.[1] It seems like it’s not completely reliable for python packages, for whatever reason: I have also noticed this problem with mozregression.

I suspect (though can’t confirm) that the problem here is a backwards-incompatible change made to either protobuf or one of the packages that uses it. However, the nature of the incompatibility seems subtle: bigquery-etl works fine with the old set of dependencies we run in continuous integration and it appears to only come up in specific circumstances (i.e. mine). Unfortunately, I need to get back to what I was actually planning to work on and don’t have time to unwind the rather complex set of interactions going on here. Maybe later!

What I would have done differently

This kind of illustrates (again) to me that while some shortcuts and heuristics can save a bunch of time and mental effort (Googling things all the time is basically standard practice in the industry at this point), sometimes you really just need to look a little closer at the problem to find a solution. I was hesitant to do this in this case because I’m never sure where those kinds of rabbit holes are going to take me (e.g. I spent several days debugging a bad interaction between Kubernetes and our airflow cluster in late 2019 with not much to show for the effort), but often all it takes is understanding the general shape of the problem to move you to a quick solution.

Other lessons

Here’s a couple of other things this experience reinforced for me (these are more subjective, take them or leave them):

  [1] As an aside, the main reason we use dependabot and aggressively update packages like google-api-core is due to a bug in pip


pydata nyc 2018

Oct 29th, 2018

Data Python

Went to PyData NYC a couple weeks ago, and figured I ought to write up my thoughts for the benefit of the others on my extended team. Why not publish as a blog post while I’m at it?

This is actually the first conference I’d been to in my capacity as a “data engineer” at Mozilla, a team I joined about a year and a half ago after specializing in the same area on the (now-defunct) a-team. I’ve felt a special affinity for the Python community, particularly its data science offshoots (pandas, numpy, and jupyter notebooks) so it was great to finally go to a conference that specializes in these topics.

Overall, the conference was a bit of a mix between people talking about the status of their projects, theoretical talks on specific statistical approaches to data, general talks on how people are doing “data science” (I would say the large majority of attendees at the conference were users of python data science tools, rather than developers), and case studies of how people are using python data science tools in their research or work. This being New York, many (probably the majority) were using data science tools in fields like quantitative finance, sales, marketing, and health care.

As a side note, it was really satisfying to be able to tell Mozilla’s story about how we collect and use data without violating the privacy of our users. This is becoming more and more of an issue (especially in Europe with the GDPR) and it really makes me happy that we have a really positive story to tell, not a bunch of dirty secrets that we need to hide.

In general I found the last two types of talks the most rewarding to go to: most of the work I do at Mozilla currently involves larger-scale data where, I’m sad to say, Python is usually not (currently) an applicable tool, at least not by itself (though maybe iodide will help change that! see below). And I don’t usually find a 60 minute talk really enough time for me to be able to properly absorb new mathematical or statistical concepts, though I can sometimes get little tidbits of information from them that come in handy later.

Some talks that made an impression on me:

Of course, this post wouldn’t be complete without a mention of Mike Droettboom’s talk on iodide, a project I’ve been spending some considerable cycles helping with over the last couple of quarters. I need to write some longer thoughts on iodide at some point in the near future, but in a nutshell it’s a scientific notebook environment where the computational kernel lives entirely inside the browser. It was well received and we had a great followup session afterwards with people interested in using it for various things. Being able to show a python environment in the browser which “just works”, with no installation or other steps makes a great tech demo. I’m really excited about the public launch of our server-based environment, which will hopefully be coming in the next couple of months.


PyCon 2015

Apr 23rd, 2015

Mozilla Python

So I went to PyCon 2015. While I didn’t leave quite as inspired as I did in 2014 (when I discovered iPython), it was a great experience and I learned a ton. Once again, I was incredibly impressed with the organization of the conference and the diversity and quality of the speakers.

Since Mozilla was nice enough to sponsor my attendance, I figured I should do another round up of notable talks that I went to.

Technical stuff that was directly relevant to what I work on:

Non-technical stuff:

I probably missed out on a bunch of interesting things. If you also went to PyCon, please feel free to add links to your favorite talks in the comments!


mozregression: New maintainer, issues tracked in bugzilla

May 8th, 2014

Mozilla mozregression Python

Just wanted to give some quick updates on mozregression, your favorite regression-finding tool for Firefox:

  1. I moved all issue tracking in mozregression to bugzilla from github issues. Github unfortunately doesn’t really scale to handle notifications sensibly when you’re part of a large organization like Mozilla, which meant many problems were flying past me unseen. File your new bugs in bugzilla, they’re now much more likely to be acted upon.
  2. Sam Garrett has stepped up to be co-maintainer of the project with me. He’s been doing a great job whacking out a bunch of bugs and keeping things running reliably, and it was time to give him some recognition and power to keep things moving forward.
  3. On that note, I just released mozregression 0.17, which now shows the revision number when running a build (a request from the graphics team, bug 1007238) and handles respins of nightly builds correctly (bug 1000422). Both of these were fixed by Sam.

If you’re interested in contributing to Mozilla and are somewhat familiar with python, mozregression is a great place to start. The codebase is quite approachable and the impact will be high — as I’ve found out over the last few months, people all over the Mozilla organization (managers, developers, QA) use it in the course of their work and it saves tons of their time. A list of currently open bugs is here.


PyCon 2014 impressions: ipython notebook is the future & more

Apr 22nd, 2014

Montreal Mozilla Python

This year’s PyCon US (Python Conference) was in my city of residence (Montréal) so I took the opportunity to go and see what was up in the world of the language I use the most at Mozilla. It was pretty great!

ipython

The highlight for me was learning about the possibilities of ipython notebooks, an absolutely fantastic interactive tool for debugging python in a live browser-based environment. I’d heard about it before, but it wasn’t immediately apparent how it would really improve things — it seemed to be just a less convenient interface to the python console that required me to futz around with my web browser. Watching a few presentations on the topic made me realize how wrong I was. It’s already changed the way I do work with Eideticker data, for the better.

Using ipython to analyze some eideticker data

I think the basic premise is really quite simple: a better interface for typing in, experimenting with, and running python code. If you stop and think about it, the modern web interface supports a much richer vocabulary of interactive concepts than the console (or even text editors like emacs): there’s no reason we shouldn’t take advantage of it.

Here are the (IMO) killer features that make it worth using:

To learn more about how to use ipython notebooks for data analysis, I highly recommend Julia Evans’s talk Diving into Open Data with IPython Notebook & Pandas, which you can find on pyvideo.org.

Other Good Talks

I saw some other good talks at the conference, here are some of them:


Catching problems early with python

Oct 15th, 2012

Mozilla Python

Just a few quick notes on how to avoid a class of errors I’ve been seeing in Mozilla’s automation over the last year. Since python interprets code dynamically, it’s pretty easy for problems like undefined variables to slip through, especially if they’re in a codepath that isn’t frequently tested. The most recent example I found was in some cleanup-after-error code for remote mochitest/reftest, which tried to call “self.cleanup” from a standalone method.

def main():
      ...
      try:
        dm.recordLogcat()
        retVal = mochitest.runTests(options)
        logcat = dm.getLogcat()
      except:
        print "TEST-UNEXPECTED-FAIL | %s | Exception caught while running tests." % sys.exc_info()[1]
        mochitest.stopWebServer(options)
        mochitest.stopWebSocketServer(options)
        try:
            self.cleanup(None, options)
        except:
            pass

testing/mochitest/runtestsremote.py

We’re calling cleanup as if it were a method on self, but we’re not inside any class! It’s easy to see what will happen if you try to run some similar code from the python console:

>>> self.cleanup()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'self' is not defined

However, because we’re in a blanket try…except, we will never see an error. The cleanup code will never be called; instead, the NameError is immediately caught and subsequently ignored. Probably not the end of the world in this case (there are other parts of our mobile automation which will perform the same cleanup later), but it’s easy to imagine cases where this would be a more serious problem.
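The effect is easy to reproduce in isolation. Here is a small sketch, written for Python 3 rather than the Python 2 of the snippet above; `DeviceError`, `cleanup_blanket` and `cleanup_narrow` are hypothetical names invented for the illustration and do not come from the mochitest code. The second function is included for contrast: it contains the same bug, but only catches the one exception type it expects.

```python
class DeviceError(Exception):
    """Hypothetical stand-in for a device-level failure we expect and tolerate."""


def cleanup_blanket():
    """Mimics the buggy pattern: a bare except hides the NameError."""
    try:
        self.cleanup()  # 'self' is not defined in a plain function
    except:  # noqa: E722 - deliberately too broad, to show the problem
        return "swallowed"
    return "cleaned up"


def cleanup_narrow():
    """Same bug, but only the expected exception type is caught."""
    try:
        self.cleanup()
    except DeviceError:
        return "device error ignored"
    return "cleaned up"
```

Calling `cleanup_blanket()` quietly returns "swallowed", so nothing signals that the cleanup never ran. Calling `cleanup_narrow()` raises a NameError, which is exactly the kind of loud failure you want for a programming mistake.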

There’s two very easy ways to help stop errors like this before they hit our code:

  1. Try to avoid using a blanket try…except. In addition to catching legitimate problems which we want to ignore (in the remote case for example, devicemanager exceptions), it also catches (and thus obscures) things like syntax, name, or type errors. Instead, try just catching the specific exception you’re looking for. For example, we might rewrite the case above as:

     try:
         mochitest.cleanup(None, options)
     except devicemanager.DMError:
         print "WARNING: Device error while cleaning up"

  2. pyflakes, pyflakes, pyflakes. Pyflakes is a fantastic tool for analyzing your python code for common problems. It's kind of analogous to jslint, for those of you familiar with that. Here's what happens when we run pyflakes against the offending code:

wlach@eideticker:~/src/mozilla-central$ pyflakes testing/mochitest/runtestsremote.py
testing/mochitest/runtestsremote.py:7: 'time' imported but unused
testing/mochitest/runtestsremote.py:481: undefined name 'self'
testing/mochitest/runtestsremote.py:500: undefined name 'self'

I've found pyflakes to be an indispensable part of my workflow. I generally run it after making any substantial change to a python file, and certainly before pushing anything to be consumed by others.
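To give a feel for why this class of error can be caught without running the code at all, here is a deliberately crude sketch of an undefined-name check built on the standard library's `ast` module. It is not how pyflakes is implemented (pyflakes does proper scope analysis); this version assumes Python 3 syntax and treats the whole file as one flat scope, so it will miss many real problems. The `undefined_names` function and the `SAMPLE` source are made up for the example.

```python
import ast
import builtins


def undefined_names(source):
    """Return sorted (line, name) pairs for names read but never bound.

    Deliberately crude: a binding anywhere in the file counts for the
    whole file, so real scoping rules are ignored.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    loads = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loads.append((node.lineno, node.id))
            else:  # Store or Del: assignments, loop targets, with-as, ...
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
    return sorted((line, name) for line, name in loads if name not in bound)


# A Python 3 approximation of the buggy pattern discussed in this post.
SAMPLE = '''
import sys

def main(options):
    try:
        run_tests(options)
    except Exception:
        print(sys.exc_info()[1])
        self.cleanup(None, options)

def run_tests(options):
    return 0
'''

print(undefined_names(SAMPLE))
```

On the sample this reports `self` on the line of the `self.cleanup` call, which is the same kind of finding pyflakes produced above. For real projects, use pyflakes itself rather than a toy like this.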

Ultimately there’s no substitute for actually thoroughly testing your code, no matter what language you’re using. But using the right techniques and tools can certainly make your life easier.

[ for those wondering, a fix for the issue mentioned in this post is part of bug 801652 ]