Skip to main content

Posts about linux (old posts, page 4)

python reload: danger, here be dragons

At Mozilla, we use buildbot to coordinate performing builds, unit tests, performance tests, and l10n repacks across all of our build slaves. There is a lot of activity on a project the size of Firefox, which means that the build slaves are kept pretty busy most of the time. Unfortunately, like most software out there, our buildbot code has bugs in it. buildbot provides two ways of picking up new changes to code and configuration: 'buildbot restart' and 'buildbot reconfig'. Restarting buildbot is the cleanest thing to do: it shuts down the existing buildbot process, and starts a new one once the original has shut down cleanly. The problem with restarting is that it interrupts any builds that are currently active. The second option, 'reconfig', is usually a great way to pick up changes to buildbot code without interrupting existing builds. 'reconfig' is implemented by sending SIGHUP to the buildbot process, which triggers a python reload() of certain files. This is where the problem starts. Reloading a module basically re-initializes the module, including redefining any classes that are in the module...which is what you want, right? The whole reason you're reloading is to pick up changes to the code you have in the module! So let's say you have a module, foo.py, with these classes:


class Foo(object):
    def foo(self):
        print "Foo.foo"


class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        Foo.foo(self)
and you're using it like this:

>>> import foo

>>> b = foo.Bar()

>>> b.foo()

Bar.foo

Foo.foo

Looks good! Now, let's do a reload, which is what buildbot does on a 'reconfig':

>>> reload(foo)



>>> b.foo()

Bar.foo

Traceback (most recent call last):
  File "", line 1, in 
  File "/Users/catlee/test/foo.py", line 13, in foo
    Foo.foo(self)
TypeError: unbound method foo() must be called with Foo instance as first argument (got Bar instance instead)

Whoops! What happened? The TypeError exception is complaining that Foo.foo must be called with an instance of Foo as the first argument. (NB: we're calling the unbound method on the class here, not a bound method on the instance, which is why we need to pass in 'self' as the first argument. This is typical when calling your parent class) But wait! Isn't Bar a sub-class of Foo? And why did this work before? Let's try this again, but let's watch what happens to Foo and Bar this time, using the id() function:

>>> import foo

>>> b = foo.Bar()

>>> id(foo.Bar)

3217664

>>> reload(foo)



>>> id(foo.Bar)

3218592

(The id() function returns a unique identifier for objects in python; if two objects have the same id, then they refer to the same object) The id's are different, which means that we get a new Bar class after we reload...I guess that makes sense. Take a look at our b object, which was created before the reload:

>>> b.__class__



>>> id(b.__class__)

3217664

So b is an instance of the old Bar class, not the new one. Let's look deeper:

>>> b.__class__.__bases__

(,)

>>> id(b.__class__.__bases__[0])

3216336

>>> id(foo.Foo)

3218128

A ha! The old Bar's base class (Foo) is different than what's currently defined in the module. After we reloaded the foo module, the Foo class was redefined, which is presumably what we want. The unfortunate side effect of this is that any references by name to the class 'Foo' will pick up the new Foo class, including code in methods of subclasses. There are probably other places where this has unexpected results, but for us, this is the biggest problem. Reloading essentially breaks class inheritance for objects whose lifetime spans the reload. Using super() in the normal way doesn't even work, since you usually refer to your instance's class by name:

class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        super(Bar, self).foo()
If you're using new-style classes, it looks like you can get around this by looking at your __class__ attribute:

class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        super(self.__class__, self).foo()
Buildbot isn't using new-style classes...yet...so we can't use super(). Another workaround I'm playing around with is to use the inspect module to get at the class hierarchy:

def get_parent(obj, n=1):
    import inspect
    return inspect.getmro(obj.__class__)[n]


class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        get_parent(self).foo(self)

Upgrading Wordpress with Mercurial

Since Mozilla has started using Mercurial for source control, I thought I shoud get some hands on experience with it. My Wordpress dashboard has been nagging me to upgrade to the latest version for quite a while now. I was running 2.5.1 up until today, which was released back in April. I've been putting off upgrading because it's always such a pain if you follow the recommended instructions, and I inevitably end up forgetting to migrate some customization I made to the old version. So, to kill two birds with one stone, I decided to try my hand at upgrading Wordpress by using Mercurial to track my changes to the default install, as well as the changes between versions of Wordpress. Preparation: First, start off with a copy of my blog's code in a directory called 'blog'. Download Wordpress 2.5.1 and 2.6.3 (the version I want to upgrade to). Import initial Wordpress code:


tar zxf wordpress-2.5.1.tar.gz # NB: unpacks into wordpress/

mv wordpress wordpress-2.5.1

cd wordpress-2.5.1

hg init

hg commit -A -m 'wordpress 2.5.1'

cd ..

Apply my changes:

hg clone wordpress-2.5.1 wordpress-mine

cd wordpress-mine

hg qnew -m 'my blog' my-blog.patch

hg locate -0 | xargs -0 rm

cp -ar ../blog/* .

hg addremove

hg qrefresh

cd ..

The 'hg locate -0' line removes all the files currently tracked by Mercurial. This is needed so that any files I deleted from my copy of Wordpress also are deleted in my Mercurial repository. The result of these two steps is that I have a repository that has the original Wordpress source code as one revision, with my changes applied as a Mercurial Queue patch. Now I need to tell Mercurial what's changed between versions 2.5.1 and 2.6.3. To do this, I'll make a copy (or clone) of the 2.5.1 repository, and then put all the 2.6.3 files into it. Again, I use 'hg locate -0 | xargs -0 rm' to delete all the files from the old version before copying the new files in. Mercurial is smart enough to notice if files haven't changed, and the subsequent commit with the '-A' flag will add any new files or delete any files that were removed between 2.5.1 and 2.6.3. Upgrade the pristine 2.5.1 to 2.6.3:

hg clone wordpress-2.5.1 wordpress-2.6.3

tar zxf wordpress-2.6.3 # NB: Unpacks into wordpress/

cd wordpress-2.6.3

hg locate -0 | xargs -0 rm

cp -ar ../wordpress/* .

hg commit -A -m 'wordpress-2.6.3'

cd ..

Now I need to perform the actual upgrade to my blog. First I save the state of the current modifications, then pull in the 2.5.1 -> 2.6.3 changes from the wordpress-2.6.3 repository. Then I reapply my changes to the new 2.6.3 code. Pull in 2.6.3 to my blog:

cd wordpress-mine

hg qsave -e -c

hg pull ../wordpress-2.6.3

hg update -C

hg qpush -a -m

Voilà! A quick rsync to my website, and the upgrade is complete! I have to admit, I don't fully grok some of these Mercurial commands. It took a few tries to work out this series of steps, so there's probably a better way of doing it. I'm pretty happy overall though; I managed a successful Wordpress upgrade, and learned something about Mercurial in the process! The next upgrade should go much more smoothly now that I've figured things out a bit better.

Moving on

I'm changing jobs. Yup, after nearly five years at Side Effects, I'm moving on. I have somewhat mixed feelings about this...I'm sad to be leaving such a friendly and talented group of people, but I'm very excited about my next job. I'm very happy to say that I will be joining Mozilla Corporation in their Toronto office starting in October. I'll be working in the Release Engineering group, helping to make sure that the world's thirst for new Firefox builds can be satisfied! I can't say how excited I am about this, it's pretty much a dream job: getting paid to work on a great open source project!

nmudiff is awesome

Man, I wish I had known about this before! nmudiff is a program to email an NMU diff to the Debian Bug Tracking System. I often make quick little changes to debian packages to fix bugs or typos, and it's always been a bit of a pain to generate a patch to send to the maintainer. nmudiff uses debdiff (another very useful command I just learned about) to generate the patch, and email it to the bug tracking system with the appropriate tags.