Getting free diskspace in python, on Windows

Amazingly, one of the most popular links on this site is the quick tip, Getting free diskspace in python.

One of the comments shows that this method doesn’t work on Windows. Here’s a version that does:

import win32file
def freespace(p):
    """
    Returns the number of free bytes on the drive that ``p`` is on
    """
    secsPerClus, bytesPerSec, nFreeClus, totClus = win32file.GetDiskFreeSpace(p)
    return secsPerClus * bytesPerSec * nFreeClus

The win32file module is part of the pywin32 extension module.
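If one function for both platforms is preferred, here is a minimal sketch. It assumes Python 3.3 or later, where the standard library provides shutil.disk_usage(), so pywin32 is not needed. The name freespace_portable is made up for illustration.

```python
import shutil

def freespace_portable(p):
    """
    Returns the number of free bytes on the drive that ``p`` is on.
    Hypothetical helper: assumes Python 3.3+, where shutil.disk_usage exists.
    """
    # disk_usage returns a named tuple (total, used, free), all in bytes
    return shutil.disk_usage(p).free
```

Calling freespace_portable(".") reports the free bytes for the current directory's drive on either Windows or Unix.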

Two great, completely unrelated links

Yesterday was a bit of an overwhelming day. I got home at 1am after a long bus ride, and was unwinding by catching up on some news and email when I came across these two links, both of which really lifted my mood.

The first, Grokking the Zen of the Vi Wu-Wei, talks about a programmer’s journey from emacs to BBEdit to vim. This post is a great read in and of itself, but what’s really worth it is the link around the middle of the post to http://stackoverflow.com/questions/1218390/what-is-your-most-productive-shortcut-with-vim/1220118#1220118. This was truly a joy to read. Definitely the best answer I’ve ever seen on Stack Overflow, and quite possibly the best discussion of vi I’ve ever read. It taught me a lot, but I enjoyed reading it for more than that. It was almost like being on a little adventure, discovering all these little hidden secrets about the neighbourhood you’ve been living in for years. Like I said, it was 1am.

The second, The Pope, the judge, the paedophile priest and The New York Times, gave me some reassurance that things aren’t always as they seem as reported by the media. Regardless of how you feel about the Church or the Pope, it seems that journalistic integrity has fallen by the wayside here. From the article:

Fr Thomas Brundage, the former Archdiocese of Milwaukee Judicial Vicar who presided over the canonical criminal case of the Wisconsin child abuser Fr Lawrence Murphy, has broken his silence to give a devastating account of the scandal – and of the behaviour of The New York Times, which resurrected the story.

It looks as if the media were in such a hurry to blame the Pope for this wretched business that not one news organisation contacted Fr Brundage. As a result, crucial details were unreported.

The entire article is worth a read.

One useful script, a linux version

Johnathan posted links to 3 scripts he finds useful. His sattap script looked handy, so I hacked it up for linux. Run it to do a screen capture and upload the image to a server you have ssh access to. The link is printed out and also put into the clipboard.

Hope you find this useful!

#!/bin/sh
# sattap - Send a thing to a place
set -e
 
SCP_USER='catlee'
SCP_HOST='people.mozilla.org'
SCP_PATH='~/public_html/sattap/'
 
HTTP_URL="http://people.mozilla.org/~catlee/sattap/"
 
FILENAME=`date | md5sum | head -c 8`.png
FILEPATH=/tmp/$FILENAME
 
echo Capturing...
import $FILEPATH
echo Copying to $SCP_HOST
scp $FILEPATH ${SCP_USER}@${SCP_HOST}:$SCP_PATH
echo Deleting local copy
rm $FILEPATH
 
echo $HTTP_URL$FILENAME | xclip -selection clipboard
echo Your file should be at $HTTP_URL$FILENAME, which is also in your paste buffer
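For anyone who would rather drive this from Python, the naming step can be sketched as below. This is only an illustration of how the random-looking file name, the scp target and the final URL fit together. The function name plan_upload is hypothetical, and it hashes a timestamp instead of the output of date.

```python
import hashlib
import time

SCP_USER = "catlee"
SCP_HOST = "people.mozilla.org"
SCP_PATH = "~/public_html/sattap/"
HTTP_URL = "http://people.mozilla.org/~catlee/sattap/"

def plan_upload(timestamp=None):
    """Return (local_path, scp_target, url) for one capture.

    Hypothetical helper: nothing is captured or copied here.
    """
    stamp = str(time.time() if timestamp is None else timestamp)
    # First 8 hex digits of an md5, like `date | md5sum | head -c 8`
    name = hashlib.md5(stamp.encode("utf-8")).hexdigest()[:8] + ".png"
    local_path = "/tmp/" + name
    scp_target = "%s@%s:%s" % (SCP_USER, SCP_HOST, SCP_PATH)
    return local_path, scp_target, HTTP_URL + name
```

The three values correspond to $FILEPATH, the scp destination, and $HTTP_URL$FILENAME in the script.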

Exporting MQ patches

I’ve been trying to use Mercurial Queues to manage my work on different tasks in several repositories. I try to name each patch after the bug it’s related to; so for my recent work on stopping Talos from skipping builds, I would call my patch ‘bug468731’.

I noticed that I was running this series of steps a lot:
cd ~/mozilla/buildbot-configs
hg qdiff > ~/patches/bug468731-buildbot-configs.patch
cd ~/mozilla/buildbotcustom
hg qdiff > ~/patches/bug468731-buildbotcustom.patch

…and then uploading the resulting patch files as attachments to the bug. There’s a lot of repetition and extra mental work in those steps:

  • I have to type the bug number manually twice. This is annoying, and error-prone. I’ve made a typo on more than one occasion and then wasted a few minutes trying to track down where the file went.
  • I have to type the correct repository name for each patch. Again, I’ve managed to screw this up in the past. Often I have several terminals open, one for each repository, and I can get mixed up as to which repository I’ve currently got active.
  • mercurial already knows the bug number, since I’ve used it in the name of my patch.
  • mercurial already knows which repository I’m in.

I wrote the mercurial extension below to help with this. It will take the current patch name, and the basename of the current repository, and save a patch in ~/patches called [patch_name]-[repo_name].patch. It will also compare the current patch to any previous ones in the patches directory, and save a new file if the patches are different, or tell you that you’ve already saved this patch.

To enable this extension, save the code below somewhere like ~/.hgext/mkpatch.py, and then add “mkpatch = ~/.hgext/mkpatch.py” to your .hgrc’s extensions section. Then you can run ‘hg mkpatch’ to automatically create a patch for you in your ~/patches directory!

import os, hashlib
 
from mercurial import commands, util
from hgext import mq
 
def mkpatch(ui, repo, *pats, **opts):
    """Saves the current patch to a file called <patch_name>-<repo_name>.patch
    in your patch directory (defaults to ~/patches)
    """
    repo_name = os.path.basename(ui.config('paths', 'default'))
    if opts.get('patchdir'):
        patch_dir = opts.get('patchdir')
        del opts['patchdir']
    else:
        patch_dir = os.path.expanduser(ui.config('mkpatch', 'patchdir', "~/patches"))
 
    ui.pushbuffer()
    mq.top(ui, repo)
    patch_name = ui.popbuffer().strip()
 
    if not os.path.exists(patch_dir):
        os.makedirs(patch_dir)
    elif not os.path.isdir(patch_dir):
        raise util.Abort("%s is not a directory" % patch_dir)
 
    ui.pushbuffer()
    mq.diff(ui, repo, *pats, **opts)
    patch_data = ui.popbuffer()
    patch_hash = hashlib.new('sha1', patch_data).digest()
 
    full_name = os.path.join(patch_dir, "%s-%s.patch" % (patch_name, repo_name))
    i = 0
    while os.path.exists(full_name):
        file_hash = hashlib.new('sha1', open(full_name).read()).digest()
        if file_hash == patch_hash:
            ui.status("Patch is identical to ", full_name, "; not saving\n")
            return
        full_name = os.path.join(patch_dir, "%s-%s.patch.%i" % (patch_name, repo_name, i))
        i += 1
 
    open(full_name, "w").write(patch_data)
    ui.status("Patch saved to ", full_name, "\n")
 
mkpatch_options = [
        ("", "patchdir", '', "patch directory"),
        ]
cmdtable = {
    "mkpatch": (mkpatch, mkpatch_options + mq.cmdtable['^qdiff'][1], "hg mkpatch [OPTION]... [FILE]...")
}
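The naming and duplicate-detection part of the extension can be tried out without Mercurial. The sketch below follows the same scheme (name.patch, then name.patch.0, name.patch.1, and so on), but it is a standalone illustration in Python 3, not part of the extension. The function name pick_patch_file and its return value are assumptions made for the example.

```python
import hashlib
import os

def pick_patch_file(patch_dir, patch_name, repo_name, patch_data):
    """Return (path, is_new) following the naming scheme used by mkpatch.

    patch_data is bytes. is_new is False when an identical patch is
    already saved, in which case path points at the existing copy.
    """
    patch_hash = hashlib.sha1(patch_data).digest()
    base = os.path.join(patch_dir, "%s-%s.patch" % (patch_name, repo_name))
    full_name = base
    i = 0
    while os.path.exists(full_name):
        with open(full_name, "rb") as f:
            if hashlib.sha1(f.read()).digest() == patch_hash:
                return full_name, False
        # Name taken by a different patch: try .patch.0, .patch.1, ...
        full_name = "%s.%i" % (base, i)
        i += 1
    with open(full_name, "wb") as f:
        f.write(patch_data)
    return full_name, True
```

Saving the same data twice returns the first file with is_new set to False; saving different data under the same names creates the next numbered file.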

ssh on-the-fly port forwarding

Check out this great tip from nion’s blog:

ssh on-the-fly port forwarding.

I’ve often wanted to open up new port forwards, but haven’t wanted to shut down my existing session.

If you follow the escape character (~ by default, typed at the start of a line) with a # character (and thus type ~#), you get a list of all forwarded connections.
Using ~C you can open an internal ssh shell that lets you add and remove local/remote port forwardings:

ssh> help
Commands:
-L[bind_address:]port:host:hostport Request local forward
-R[bind_address:]port:host:hostport Request remote forward
-KR[bind_address:]port Cancel remote forward

ssh> -L 8080:localhost:8080

python reload: danger, here be dragons

At Mozilla, we use buildbot to coordinate performing builds, unit tests, performance tests, and l10n repacks across all of our build slaves.

There is a lot of activity on a project the size of Firefox, which means that the build slaves are kept pretty busy most of the time.

Unfortunately, like most software out there, our buildbot code has bugs in it. buildbot provides two ways of picking up new changes to code and configuration: ‘buildbot restart’ and ‘buildbot reconfig’.

Restarting buildbot is the cleanest thing to do: it shuts down the existing buildbot process, and starts a new one once the original has shut down cleanly. The problem with restarting is that it interrupts any builds that are currently active.

The second option, ‘reconfig’, is usually a great way to pick up changes to buildbot code without interrupting existing builds. ‘reconfig’ is implemented by sending SIGHUP to the buildbot process, which triggers a python reload() of certain files.

This is where the problem starts.

Reloading a module basically re-initializes the module, including redefining any classes that are in the module…which is what you want, right? The whole reason you’re reloading is to pick up changes to the code you have in the module!

So let’s say you have a module, foo.py, with these classes:

class Foo(object):
    def foo(self):
        print "Foo.foo"
 
class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        Foo.foo(self)

and you’re using it like this:

>>> import foo
>>> b = foo.Bar()
>>> b.foo()
Bar.foo
Foo.foo

Looks good! Now, let’s do a reload, which is what buildbot does on a ‘reconfig’:

>>> reload(foo)
<module 'foo' from 'foo.pyc'>
>>> b.foo()
Bar.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/catlee/test/foo.py", line 13, in foo
    Foo.foo(self)
TypeError: unbound method foo() must be called with Foo instance as first argument (got Bar instance instead)

Whoops! What happened? The TypeError exception is complaining that Foo.foo must be called with an instance of Foo as the first argument. (NB: we’re calling the unbound method on the class here, not a bound method on the instance, which is why we need to pass in ‘self’ as the first argument. This is typical when calling your parent class.)

But wait! Isn’t Bar a sub-class of Foo? And why did this work before? Let’s try this again, but let’s watch what happens to Foo and Bar this time, using the id() function:

>>> import foo
>>> b = foo.Bar()
>>> id(foo.Bar)
3217664
>>> reload(foo)
<module 'foo' from 'foo.pyc'>
>>> id(foo.Bar)
3218592

(The id() function returns a unique identifier for objects in python; if two live references give the same id, then they refer to the same object.)

The id’s are different, which means that we get a new Bar class after we reload…I guess that makes sense. Take a look at our b object, which was created before the reload:

>>> b.__class__
<class 'foo.Bar'>
>>> id(b.__class__)
3217664

So b is an instance of the old Bar class, not the new one. Let’s look deeper:

>>> b.__class__.__bases__
(<class 'foo.Foo'>,)
>>> id(b.__class__.__bases__[0])
3216336
>>> id(foo.Foo)
3218128

Aha! The old Bar’s base class (Foo) is different from what’s currently defined in the module. After we reloaded the foo module, the Foo class was redefined, which is presumably what we want. The unfortunate side effect of this is that any references by name to the class ‘Foo’ will pick up the new Foo class, including code in methods of subclasses. So the old Bar.foo ends up calling the new Foo.foo with b, which is an instance of the old Foo but not of the new one, hence the TypeError. There are probably other places where this has unexpected results, but for us, this is the biggest problem.
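The same identity change can be reproduced in one self-contained script. This is a sketch that assumes Python 3, where reload lives in importlib and unbound methods no longer exist, so the exact TypeError above does not appear, but the classes are replaced in the same way. The module name reload_demo_foo and the function name are made up for the example.

```python
import importlib
import os
import sys
import tempfile

SOURCE = """
class Foo(object):
    def foo(self):
        return "Foo.foo"

class Bar(Foo):
    def foo(self):
        return "Bar.foo " + Foo.foo(self)
"""

def reload_identity_demo(module_name="reload_demo_foo"):
    """Import a throwaway module, reload it, and report what changed."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write(SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            mod = importlib.import_module(module_name)
            b = mod.Bar()            # instance created before the reload
            old_bar, old_foo = mod.Bar, mod.Foo
            importlib.reload(mod)    # re-executes the module body
            return {
                "Bar redefined": mod.Bar is not old_bar,
                "Foo redefined": mod.Foo is not old_foo,
                "b still uses old Bar": type(b) is old_bar,
                "b is instance of new Bar": isinstance(b, mod.Bar),
                "old Bar's base is new Foo": old_bar.__bases__[0] is mod.Foo,
            }
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
```

The first three entries come back True and the last two False: the module now holds new classes, while b and its class hierarchy still point at the old ones.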

Reloading essentially breaks class inheritance for objects whose lifetime spans the reload. Using super() in the normal way doesn’t even work, since you usually refer to your instance’s class by name:

class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        super(Bar, self).foo()

If you’re using new-style classes, it looks like you can get around this by looking at your __class__ attribute:

class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        super(self.__class__, self).foo()
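One caveat with this workaround, sketched here in Python 3 syntax: self.__class__ is the class of the instance, not the class the method was written in. If Bar is ever subclassed, super(self.__class__, self) resolves back to Bar.foo and the call recurses until the interpreter gives up. The class Baz and the helper call_foo are hypothetical names for the illustration.

```python
class Foo(object):
    def foo(self):
        return ["Foo.foo"]

class Bar(Foo):
    def foo(self):
        # self.__class__ is the class of the instance, not necessarily Bar
        return ["Bar.foo"] + super(self.__class__, self).foo()

class Baz(Bar):
    # Inherits foo from Bar without overriding it
    pass

def call_foo(obj):
    """Return the call chain, or the name of the exception raised."""
    try:
        return obj.foo()
    except RecursionError:
        return "RecursionError"
```

call_foo(Bar()) returns the expected chain, while call_foo(Baz()) ends in a RecursionError, so this trick is only safe for classes that are never subclassed.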

Buildbot isn’t using new-style classes…yet…so we can’t use super(). Another workaround I’m playing around with is to use the inspect module to get at the class hierarchy:

def get_parent(obj, n=1):
    import inspect
    return inspect.getmro(obj.__class__)[n]
 
class Bar(Foo):
    def foo(self):
        print "Bar.foo"
        get_parent(self).foo(self)
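To compare the two approaches across a reload, here is a self-contained sketch. It assumes Python 3 and importlib.reload, so it shows the principle, not buildbot's actual old-style classes. The module name reload_demo_parent and the classes ByName and ByMro are invented for the example.

```python
import importlib
import os
import sys
import tempfile

SOURCE = """
import inspect

def get_parent(obj, n=1):
    return inspect.getmro(obj.__class__)[n]

class Foo(object):
    def foo(self):
        return "Foo.foo"

class ByName(Foo):
    def foo(self):
        return super(ByName, self).foo()

class ByMro(Foo):
    def foo(self):
        return get_parent(self).foo(self)
"""

def compare_after_reload(module_name="reload_demo_parent"):
    """Call both styles of parent lookup on instances that predate a reload."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write(SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            mod = importlib.import_module(module_name)
            by_name, by_mro = mod.ByName(), mod.ByMro()
            importlib.reload(mod)
            results = {}
            for label, obj in (("by name", by_name), ("by mro", by_mro)):
                try:
                    results[label] = obj.foo()
                except TypeError:
                    # super() rejects an instance of the old class
                    results[label] = "TypeError"
            return results
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
```

The lookup by name fails with a TypeError after the reload, while the lookup through the instance's own class hierarchy still reaches the old Foo. Note that a fixed n=1 always means the parent of the instance's class, so get_parent shares the subclassing limitation of the self.__class__ approach.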

nmudiff is awesome

Man, I wish I had known about this before!

nmudiff is a program to email an NMU diff to the Debian Bug Tracking System.

I often make quick little changes to Debian packages to fix bugs or typos, and it’s always been a bit of a pain to generate a patch to send to the maintainer.

nmudiff uses debdiff (another very useful command I just learned about) to generate the patch, and email it to the bug tracking system with the appropriate tags.

Should have done this a long time ago…

In zsh:


alias ':q'=exit

I can’t count the number of times I’ve typed ‘:q’ by mistake in a shell expecting it to quit.

Now it will :)

Getting free diskspace in python

To calculate the amount of free disk space in Python, you can use the os.statvfs() function.

For some reason, I can never find the docs for os.statvfs() on the first or second try (it’s in the “Files and Directories” section in the os module), and I never remember how it works, so I’m posting this as a note to myself, and maybe to help out anybody else wanting to do the same thing.

A simple free space function can be written as:

import os
def freespace(p):
    """
    Returns the number of free bytes on the drive that ``p`` is on
    """
    s = os.statvfs(p)
    return s.f_frsize * s.f_bavail

I use the f_bavail attribute instead of f_bfree, since the latter includes blocks that are reserved for the super-user’s use.

On the distinction between f_bsize and f_frsize: POSIX counts blocks in units of f_frsize, so that is the safer multiplier. On many Linux filesystems the two values are the same.
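To see how much the reserved blocks matter on a given filesystem, both figures can be returned side by side. This is a minimal sketch for Unix systems only (os.statvfs is not available on Windows). The function name disk_space is hypothetical, and the block counts are assumed to be in f_frsize units, as POSIX describes.

```python
import os

def disk_space(p):
    """Return (bytes available to ordinary users, bytes free including reserved)."""
    s = os.statvfs(p)  # Unix only
    # Assumption: block counts are expressed in f_frsize units, per POSIX
    return s.f_frsize * s.f_bavail, s.f_frsize * s.f_bfree
```

The difference between the two numbers is the space set aside for the super-user.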

Flash on 64 bit linux in 3 easy steps

  1. aptitude/apt-get/wajig install nspluginwrapper
  2. Download and unpack flash from www.adobe.com, and run linux32 ./flashplayer-installer
  3. nspluginwrapper -i $HOME/.mozilla/plugins/libflashplayer.so