Firefox Download Button

Pages

A year in RelEng

Something prompted me to look at the size of our codebase here in RelEng, and how much it changes over time. This is the code that drives all the build, test and release automation for Firefox, project branches, and Try, as well as configuration management for the various build and test machines that we have.

Here are some simple stats:

2,193 changesets across 5 repositories…that’s about 6 changes a day on average.

We grew from 43,294 lines of code last year to 73,549 lines of code as of today. That’s 70% more code today than we had last year.

We added 88,154 lines to our code base, and removed 51,957. I’m not sure what this means, but it seems like a pretty high rate of change!

Getting free diskspace in python, on Windows

Amazingly, one of the most popular links on this site is the quick tip, Getting free diskspace in python.

One of the comments shows that this method doesn’t work on Windows. Here’s a version that does:

import win32file
def freespace(p):
    """
    Returns the number of free bytes on the drive that ``p`` is on
    """
    secsPerClus, bytesPerSec, nFreeClus, totClus = win32file.GetDiskFreeSpace(p)
    return secsPerClus * bytesPerSec * nFreeClus

The win32file module is part of the pywin32 extension module.

poster 0.6.0 released

I’ve just pushed poster 0.6.0 to the cheeseshop.

Thanks again to everybody who sent in bug reports, and for letting me know how you’re using poster! It’s really great to hear from users.

poster 0.6.0 fixes a few problems with 0.5, most notably:

  • Documentation updates to clarify some common use cases.
  • Added a poster.version attribute. Thanks to JP!
  • Fix for unicode filenames. Thanks to Zed Shaw.
  • Handle StringIO file objects. Thanks to Christophe Combelles.

poster 0.6.0 can be downloaded from the cheeseshop, or from my website. Documentation can be found at http://atlee.ca/software/poster/

What do you want to know about builds?

Mozilla has been quite involved in recent buildbot development, in particular, helping to make it scale across multiple machines. More on this in another post!

Once deployed, these changes will give us the ability to give real time access to various information about our build queue: the list of jobs waiting to start, and which jobs are in progress. This should help other tools like Tinderboxpushlog show more accurate information. One limitation of the upstream work so far is that it only captures a very coarse level of detail about builds: start/end time, and result code is pretty much it. No further detail about the build is captured, like which slave it executed on, what properties it generated (which could include useful information like the URL to the generated binaries), etc.

We’ve also been exporting a json dump of our build status for many months now. It’s been useful for some analysis, but it also has limitations: the data is always at least 5 minutes old by the time you look, and in-progress builds are not represented at all.

We’re starting to look at ways of exporting all this detail in a way that’s useful to more people. You want to get notified when your try builds are done? You want to look at which test suites are taking the most time? You want to determine how our build times change over time? You want to find out what the last all-green revision was on trunk? We want to make this data available, so anybody can write these tools.

Just how big is that firehose?

I think we have one of the largest buildbot setups out there and we generate a non-trivial amount of data:

  • 6-10 buildbot master processes generating updates, on different machines in 2 or 3 data centers
  • around 130 jobs per hour composed of 4,773 individual steps total per hour. That works out to about 1.4 updates per second that are generated

How you can help

This is where you come in.

I can think of two main classes of interfaces we could set up: a query-type interface where you poll for information that you are interested in, and a notification system where you register a listener for certain types (or all!) events.

What would be the best way for us to make this data available to you? Some kind of REST API? A message or event brokering system? pubsubhubbub?

Is there some type of data or filtering that would be super helpful to you?