
Posts about unittests

A year in RelEng

Something prompted me to look at the size of our codebase here in RelEng, and how much it changes over time. This is the code that drives all the build, test and release automation for Firefox, project branches, and Try, as well as configuration management for the various build and test machines that we have. Here are some simple stats: 2,193 changesets across 5 repositories...that's about 6 changes a day on average. We grew from 43,294 lines of code last year to 73,549 lines of code as of today. That's 70% more code today than we had last year. We added 88,154 lines to our code base, and removed 51,957. I'm not sure what this means, but it seems like a pretty high rate of change!

What do you want to know about builds?

Mozilla has been quite involved in recent buildbot development, in particular, helping to make it scale across multiple machines. More on this in another post! Once deployed, these changes will let us give real-time access to various information about our build queue: the list of jobs waiting to start, and which jobs are in progress. This should help other tools like Tinderboxpushlog show more accurate information.

One limitation of the upstream work so far is that it only captures a very coarse level of detail about builds: start/end time and result code is pretty much it. No further detail about the build is captured, like which slave it executed on, what properties it generated (which could include useful information like the URL to the generated binaries), etc.

We've also been exporting a JSON dump of our build status for many months now. It's been useful for some analysis, but it also has limitations: the data is always at least 5 minutes old by the time you look, and in-progress builds are not represented at all.

We're starting to look at ways of exporting all this detail in a way that's useful to more people. You want to get notified when your try builds are done? You want to look at which test suites are taking the most time? You want to determine how our build times change over time? You want to find out what the last all-green revision was on trunk? We want to make this data available, so anybody can write these tools.
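
To make one of those questions concrete, here is a minimal sketch of "which test suites are taking the most time?", assuming each exported build record carries only a name, start/end times and a result code. The field names and sample records are hypothetical, made up for illustration; they are not the format of the actual JSON dump.

```python
from collections import defaultdict

# Hypothetical build records. The field names ("name", "start", "end",
# "result") and the values are assumptions, not the real export format.
builds = [
    {"name": "mochitest", "start": 0, "end": 2100, "result": 0},
    {"name": "everythingelse", "start": 0, "end": 1200, "result": 0},
    {"name": "mochitest", "start": 3600, "end": 5820, "result": 1},
]


def average_duration_by_name(records):
    """Return {name: mean duration in seconds} for finished builds."""
    totals = defaultdict(lambda: [0, 0])  # name -> [total seconds, count]
    for rec in records:
        if rec.get("end") is None:
            continue  # skip builds that have not finished yet
        totals[rec["name"]][0] += rec["end"] - rec["start"]
        totals[rec["name"]][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


def slowest(records, n=1):
    """Return the n (name, mean seconds) pairs with the longest mean duration."""
    averages = average_duration_by_name(records)
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


print(slowest(builds))
```

With richer data (slave name, build properties) the same grouping idea would extend to the other questions, but that depends on what ends up being exported.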

Just how big is that firehose?

I think we have one of the largest buildbot setups out there and we generate a non-trivial amount of data:
  • 6-10 buildbot master processes generating updates, on different machines in 2 or 3 data centers
  • around 130 jobs per hour, made up of 4,773 individual steps per hour in total. That works out to about 1.4 updates generated per second
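
For anyone who wants to check the rate, here is a rough back-of-the-envelope sketch. It assumes one update per step, plus one update each for a job starting and finishing; that breakdown is a guess used for illustration, not a description of how buildbot actually counts updates.

```python
JOBS_PER_HOUR = 130
STEPS_PER_HOUR = 4773
SECONDS_PER_HOUR = 3600


def updates_per_second(steps_per_hour, jobs_per_hour, events_per_job=2):
    """Estimate the update rate.

    Assumption: every step produces one update, and every job produces
    `events_per_job` extra updates (for example start and finish).
    """
    return (steps_per_hour + events_per_job * jobs_per_hour) / SECONDS_PER_HOUR


steps_only = updates_per_second(STEPS_PER_HOUR, JOBS_PER_HOUR, events_per_job=0)
with_job_events = updates_per_second(STEPS_PER_HOUR, JOBS_PER_HOUR)

print(f"{steps_only:.2f}")       # 1.33
print(f"{with_job_events:.2f}")  # 1.40
```

Either way the figure lands between 1.3 and 1.4 updates per second.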

How you can help

This is where you come in. I can think of two main classes of interfaces we could set up: a query-type interface where you poll for the information you are interested in, and a notification system where you register a listener for certain types of events (or all of them!). What would be the best way for us to make this data available to you? Some kind of REST API? A message or event brokering system? PubSubHubbub? Is there some type of data or filtering that would be super helpful to you?
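
To illustrate the difference between the two styles, here is a toy in-memory sketch. The class, method names and event fields are all hypothetical; a real service would sit behind a network API or message broker rather than a Python object.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = Dict[str, object]
Listener = Callable[[Event], None]


class BuildEventHub:
    """Toy hub offering both access styles (hypothetical API, for illustration)."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._listeners: List[Tuple[Optional[str], Listener]] = []

    def publish(self, event: Event) -> None:
        """Record an event and push it to every matching listener."""
        self._events.append(event)
        for event_type, callback in self._listeners:
            if event_type is None or event_type == event["type"]:
                callback(event)

    def query(self, event_type: Optional[str] = None) -> List[Event]:
        """Query style: the caller polls for what it is interested in."""
        return [e for e in self._events
                if event_type is None or e["type"] == event_type]

    def subscribe(self, callback: Listener, event_type: Optional[str] = None) -> None:
        """Notification style: register a listener for one type, or all (None)."""
        self._listeners.append((event_type, callback))


hub = BuildEventHub()
finished: List[Event] = []
hub.subscribe(finished.append, event_type="build_finished")

hub.publish({"type": "build_started", "branch": "try"})
hub.publish({"type": "build_finished", "branch": "try", "result": 0})

print(len(hub.query()))  # 2
print(len(finished))     # 1
```

The query side suits tools that want history on demand; the listener side suits "tell me when my try build is done" without repeated polling.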

Upcoming changes to unittests

Since May, we've been running our unittest suite twice for each checkin: once for the current reference counting build + test job, and once for the packaged unittests.

[Figure: Current way of running tests]

Our end goal is to run unittests on our optimized and debug builds. We would stop doing our current reference counting builds, since the debug builds also have reference counting enabled.

[Figure: New way of running tests]

To get there requires a few intermediate steps:
  • Turn off tests for the current build+tests unittest job. We're hoping to do this soon. The tests run on the packaged builds would still be active at this point (first phase of bug 507540).
  • Turn on tests for debug builds (bug 372581). There's a bit of work left to do here to make sure that hung processes don't kill the buildbot master with multi-gigabyte log files.
  • Turn on tests for optimized builds (bug 486783). This would eventually include nightly and release builds as well, and will give us test coverage on our Windows PGO builds.
  • Turn off original reference counting builds completely, along with the tests done on these builds (second phase of bug 507540).
At some point we're also going to change how we run mochitests, by splitting the tests up according to which directory they're in, and then running each subset of tests in parallel on different machines (bug 452861).
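
As a rough idea of what splitting by directory could look like, here is a minimal sketch. The test paths are invented, and balancing machines by test count is an assumption for illustration; the actual approach is whatever ends up in the bug.

```python
import posixpath
from collections import defaultdict


def group_by_directory(test_paths):
    """Map each directory to the list of test files it contains."""
    groups = defaultdict(list)
    for path in test_paths:
        groups[posixpath.dirname(path)].append(path)
    return dict(groups)


def assign_to_machines(groups, machine_count):
    """Greedy split: give the largest remaining directory to the least loaded machine.

    Assumption: load is approximated by the number of tests, and a
    directory is never split across machines.
    """
    machines = [[] for _ in range(machine_count)]
    for directory in sorted(groups, key=lambda d: (-len(groups[d]), d)):
        target = min(machines, key=len)
        target.extend(groups[directory])
    return machines


# Hypothetical test paths, for illustration only.
tests = [
    "dom/tests/test_a.html",
    "dom/tests/test_b.html",
    "layout/base/test_c.html",
    "content/test_d.html",
]

machines = assign_to_machines(group_by_directory(tests), machine_count=2)
for index, subset in enumerate(machines):
    print(index, subset)
```

In practice run time per directory would be a better weight than test count, if that data were available.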

Parallelizing Unit Tests

Last week we flipped the switch and turned on running unit tests on packaged builds for our mozilla-1.9.1, mozilla-central, and tracemonkey branches. What this means is that our current unit test builds are uploaded to a web server along with all their unit tests. Another machine will then download the build and tests, and run various test suites on them. Splitting up the tests this way allows us to run the test suites in parallel, so the mochitest suite will run on one machine, and all the other suites will be run on another machine (this group of tests is creatively named 'everythingelse' on Tinderbox).

[Figure: paralleltests]

Splitting up the tests is a critical step towards reducing our end-to-end time, which is the total time elapsed between when a change is pushed into one of the source repositories, and when all of the results from that build are available. Up until now, you had to wait for all the test suites to be completed in sequence, which could take over an hour in total. Now that we can split the tests up, the wait time is determined by the longest test suite. The mochitest suite is currently the biggest chunk here, taking somewhere around 35 minutes to complete, and all of the other tests combined take around 20 minutes. One of the next steps for us is to look at splitting up the mochitests into smaller pieces.

For the time being, we will continue to run the existing unit tests on the same machine that is creating the build. This is so that we can make sure that running tests on the packaged builds is giving us the same results (there are already some known differences: bug 491675, bug 475383).

Parallelizing the unit tests, and the infrastructure required to run them, is the first step towards achieving a few important goals:
  • Reducing end-to-end time.
  • Running unit tests on debug, as well as on optimized builds. Once we've got both of these going, we can turn off the builds that are currently done solely to be able to run tests on them.
  • Running unit tests on the same build multiple times, to help isolate intermittent test failures.
All of the gory details can be found in bug 383136.
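
The effect on wait time can be sketched with the two suite durations quoted above. This covers only the test suites (not the build itself), and the even, overhead-free split of mochitest is an idealised assumption for illustration.

```python
# Suite durations in minutes, taken from the figures quoted in the post.
SUITE_MINUTES = {"mochitest": 35, "everythingelse": 20}


def sequential_wait(suites):
    """All suites run one after another on a single machine."""
    return sum(suites.values())


def parallel_wait(suites):
    """Each suite runs on its own machine; the slowest one sets the wait."""
    return max(suites.values())


def split_suite(suites, name, pieces):
    """Idealised split of one suite into equal pieces.

    Assumption: the pieces are perfectly even and splitting adds no overhead.
    """
    result = {k: v for k, v in suites.items() if k != name}
    for i in range(pieces):
        result[f"{name}-{i + 1}"] = suites[name] / pieces
    return result


print(sequential_wait(SUITE_MINUTES))  # 55
print(parallel_wait(SUITE_MINUTES))    # 35
print(parallel_wait(split_suite(SUITE_MINUTES, "mochitest", 2)))  # 20
```

Under those assumptions, halving mochitest would make 'everythingelse' the new bottleneck at around 20 minutes, which is why splitting the biggest suite is the natural next step.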