So long Buildbot, and thanks for all the fish

Last week, without a lot of fanfare, we shut off the last of the Buildbot infrastructure here at Mozilla.

Our primary release branches have been switched over to Taskcluster for some time now. We needed to keep Buildbot running to support the old ESR52 branch. With the release of Firefox 60.2.0esr earlier this month, ESR52 is now officially end-of-life, and therefore so is Buildbot here at Mozilla.

Looking back in time, the first commit to our buildbot-configs repository was over 10 years ago, on April 27, 2008, by Ben Hearsum: "Basic Mozilla2 configs". Buildbot usage at Mozilla actually predates that by at least two years; Ben was working on some patches in 2006.

Earlier in my career here at Mozilla, I was doing a lot of work with Buildbot, and blogged quite a bit about our experiences with it.

Buildbot served us well, especially in the early days. There really were no other CI systems at the time that could operate at Mozilla's scale.

Unfortunately, as we kept increasing the scale of our CI and release infrastructure, even Buildbot started showing some problems. The main architectural limitations of Buildbot we encountered were:

  1. Long-lived TCP sessions had to stay connected to specific server processes. If the network blipped, or you needed to restart a server, then any jobs running on workers were interrupted.

  2. Its monolithic design meant that small components of the project were hard to develop independently from each other.

  3. The database schema used to implement the job queue became a bottleneck once we started doing hundreds of thousands of jobs a day.

On top of that, our configuration for all the various branches and platforms had grown over the years to a complex set of inheritance rules, defaults, and overrides. Only a few brave souls outside of RelEng managed to effectively make changes to these configs.
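To give a feel for why this was hard to work with, here is a minimal sketch of layered configuration, assuming a simple "global defaults, then platform defaults, then per-branch overrides" scheme. The layer names, keys, and values are all hypothetical and are not taken from the real buildbot-configs repository; the actual rules were considerably more involved.

```python
from copy import deepcopy


def merge(base, override):
    """Return a new dict with `override` recursively merged over `base`."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Nested dicts are merged key by key rather than replaced.
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


# Hypothetical layers, for illustration only.
GLOBAL_DEFAULTS = {"enable_tests": True, "timeout": 3600, "env": {"CC": "gcc"}}
PLATFORM_DEFAULTS = {"linux64": {"env": {"CC": "clang"}}}
BRANCH_OVERRIDES = {
    "example-branch": {"linux64": {"timeout": 7200, "enable_tests": False}},
}


def effective_config(branch, platform):
    """Resolve the settings a job would actually run with."""
    config = merge(GLOBAL_DEFAULTS, PLATFORM_DEFAULTS.get(platform, {}))
    return merge(config, BRANCH_OVERRIDES.get(branch, {}).get(platform, {}))
```

Even in this toy version, answering "what timeout does linux64 use on example-branch?" means reading three separate tables in the right order. Multiply that by every branch and platform, and it becomes clearer why few people felt safe changing things.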

Today, much, much more of the CI and release configuration lives in-tree. This has many benefits, including:

  1. Changes are local to the branches they land on. They ride the trains naturally. No need for ugly looooooops.

  2. Developers can self-service most of their own requests. Adding new types of tests, or even changing the compiler, is possible without any involvement from RelEng!

Buildbot is dead! Long live Taskcluster!