<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>chris' random ramblings (Posts about taskcluster)</title><link>https://atlee.ca/</link><description></description><atom:link href="https://atlee.ca/categories/taskcluster.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><lastBuildDate>Sat, 22 Feb 2025 20:04:32 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Taskcluster migration update: we're finished!</title><link>https://atlee.ca/posts/migration-status-3/</link><dc:creator>chris</dc:creator><description>&lt;h2 id="were-done"&gt;We're done!&lt;/h2&gt;
&lt;p style="text-align:center"&gt;
&lt;img src="https://media.giphy.com/media/26tPo1I4XyWzIBjFe/giphy.gif"&gt;
&lt;/p&gt;

&lt;p&gt;Over the past few weeks we've hit a few major milestones in our project to
migrate all of Firefox's CI and release automation to
&lt;a href="https://docs.taskcluster.net/"&gt;taskcluster&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Firefox 60 and higher are now &lt;strong&gt;100% on taskcluster!&lt;/strong&gt;&lt;/p&gt;
&lt;h3 id="tests"&gt;Tests&lt;/h3&gt;
&lt;p&gt;At the end of March, our Release Operations and Product Integrity teams &lt;a href="https://hg.mozilla.org/mozilla-central/rev/08c54405586b"&gt;finished
migrating&lt;/a&gt; Windows tests onto new hardware machines, all running
taskcluster. That work was later &lt;a href="https://hg.mozilla.org/releases/mozilla-beta/rev/cfe7adda153d"&gt;uplifted to
beta&lt;/a&gt; so
that CI automation on beta would also be completely done using taskcluster.&lt;/p&gt;
&lt;p&gt;This marked the last usage of buildbot for Firefox CI.&lt;/p&gt;
&lt;h3 id="periodic-updates-of-blocklist-and-pinning-data"&gt;Periodic updates of blocklist and pinning data&lt;/h3&gt;
&lt;p&gt;Last week we &lt;a href="https://hg.mozilla.org/mozilla-central/rev/6d0dcc642e1a"&gt;switched
off&lt;/a&gt; the buildbot versions of the periodic update
jobs. These jobs keep the in-tree versions of blocklist, HSTS and HPKP
lists up to date.&lt;/p&gt;
&lt;p&gt;These were the last buildbot jobs running on trunk branches.&lt;/p&gt;
&lt;h3 id="partner-repacks"&gt;Partner repacks&lt;/h3&gt;
&lt;p&gt;And to wrap things up, yesterday the &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1398803"&gt;final patches
landed&lt;/a&gt; to migrate
partner repacks to taskcluster. Firefox 60.0b14 was built yesterday and shipped
today 100% using taskcluster.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;massive&lt;/strong&gt; amount of work went into migrating partner repacks from
buildbot to taskcluster, and I'm really proud of the whole team for pulling
this off.&lt;/p&gt;
&lt;p&gt;So, starting today, Firefox 60 and higher will run completely on
taskcluster and no longer rely on buildbot.&lt;/p&gt;
&lt;p&gt;It feels really good to write that :)&lt;/p&gt;
&lt;p&gt;We've been working on migrating Firefox to taskcluster for over three
years! Code archaeology is hard, but I think the first Firefox jobs to start
running in Taskcluster were the Linux64 builds, done by Morgan in &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1155749"&gt;bug
1155749&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="into-the-glorious-future"&gt;Into the glorious future&lt;/h2&gt;
&lt;p&gt;It's great to have migrated everything off of buildbot and onto
taskcluster, and we have endless ideas for how to improve things now that
we're there. First we need to spend some time cleaning up after ourselves
and paying down some technical debt we've accumulated. It's a good time to
start ripping out buildbot code from the tree as well.&lt;/p&gt;
&lt;p&gt;We've got other plans to make release automation easier for other people to
work with, including doing staging releases on try(!!), making the nightly
release process more similar to the beta/release process, and exposing
different parts of the release process to release management so that releng
doesn't have to be directly involved with the day-to-day release mechanics.&lt;/p&gt;</description><category>firefox</category><category>mozilla</category><category>releng</category><category>taskcluster</category><guid>https://atlee.ca/posts/migration-status-3/</guid><pubDate>Fri, 20 Apr 2018 16:50:59 GMT</pubDate></item><item><title>Taskcluster migration update, the sequel</title><link>https://atlee.ca/posts/migration-status-2/</link><dc:creator>chris</dc:creator><description>&lt;h3 id="firefox-now-100-buildbot-free"&gt;Firefox, now 100% buildbot-free!&lt;/h3&gt;
&lt;p&gt;First, the good news - &lt;a href="http://archive.mozilla.org/pub/devedition/candidates/60.0b1-candidates/build3/"&gt;Developer Edition
60.0b1&lt;/a&gt;
will be the first release in nearly &lt;strong&gt;10 years&lt;/strong&gt; done without using buildbot.
This is an amazing milestone, and I'm incredibly proud of everybody who has
contributed to make this possible!&lt;/p&gt;
&lt;p style="text-align:center"&gt;
&lt;img src="https://media.giphy.com/media/s98VZT2GHE676/giphy.gif"&gt;
&lt;/p&gt;

&lt;h3 id="long-time-no-update"&gt;Long time, no update&lt;/h3&gt;
&lt;p&gt;How did we get here?  It's been, uh, almost 6 months since I &lt;a href="https://atlee.ca/posts/migration-status/"&gt;last
posted&lt;/a&gt; an update about our migration to Taskcluster.&lt;/p&gt;
&lt;p&gt;In my last update, I described our plans for the end of 2017...&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;We're on track to ship builds produced in Taskcluster as part of the
56.0 release scheduled for late September. After that the only Firefox
builds being produced by buildbot will be for ESR52.

Meanwhile, we've started tackling the remaining parts of release
automation. We prioritized getting nightly and CI builds migrated to
Taskcluster, however, there are still parts of the release process
still implemented in Buildbot.

We're aiming to have release automation completely migrated
off of buildbot by the end of the year. We've already seen many
benefits from migrating CI to Taskcluster, and migrating the release
process will realize many of those same benefits.
&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="howd-we-do"&gt;How'd we do?&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://media.giphy.com/media/3oxHQmsEZQYrnVKGCA/giphy.gif" style="float:right"&gt;
We're past the end of 2017, so how are we doing?&lt;/p&gt;
&lt;p&gt;Well, we successfully shipped &lt;a href="https://www.mozilla.org/en-US/firefox/56.0/releasenotes/"&gt;56.0&lt;/a&gt; with builds produced in Taskcluster. Our big
&lt;a href="https://blog.mozilla.org/blog/2017/11/14/introducing-firefox-quantum/"&gt;Firefox Quantum release (57.0)&lt;/a&gt; was
also shipped with builds produced by Taskcluster.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(side note: 57 had the most complex update scenarios we've ever had to support for
Firefox...a subject for another post!)&lt;/em&gt;&lt;/p&gt;
&lt;h3 id="release-scheduling"&gt;Release scheduling&lt;/h3&gt;
&lt;p&gt;Post-56.0, our release process was using Taskcluster exclusively for producing
the initial builds, and all the release process scheduling. We were still using
Buildbot for many of the post-build tasks, like l10n repacks, publishing
updates, pushing files to S3, etc.  Once again we relied on the &lt;a href="https://wiki.mozilla.org/ReleaseEngineering/Applications/BuildbotBridge"&gt;buildbot
bridge&lt;/a&gt;
to allow us to integrate existing buildbot components with the newer
taskcluster pipeline. I learned from &lt;a href="https://kimmoir.blog/"&gt;Kim Moir&lt;/a&gt; that
this is a great example of the &lt;a href="https://www.slideshare.net/k2moir/from-hello-world-to-goodbye-code-86090379#23"&gt;strangler
pattern&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the fall of 2017, we decided to begin migrating all of the scheduling logic
for release automation into taskcluster using the &lt;a href="https://firefox-source-docs.mozilla.org/taskcluster/taskcluster/index.html"&gt;in-tree
taskgraph&lt;/a&gt;
scheduling system. We did this for a few reasons...&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Having the release scheduling logic ride the trains is much more
   maintainable. Previously, we had an externally defined release pipeline
   in our &lt;a href="https://github.com/mozilla-releng/releasetasks"&gt;releasetasks repo&lt;/a&gt;. It
   was hard to keep this repository in sync with changes required for beta/release
   and ESR branches.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;More importantly, having the release scheduling logic in-tree meant that we could then
   rely on &lt;a href="http://scriptworker.readthedocs.io/en/latest/chain_of_trust.html"&gt;chain-of-trust&lt;/a&gt;
   to verify artifacts produced by the release pipeline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We felt that having the complete release pipeline defined in taskcluster would make it
   easier for us to tackle the remaining buildbot bridge tasks in parallel.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We hit this milestone in the 58 cycle. Starting with 58.0b3, Firefox and Fennec releases
were completely scheduled using the in-tree taskgraph generation. We also migrated over the
l10n repacks at the same time, removing a longstanding source of problems where repacks
would fail when we first got to beta due to environmental differences between taskcluster
and buildbot.&lt;/p&gt;
&lt;h3 id="no-bbb-releases"&gt;No-BBB Releases&lt;/h3&gt;
&lt;p&gt;As of 58, however, much of release automation still ran on buildbot, even though
Taskcluster was doing all the scheduling.&lt;/p&gt;
&lt;p&gt;Since December, we've been working on removing these last few pieces of buildbot from the
release process. Progress was initially a bit slow, given
&lt;a href="https://flic.kr/p/DjgGn6"&gt;Austin&lt;/a&gt; and Christmas, but we've been hard at work
in the new year.&lt;/p&gt;
&lt;p&gt;That brings us to today.&lt;/p&gt;
&lt;p&gt;We've moved &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1398796"&gt;uptake monitoring&lt;/a&gt;,
&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1398793"&gt;update verify&lt;/a&gt; (and made it 2x
faster too!), &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1332341"&gt;update submission&lt;/a&gt;,
&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1415981"&gt;final verify&lt;/a&gt;, &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1433459"&gt;bouncer
submission&lt;/a&gt;, &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1398787"&gt;version bumping and
tagging&lt;/a&gt;, and &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1438735"&gt;balrog submission&lt;/a&gt; all to run in Taskcluster via various kinds of scriptworkers.&lt;/p&gt;
&lt;p&gt;As I mentioned above, &lt;a href="https://www.mozilla.org/en-US/firefox/developer/"&gt;DevEdition&lt;/a&gt;
60.0b1 will be the first release in nearly &lt;strong&gt;10 years&lt;/strong&gt; done without using buildbot. The
rest of the 60 release cycle will follow suit, and once 60 hits the release channel, only
ESR52 will remain on buildbot!&lt;/p&gt;</description><category>mozilla</category><category>releng</category><category>taskcluster</category><guid>https://atlee.ca/posts/migration-status-2/</guid><pubDate>Sat, 03 Mar 2018 12:26:55 GMT</pubDate></item><item><title>Taskcluster migration update</title><link>https://atlee.ca/posts/migration-status/</link><dc:creator>chris</dc:creator><description>&lt;section id="all-your-nightlies-are-belong-to-taskcluster"&gt;
&lt;h2&gt;All your nightlies are belong to Taskcluster&lt;/h2&gt;
&lt;p&gt;&lt;a class="reference external" href="https://atlee.ca/posts/nightly-builds-from-taskcluster/"&gt;In January&lt;/a&gt; I announced that we
had just migrated Linux nightly builds to &lt;a class="reference external" href="https://docs.taskcluster.net/"&gt;Taskcluster&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We completed a huge milestone in July: starting in Firefox 56, we've been
doing &lt;strong&gt;all&lt;/strong&gt; our nightly Firefox builds in Taskcluster.&lt;/p&gt;
&lt;a class="reference external image-reference" href="https://giphy.com/gifs/excited-applause-minions-MOWPkhRAUbR7i?utm_source=media-link&amp;amp;utm_medium=landing&amp;amp;utm_campaign=Media%20Links&amp;amp;utm_term="&gt;&lt;img alt="https://media.giphy.com/media/MOWPkhRAUbR7i/giphy.gif" src="https://media.giphy.com/media/MOWPkhRAUbR7i/giphy.gif"&gt;&lt;/a&gt;
&lt;p&gt;This includes all &lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1267427"&gt;Windows&lt;/a&gt;, &lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1267425"&gt;macOS&lt;/a&gt;, Linux, and Android builds. You can
see all the builds and repacks on &lt;a class="reference external" href="https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&amp;amp;filter-searchStr=nightly"&gt;Treeherder&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Since August, when 56 merged to Beta, we've also been doing our Firefox
Beta builds using Taskcluster. We're on track to ship Firefox 56, built in Taskcluster, to release users at the end of September.&lt;/p&gt;
&lt;p&gt;Windows and macOS each had their own challenges to get them ready to
build and ship to our nightly users.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="windows-signing"&gt;
&lt;h2&gt;Windows signing&lt;/h2&gt;
&lt;p&gt;We've had Windows builds running in Taskcluster for quite a while now.
The biggest missing piece stopping us from shipping these builds was
&lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1277591"&gt;signing&lt;/a&gt;.
Windows builds end up being a bit complicated to sign.&lt;/p&gt;
&lt;p&gt;First, each compiled .exe and .dll binary needs to be signed.
Signing binaries on Windows changes their contents, and so we need to
regenerate some files that depend on the exact contents of binaries.
Next, we need to create packages in various formats: a "setup.exe" for
installing Firefox, and also &lt;a class="reference external" href="https://wiki.mozilla.org/Software_Update:MAR"&gt;MAR&lt;/a&gt; files for updates.
Each of these package formats in turn needs to be signed.&lt;/p&gt;
&lt;p&gt;In buildbot, this process was monolithic. All of the binary
generation and signing happened as part of the same build process. The
same process would also publish symbols to the &lt;a class="reference external" href="https://developer.mozilla.org/en-US/docs/Mozilla/Using_the_Mozilla_symbol_server"&gt;symbol server&lt;/a&gt; and
publish updates to &lt;a class="reference external" href="https://github.com/mozilla/balrog"&gt;Balrog&lt;/a&gt;. The downside of this monolithic process
is that it adds additional dependencies to the build, which is already
a really long process. If something goes wrong with signing, or
publishing updates, you don't want to have to restart a 2 hour build!&lt;/p&gt;
&lt;p&gt;As part of our migration to Taskcluster, we decided that builds should
minimize their external dependencies. This means that the build task
produces only unsigned binaries, and it is the responsibility of
downstream tasks to sign them. We also wanted discrete tasks for
symbol and update submission.&lt;/p&gt;
&lt;p&gt;One wrinkle in this approach is that the logic that defines how to
create a setup.exe package or a MAR file lives &lt;a class="reference external" href="https://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/installer/packager.mk"&gt;in tree&lt;/a&gt;. We didn't
want to run that code in the same context as the code that generates
signatures.&lt;/p&gt;
&lt;p&gt;Our solution to this was to create a sequence of build -&amp;gt;
signing -&amp;gt; repackage -&amp;gt; signing tasks. The signing tasks run in a
restricted environment while the build and repackage tasks have access to
the build system in order to produce the required artifacts. Using the
&lt;a class="reference external" href="http://escapewindow.dreamwidth.org/249409.html"&gt;chain of trust&lt;/a&gt;, we can demonstrate that the artifacts weren't
tampered with between intermediate tasks.&lt;/p&gt;
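&lt;p&gt;To make the hand-offs concrete, here is a toy sketch of that sequence. It is not how scriptworker's chain of trust is actually implemented; the task names, the &lt;code&gt;run_task&lt;/code&gt; helper and the record fields are all made up for illustration. Each pretend task records a hash of what it consumed and what it produced, so a verifier can confirm that nothing changed between tasks.&lt;/p&gt;

```python
import hashlib

# Hypothetical task names mirroring the build / signing / repackage sequence.
PIPELINE = ["build", "signing", "repackage", "repackage-signing"]


def sha256(data):
    return hashlib.sha256(data).hexdigest()


def run_task(name, upstream_artifact):
    """Pretend to run a task: return its artifact and a record of the hand-off."""
    artifact = upstream_artifact + b"|" + name.encode()
    record = {
        "task": name,
        "input_sha256": sha256(upstream_artifact),
        "output_sha256": sha256(artifact),
    }
    return artifact, record


def run_pipeline(source):
    """Run every task in order, feeding each one the previous task's artifact."""
    artifact, records = source, []
    for name in PIPELINE:
        artifact, record = run_task(name, artifact)
        records.append(record)
    return artifact, records


def verify_chain(source, records, final_artifact):
    """Check every hand-off: each task's input must match the previous output."""
    expected = sha256(source)
    for record in records:
        if record["input_sha256"] != expected:
            return False
        expected = record["output_sha256"]
    return expected == sha256(final_artifact)


final, chain = run_pipeline(b"firefox-source")
print(verify_chain(b"firefox-source", chain, final))         # True
print(verify_chain(b"firefox-source", chain, final + b"x"))  # False
```

&lt;p&gt;The real system relies on signed task metadata rather than bare hashes, but the shape is the same: every downstream task can be traced back through the tasks that fed it.&lt;/p&gt;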
&lt;p&gt;Finally, we need to consider &lt;a class="reference external" href="https://wiki.mozilla.org/L10n:Home_Page"&gt;l10n&lt;/a&gt; repacks. We ship Firefox in over 90
&lt;a class="reference external" href="https://hg.mozilla.org/mozilla-central/file/tip/browser/locales/all-locales"&gt;locales&lt;/a&gt;. The repacking process downloads the en-US build and replaces
the English strings with localized strings. Each of these repacks
needs to be based on the signed en-US build. Each will also generate
its own setup.exe and complete MAR for updates.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="macos-performance-and-why-your-build-directory-matters"&gt;
&lt;h2&gt;macOS performance (and why your build directory matters)&lt;/h2&gt;
&lt;p&gt;Like Windows, we've had macOS builds running on Taskcluster for a long
time. Also like Windows, we had to solve &lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1376550"&gt;signing for macOS&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;However, the biggest blocker for the macOS build migration was a
&lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1338651"&gt;performance bug&lt;/a&gt;. Builds
produced on Taskcluster showed some serious performance regressions as
compared to the builds produced on buildbot.&lt;/p&gt;
&lt;p&gt;Many very smart people looked at this bug since it was first
discovered in February. They compared library versions being used.
They compared compiler versions and compiler flags. They even
inspected the generated assembly code from both systems.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://gittup.org/blog/"&gt;Mike Shal&lt;/a&gt; stumbled across the first clue to what was going on in
&lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1338651#c111"&gt;June&lt;/a&gt;:
if he stripped the Taskcluster binaries, then the performance problems
disappeared! At this point we decided that we could go ahead and ship
these builds to nightly users, knowing that the performance regression
would disappear on beta and release.&lt;/p&gt;
&lt;p&gt;Later on, Mike realized that it's not the presence or absence of
symbols in the binary that causes the performance hit, it's &lt;em&gt;what
directory the builds are done in.&lt;/em&gt; On buildbot we build under
/builds/..., and on Taskcluster we build under /home/...&lt;/p&gt;
&lt;a class="reference external image-reference" href="https://giphy.com/gifs/confused-huh-mark-wahlberg-zjQrmdlR9ZCM?utm_source=media-link&amp;amp;utm_medium=landing&amp;amp;utm_campaign=Media%20Links&amp;amp;utm_term="&gt;&lt;img alt="https://media.giphy.com/media/zjQrmdlR9ZCM/giphy.gif" src="https://media.giphy.com/media/zjQrmdlR9ZCM/giphy.gif"&gt;&lt;/a&gt;
&lt;p&gt;Read the &lt;a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1338651"&gt;bug&lt;/a&gt; for more gory details. This is definitely one of the
strangest bugs I've seen.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="lessons-learned"&gt;
&lt;h2&gt;Lessons learned&lt;/h2&gt;
&lt;p&gt;We learned quite a bit in the process of migrating Windows and macOS
nightly builds to Taskcluster.&lt;/p&gt;
&lt;p&gt;First, we gained a huge amount of experience with the &lt;a class="reference external" href="https://firefox-source-docs.mozilla.org/taskcluster/taskcluster/index.html"&gt;in-tree scheduling system&lt;/a&gt;.
There's a bit of a learning curve to climb, but it's an
extremely powerful and flexible system. Many kudos to &lt;a class="reference external" href="http://code.v.igoro.us/"&gt;Dustin&lt;/a&gt; for his work creating the foundation of
this system. His blog post, "&lt;a class="reference external" href="http://code.v.igoro.us/posts/2016/08/whats-so-special-about-in-tree.html"&gt;What's So Special About "In-Tree"?&lt;/a&gt;",
is a great explanation of why having this code as part of Firefox's
repository is so important.&lt;/p&gt;
&lt;p&gt;One of the killer features of having all the scheduling logic live
in-tree is that you can do quite a bit of work locally, without
requiring any build infrastructure. This is extremely useful when
working on the complex build / signing / repackage sequence of tasks
described above. You can make your changes, generate a new task graph,
and inspect the results.&lt;/p&gt;
&lt;p&gt;Once you're happy with your local changes, you can push them to try to
validate your local testing, get your patch reviewed, and then finally
landed in gecko. Your scheduling changes will take effect as soon as
they land in the repo. This made it possible for us to do a lot of
testing on another project branch, and then merge the code to central
once we were ready.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h2&gt;What's next?&lt;/h2&gt;
&lt;p&gt;We're on track to ship builds produced in Taskcluster as part of the
56.0 release scheduled for late September. After that the only Firefox
builds being produced by buildbot will be for ESR52.&lt;/p&gt;
&lt;p&gt;Meanwhile, we've started tackling the remaining parts of release
automation. We prioritized getting nightly and CI builds migrated to
Taskcluster, however, there are still parts of the release process
still implemented in Buildbot.&lt;/p&gt;
&lt;p&gt;We're aiming to have release automation completely migrated
off of buildbot by the end of the year. We've already seen many
benefits from migrating CI to Taskcluster, and migrating the release
process will realize many of those same benefits.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="thanks"&gt;
&lt;h2&gt;Thanks!&lt;/h2&gt;
&lt;p&gt;Thank you for reading this far!&lt;/p&gt;
&lt;p&gt;Members from the Release Engineering, Release Operations, Taskcluster,
Build, and Product Integrity teams all were involved in finishing up
this migration. Thanks to everyone involved (there are a lot of you!)
for getting us across the finish line here.&lt;/p&gt;
&lt;p&gt;In particular, if you come across one of these fine individuals at the
office, or maybe on IRC, I'm sure they would appreciate a quick
"thank you":&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Aki Sasaki&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dustin Mitchell&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Greg Arndt&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Joel Maher&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Johan Lorenzo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Justin Wood&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Kim Moir&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mihai Tabara&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mike Shal&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nick Thomas&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rail Aliiev&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rob Thijssen&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simon Fraser&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wander Costa&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;</description><category>mozilla</category><category>releng</category><category>taskcluster</category><guid>https://atlee.ca/posts/migration-status/</guid><pubDate>Wed, 30 Aug 2017 09:40:39 GMT</pubDate></item><item><title>Nightly builds from Taskcluster</title><link>https://atlee.ca/posts/nightly-builds-from-taskcluster/</link><dc:creator>chris</dc:creator><description>&lt;p&gt;Yesterday, for the very first time, we started shipping Linux Desktop and
Android Firefox nightly builds from &lt;a class="reference external" href="https://docs.taskcluster.net/"&gt;Taskcluster&lt;/a&gt;.&lt;/p&gt;
&lt;img alt="74851712.jpg" src="https://atlee.ca/posts/nightly-builds-from-taskcluster/74851712.jpg"&gt;
&lt;p&gt;We now have a much more secure, resilient, and hackable nightly
build and release process.&lt;/p&gt;
&lt;p&gt;It's more secure, because we have developed a chain of trust that allows
us to verify all generated artifacts back to the original decision task
and docker image. Signing is no longer done as part of the build process,
but is now split out into a discrete task after the build completes.&lt;/p&gt;
&lt;p&gt;The new process is more resilient because we've split up the monolithic
build process into smaller bits: build, signing, symbol upload, upload to
CDN, and publishing updates are all done as separate tasks. If any one of
these fails, it can be retried independently. We don't have to re-compile
the entire build again just because an external service was temporarily
unavailable.&lt;/p&gt;
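&lt;p&gt;As a rough illustration of why this helps, here is a minimal Python sketch
of a pipeline where a failed step is retried on its own, without re-running the
steps before it. This is not the actual Taskcluster task graph: the
TemporaryFailure exception, run_pipeline, make_flaky_stage and the retry count
are all made up for the example.&lt;/p&gt;

```python
# Hypothetical sketch: each stage is a separate task that can be retried
# without repeating the stages that already succeeded.

class TemporaryFailure(Exception):
    """Raised by a stage when an external service is briefly unavailable."""


def run_pipeline(stages, max_attempts=3):
    """Run stages in order, retrying only the stage that failed.

    Returns a dict mapping stage name to the number of attempts it needed.
    """
    attempts = {}
    for name, stage in stages:
        for attempt in range(1, max_attempts + 1):
            try:
                stage()
            except TemporaryFailure:
                if attempt == max_attempts:
                    raise
                continue
            attempts[name] = attempt
            break
    return attempts


def make_flaky_stage(failures):
    """Build a stage that fails `failures` times and then succeeds."""
    remaining = [failures]

    def stage():
        if remaining[0] != 0:
            remaining[0] -= 1
            raise TemporaryFailure("external service unavailable")

    return stage


stages = [
    ("build", make_flaky_stage(0)),
    ("signing", make_flaky_stage(0)),
    ("symbol upload", make_flaky_stage(2)),  # fails twice, then succeeds
    ("upload to CDN", make_flaky_stage(0)),
    ("publish updates", make_flaky_stage(0)),
]

result = run_pipeline(stages)
print(result)
```

&lt;p&gt;In this toy run only the symbol upload step runs more than once; the build
itself runs exactly once.&lt;/p&gt;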
&lt;p&gt;Finally, it's more hackable - in a good way! All the configuration files
for the nightly build and release process are contained in-tree. That
means it's easier to inspect and change how nightly builds are done.
Changes will automatically ride the trains to aurora, beta, etc.&lt;/p&gt;
&lt;p&gt;Ideally you didn't even notice this change! We try to get these changes
done quietly and smoothly, in the background.&lt;/p&gt;
&lt;p&gt;This is a giant milestone for Mozilla's &lt;a class="reference external" href="https://wiki.mozilla.org/ReleaseEngineering"&gt;Release Engineering&lt;/a&gt; and Taskcluster teams, and
is the result of many months of hard work, planning, coding, reviewing and
debugging.&lt;/p&gt;
&lt;p&gt;Big big thanks to jlund, Callek, mtabara, kmoir, aki, dustin, sfraser, jlorenzo,
coop, jmaher, bstack, gbrown, and everybody else who made this possible!&lt;/p&gt;</description><category>mozilla</category><category>releng</category><category>taskcluster</category><guid>https://atlee.ca/posts/nightly-builds-from-taskcluster/</guid><pubDate>Fri, 20 Jan 2017 13:35:25 GMT</pubDate></item><item><title>RelEng Retrospective - Q1 2015</title><link>https://atlee.ca/posts/releng-retrospective-q1-2015/</link><dc:creator>chris</dc:creator><description>&lt;p&gt;&lt;a href="https://wiki.mozilla.org/ReleaseEngineering"&gt;RelEng&lt;/a&gt; had a great start to
2015. We hit some major milestones on projects like Balrog and were able to turn
off some legacy systems, which is always an extremely satisfying thing to do!&lt;/p&gt;
&lt;p&gt;We also made some exciting new changes to the underlying infrastructure, got
some projects off the drawing board and into production, and drastically
reduced our test load!&lt;/p&gt;
&lt;h2 id="firefox-updates"&gt;Firefox updates&lt;/h2&gt;
&lt;h3 id="balrog"&gt;&lt;a href="https://wiki.mozilla.org/Balrog"&gt;Balrog&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://werewolfnightmare101.deviantart.com/art/Balrog-Drawing-254795366"&gt;&lt;img alt="balrog" src="https://atlee.ca/posts/releng-retrospective-q1-2015/balrog.png"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;All Firefox update queries are now being served by Balrog!  Earlier this year,
we switched all Firefox update queries off of the old update server,
aus3.mozilla.org, to the new update server, codenamed
&lt;a href="https://wiki.mozilla.org/Balrog"&gt;Balrog&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Already, Balrog has enabled us to be much more flexible in handling updates
than the previous system. As an example, in &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1150021"&gt;bug
1150021&lt;/a&gt;, the About
Firefox dialog was broken in the Beta version of Firefox 38 for users with RTL
locales. Once the problem was discovered, we were able to quickly disable
updates just for those users until a fix was ready. With the previous system it
would have taken many hours of specialized manual work to disable the updates
for just these locales, and to make sure they didn't get updates for subsequent
Betas.&lt;/p&gt;
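&lt;p&gt;To give a feel for the kind of rule involved, here is a small Python
sketch. It is not Balrog's actual code or schema: the field names (priority,
channel, locales, mapping), the locale list and the release name are invented
for illustration. The idea is only that the highest-priority matching rule
wins, and a rule with no mapping means no update is offered.&lt;/p&gt;

```python
# Hypothetical sketch of rule-based update serving. The rule fields and
# values below are assumptions for this example, not Balrog's real schema.

RTL_LOCALES = {"ar", "fa", "he"}  # example right-to-left locales

RULES = [
    # Default beta rule: everyone on beta is offered the latest beta.
    {"priority": 90, "channel": "beta", "locales": None,
     "mapping": "Firefox-38.0b-example"},
    # Higher-priority rule: RTL locales on beta get no update for now.
    {"priority": 100, "channel": "beta", "locales": RTL_LOCALES,
     "mapping": None},
]


def matches(rule, channel, locale):
    """A rule matches if the channel is equal and the locale is allowed."""
    if rule["channel"] != channel:
        return False
    return rule["locales"] is None or locale in rule["locales"]


def find_update(rules, channel, locale):
    """Return the mapping of the highest-priority matching rule, or None."""
    candidates = [r for r in rules if matches(r, channel, locale)]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: r["priority"])
    return best["mapping"]


print(find_update(RULES, "beta", "en-US"))
print(find_update(RULES, "beta", "he"))
```

&lt;p&gt;Re-enabling updates for those users is then just a matter of removing the
higher-priority rule once the fix has shipped.&lt;/p&gt;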
&lt;p&gt;Once we were confident that Balrog was able to handle all previous traffic, we
&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1117962"&gt;shut down the old update server (aus3)&lt;/a&gt;.
aus3 was also one of the last systems relying on CVS (!! I know, rite?). It's a
great feeling to be one step closer to axing one more old system!&lt;/p&gt;
&lt;h3 id="funsize"&gt;&lt;a href="https://wiki.mozilla.org/ReleaseEngineering/Funsize"&gt;Funsize&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;When we started the quarter, we had an exciting new plan for generating partial
updates for Firefox in a scalable way.&lt;/p&gt;
&lt;p&gt;Then we threw out that plan and came up with an EVEN MOAR BETTER plan!&lt;/p&gt;
&lt;p&gt;The &lt;a href="http://rail.merail.ca/posts/taskcluster-first-impression.html"&gt;new architecture&lt;/a&gt;
for funsize relies on &lt;a href="https://pulse.mozilla.org/"&gt;Pulse&lt;/a&gt; for notifications
about new nightly builds
that need partial updates, and uses &lt;a href="http://docs.taskcluster.net/"&gt;TaskCluster&lt;/a&gt;
for doing the generation of the partials and publishing to Balrog.&lt;/p&gt;
&lt;p&gt;The current status of funsize is that we're using it to &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1118015"&gt;generate partial
updates for nightly builds&lt;/a&gt;,
but these are not yet published to the regular nightly update channel.&lt;/p&gt;
&lt;p&gt;There's lots more to say here...stay tuned!&lt;/p&gt;
&lt;h2 id="ftp-s3"&gt;FTP &amp;amp; S3&lt;/h2&gt;
&lt;p&gt;Brace yourselves... &lt;a href="http://ftp.mozilla.org/pub/mozilla.org/"&gt;ftp.mozilla.org&lt;/a&gt;
is going away...&lt;/p&gt;
&lt;p&gt;&lt;img alt="brace yourselves...ftp is going away" src="https://atlee.ca/posts/releng-retrospective-q1-2015/61319299.jpg"&gt;&lt;/p&gt;
&lt;p&gt;...in its current incarnation at least.&lt;/p&gt;
&lt;p&gt;Expect to hear MUCH more about this in the coming months.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; is that we're migrating as much of the Firefox build/test/release
automation to S3 as possible.&lt;/p&gt;
&lt;p&gt;The existing machinery behind ftp.mozilla.org will be going away near the end of Q3. We
have some ideas of how we're going to handle migrating existing content, as
well as handling new content. You should expect that you'll still be able to
access nightly and CI Firefox builds, but you may need to adjust your scripts
or links to do so.&lt;/p&gt;
&lt;p&gt;Currently we have most &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1100624"&gt;builds&lt;/a&gt;
and &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1117960"&gt;tests&lt;/a&gt;
doing their transfers to/from S3 via the &lt;a href="https://tools.taskcluster.net/index/artifacts/#/"&gt;TaskCluster index&lt;/a&gt; in
addition to doing parallel uploads to ftp.mozilla.org. We're aiming to shut off
most uploads to ftp this quarter.&lt;/p&gt;
&lt;p&gt;Please let us know if you have particular systems or use cases that rely on the
current host or directory structure!&lt;/p&gt;
&lt;h2 id="release-build-promotion"&gt;Release build promotion&lt;/h2&gt;
&lt;p&gt;Our &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1118794"&gt;new Firefox release
pipeline&lt;/a&gt; got off the
drawing board, and the initial proof-of-concept work is done.&lt;/p&gt;
&lt;p&gt;The main idea here is to take an existing build based on a push to
mozilla-beta, and to "promote" it to a release build. That means we need to
generate all the l10n repacks and partner repacks, generate partial updates,
publish files to CDNs, etc.&lt;/p&gt;
&lt;p&gt;The big win here is that it cuts our time-to-release nearly in half, and also
simplifies our codebase quite a bit!&lt;/p&gt;
&lt;p&gt;Again, expect to hear more about this in the coming months.&lt;/p&gt;
&lt;h2 id="infrastructure"&gt;Infrastructure&lt;/h2&gt;
&lt;p&gt;In addition to all those projects in development, we also tackled quite a few
important infrastructure projects.&lt;/p&gt;
&lt;h3 id="osx-test-platform"&gt;OSX test platform&lt;/h3&gt;
&lt;p&gt;10.10 is now the most widely used Mac platform for Firefox, and it's important
to test what our users are running. We &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1118183"&gt;performed a rolling upgrade&lt;/a&gt;
of our OS X testing environment, migrating from 10.8 to 10.10 while spending
nearly zero capital, and with no downtime. We worked jointly with the Sheriffs
and A-Team to green up all the tests, and shut coverage off on the old platform
as we brought it up on the new one. We have a few 10.8 machines left riding the
trains that will join our 10.10 pool with the release of ESR 38.1.&lt;/p&gt;
&lt;h3 id="got-windows-builds-in-aws"&gt;Got Windows builds in AWS&lt;/h3&gt;
&lt;p&gt;We saw the first successful builds of Firefox for &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1124303"&gt;Windows in
AWS&lt;/a&gt;
this quarter as well! This paves the way for greater flexibility, on-demand
burst capacity, faster developer prototyping, and disaster recovery and
resiliency for Windows Firefox builds. Before we roll these virtualized
instances out into production, we'll be working on making them more
performant and able to handle large-scale automation.&lt;/p&gt;
&lt;h3 id="puppet-on-windows"&gt;Puppet on windows&lt;/h3&gt;
&lt;p&gt;RelEng uses &lt;a href="https://puppetlabs.com/"&gt;puppet&lt;/a&gt; to manage our Linux and OS X
infrastructure. Presently, we use a very different tool chain, Active Directory
and Group Policy Object, to manage our Windows infrastructure. This quarter we
deployed a prototype Windows build machine which is &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1121023"&gt;managed with puppet&lt;/a&gt;
instead. Our goal here is to increase visibility and hackability of our Windows
infrastructure. A common deployment tool will also make it easier for RelEng
and the community to deploy new tools to our Windows machines.&lt;/p&gt;
&lt;h3 id="new-tooltool-features"&gt;New Tooltool Features&lt;/h3&gt;
&lt;p&gt;We've &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1133842"&gt;redesigned and
deployed&lt;/a&gt; a new version
of &lt;a href="http://code.v.igoro.us/posts/2015/04/tooltool-uploads.html"&gt;tooltool&lt;/a&gt;, the
content-addressable store for large binary files used in build and test jobs.
Tooltool is now integrated with RelengAPI and uses S3 as a backing store. This
gives us scalability and a more flexible permissioning model that, in addition
to serving public files, will allow the same access outside the releng network
as inside.  That means that developers as well as external automation like
TaskCluster can use the service just like Buildbot jobs.  The new
implementation also boasts a much simpler HTTP-based upload mechanism that will
enable easier use of the service.&lt;/p&gt;
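&lt;p&gt;For anyone unfamiliar with the term, content-addressable just means a file
is stored and fetched by a digest of its contents rather than by name. Here's a
minimal in-memory Python sketch of the idea, assuming SHA-512 digests; the
ContentStore class and its methods are invented for this example and are not
tooltool's API.&lt;/p&gt;

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        """Store bytes and return the digest that now identifies them."""
        digest = hashlib.sha512(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest):
        """Fetch bytes by digest, verifying them before returning."""
        data = self._blobs[digest]
        if hashlib.sha512(data).hexdigest() != digest:
            raise ValueError("stored content does not match its digest")
        return data


store = ContentStore()
key = store.put(b"pretend this is a large toolchain archive")
same_key = store.put(b"pretend this is a large toolchain archive")

print(key == same_key)  # identical content gets the same address
print(len(key))         # length of the hex digest used as the address
```

&lt;p&gt;A build job only needs to record the digest: any copy of the file that
hashes to the same value is, by definition, the right one.&lt;/p&gt;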
&lt;h3 id="centralized-posix-system-logging"&gt;Centralized POSIX System Logging&lt;/h3&gt;
&lt;p&gt;Using syslogd/rsyslogd and &lt;a href="https://papertrailapp.com/"&gt;Papertrail&lt;/a&gt;, we've set
up centralized system logging for all our POSIX infrastructure. Now that all
our system logs are going to one location and we can see trends across multiple
machines, we've been able to quickly identify and fix a number of previously
hard-to-discover bugs. We're planning on adding more logs (like Windows
system logs) so we can do even greater correlation. We're also in the process
of adding more automated detection and notification of some easily recognizable
problems.&lt;/p&gt;
&lt;h3 id="security-work"&gt;Security work&lt;/h3&gt;
&lt;p&gt;Q1 included some significant effort to protect against serious security
vulnerabilities like GHOST, privilege escalation bugs in the Linux kernel, etc. We manage 14
different operating systems, some of which are fairly esoteric and/or no longer
supported by the vendor, and we worked to backport some code and patches to
some platforms while upgrading others entirely. Because of the way our
infrastructure is architected, we were able to do this with minimal downtime or
impact to developers.&lt;/p&gt;
&lt;h3 id="api-to-manage-aws-workers"&gt;API to manage AWS workers&lt;/h3&gt;
&lt;p&gt;As part of our ongoing effort to &lt;a href="https://bugzil.la/965691"&gt;automate the loaning of releng
machines&lt;/a&gt; when required, we created an API layer to
facilitate the creation and loan of AWS resources, which was previously, and
perhaps ironically, one of the bigger time-sinks for buildduty when loaning
machines.&lt;/p&gt;
&lt;h3 id="cross-platform-worker-for-task-cluster"&gt;Cross-platform worker for task cluster&lt;/h3&gt;
&lt;p&gt;Release engineering is in the process of migrating from our stalwart,
buildbot-driven infrastructure, to a newer, more purpose-built solution in 
&lt;a href="http://docs.taskcluster.net/"&gt;taskcluster&lt;/a&gt;. Many FirefoxOS jobs have
already migrated, but those all conveniently run on Linux. In order to support
the entire range of release engineering jobs, we need support for Mac and
Windows as well. In Q1, we created what we call a "generic worker," essentially
a base class that allows us to extend taskcluster job support to non-Linux
operating systems.&lt;/p&gt;
&lt;h2 id="testing"&gt;Testing&lt;/h2&gt;
&lt;p&gt;Last, but not least, we &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1131269"&gt;deployed initial support&lt;/a&gt; for
&lt;a href="https://elvis314.wordpress.com/2015/02/06/seta-search-for-extraneous-test-automation/"&gt;SETA&lt;/a&gt;,
the search for extraneous test automation!&lt;/p&gt;
&lt;p&gt;This means we no longer run every test on every build. Instead, we use
historical data to identify the tests that have been catching the
most regressions. Other tests are run less frequently.&lt;/p&gt;</description><category>aws</category><category>balrog</category><category>firefox</category><category>ftp</category><category>funsize</category><category>mozilla</category><category>s3</category><category>taskcluster</category><category>updates</category><guid>https://atlee.ca/posts/releng-retrospective-q1-2015/</guid><pubDate>Mon, 20 Apr 2015 11:00:00 GMT</pubDate></item></channel></rss>