Next generation job scheduling
As coop mentioned, we had a really great brainstorming session on Tuesday about the kinds of things we'd like to do with job scheduling in the RelEng infrastructure.
Our idea is to implement a "job graph", which will be a representation of a set of jobs to run and the dependencies between them. For example, right now we have a set of tests that depend on builds finishing, and l10n repacks that depend on the en-US nightly build finishing. These graphs are implicit in our buildbot configs right now, and are pretty inflexible, opaque and hard to test.
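To make the idea a bit more concrete, here is a minimal sketch of what an explicit job graph could look like. Nothing here is a settled design: the job names, the dict-of-sets representation, and the helpers `run_order` and `ready_jobs` are all hypothetical, chosen only to illustrate jobs and the dependencies between them. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each job maps to the set of jobs it depends on.
# The job names are made up for illustration.
job_graph = {
    "build-en-US-nightly": set(),
    "test-mochitest": {"build-en-US-nightly"},
    "test-reftest": {"build-en-US-nightly"},
    "l10n-repack-de": {"build-en-US-nightly"},
}


def run_order(graph):
    """Return the jobs in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


def ready_jobs(graph, finished):
    """Return, sorted by name, the unfinished jobs whose dependencies are all done."""
    return sorted(
        job
        for job, deps in graph.items()
        if job not in finished and deps <= finished
    )
```

With a graph like this, `ready_jobs(job_graph, set())` would return only the nightly build, and once that build is in the finished set, the tests and the l10n repack become ready.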
One of our design goals for any new system or improvements is to make this job graph explicit, and to have it checked into the tree. This has a few really nice features:
- Makes it easier for developers to modify the set of jobs that run on their branch or push.
- Other tools like try chooser and self-serve can use this information to control what jobs get run.
- The sets of builds and tests running on branches follow merges. This is really helpful for our 6-week uplifts.
- It will be possible to predict the set of builds and tests that would happen for a push in advance. This isn't possible right now without horrible hacks.
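As a rough illustration of the last point, here is a sketch of how a tool might predict the jobs for a push from a graph definition that lives in the tree. The JSON format, the `requires` and `branches` keys, the job names, and the function `predict_jobs` are all assumptions made up for this example, not a format we've decided on.

```python
import json

# Hypothetical in-tree graph definition; the format is an assumption.
GRAPH_TEXT = """
{
  "build-linux64": {"requires": []},
  "test-xpcshell": {"requires": ["build-linux64"]},
  "l10n-repack": {"requires": ["build-linux64"],
                  "branches": ["mozilla-central"]}
}
"""


def predict_jobs(graph_text, branch, requested=None):
    """Return the sorted job names that a push to `branch` would trigger.

    `requested` optionally narrows the set, the way a try chooser
    selection might; dependencies of the selected jobs are pulled in.
    """
    graph = json.loads(graph_text)
    # Jobs without a "branches" key are assumed to run on every branch.
    enabled = {
        name
        for name, spec in graph.items()
        if "branches" not in spec or branch in spec["branches"]
    }
    pending = list(enabled if requested is None else set(requested) & enabled)
    selected = set()
    while pending:
        job = pending.pop()
        if job in selected:
            continue
        selected.add(job)
        # Walk the dependencies so that prerequisites are included too.
        pending.extend(graph[job]["requires"])
    return sorted(selected)
```

Because the definition is just a file in the tree, the same function could serve a developer asking "what will run if I push this?" and a tool like try chooser narrowing the set with `requested`.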
Our plan is to implement the graph parser and generator first so we can validate some of our assumptions, and make sure we can generate job graphs equivalent to what exists now. After we have that working, we can focus on integrating the new job graphs with the existing infrastructure.
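One way the equivalence check could work, assuming both the existing setup and the new generator can be reduced to a plain mapping of job name to dependencies, is a simple diff of the two graphs. The function `graph_diff` and the mapping format are hypothetical and only meant as a sketch.

```python
def graph_diff(existing, generated):
    """Compare two job graphs given as {job: iterable_of_dependencies}.

    Returns the jobs missing from the generated graph, the jobs that
    only exist in the generated graph, and the jobs whose dependencies
    differ between the two.
    """
    existing_jobs = set(existing)
    generated_jobs = set(generated)
    return {
        "missing": sorted(existing_jobs - generated_jobs),
        "extra": sorted(generated_jobs - existing_jobs),
        "changed": sorted(
            job
            for job in existing_jobs & generated_jobs
            if set(existing[job]) != set(generated[job])
        ),
    }
```

If all three lists come back empty, the generated graph matches the existing one under this (simplified) notion of equivalence.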