Merge queues: An intro for mobile engineers
As a mobile engineer, it’s quite possible you’ve never used, or even heard of, merge queues. This might be due in part to the fact that mobile teams traditionally move a bit slower than their web counterparts (merge queues are more often employed by teams who integrate code at breakneck pace), and because mobile teams tend to isolate production-ready code more protectively (merge queues are extra helpful for teams who are more trunk-based). But merge queues aren’t all that esoteric, even in the mobile context, and they can help solve problems that you and your mobile team might very well face.
In a sentence: combining the individual work of a big team of engineers into a single codebase is hard. And the difficulty lies deeper than your standard merge conflicts, which version control is reasonably good at dealing with. There’s a much more troublesome kind of conflict you run into when simultaneous changes are being made to multiple areas of code that depend on each other. Imagine you add a function call in some new code while your teammate is busy refactoring that same function, changing its signature or return values. Neither version control nor the tests and checks that CI runs on each individual branch can catch this kind of conflict, but merge queues can. Smaller and slower-moving teams usually handle it by hand but, like many other specialized tools, merge queues really come into their own for larger, fast-moving development teams.
Whatever the case, as your mobile team grows and you explore ways to deploy more continuously, you might find that merge queues become more relevant. So, in this post we’ll help you prepare for that, with a comprehensive intro to merge queues.
To understand what merge queues are, we first need to understand the fundamental problem they solve. Let's imagine we're working on a project and are looking to merge two pull requests, from branches feature1 and feature2, into our main branch.
From a CI perspective, these three branches are all currently “green”, meaning that they each individually represent a fully working revision of this repo that passes all tests and checks. And, from a code perspective, feature1 and feature2 are both in a mergeable state relative to main.
But this doesn't mean that these pull requests are safe to merge, as strange as that may sound. The problem we face here is that even though we're 100% sure that these two PRs are individually functional and compatible with the current state of the main branch, it's very possible (and frankly quite common) that these PRs are incompatible with each other. This incompatibility isn’t textual – it’s not the type of incompatibility that would cause a merge conflict. Rather, it’s an incompatibility in functionality. Each branch does represent a functional version of the repo, but only in isolation, relative to the current state of the branch it's targeting (main). If both are applied, the target branch could break, even though there are technically no merge conflicts: tests could now fail or the code could fail to compile outright, for example if feature1 has added code that calls a method whose signature feature2 has changed, or which feature2 has removed altogether.
Here's an example that could cause this:
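As a minimal, hypothetical sketch (the file names, function names and the tiny `ci_passes` stand-in for CI are all illustrative assumptions, not taken from a real project), suppose feature1 adds a new caller of `format_price` while feature2 makes a second argument mandatory. The two PRs touch different files, so there is no textual merge conflict. In a compiled language such as Swift or Kotlin the mismatch would surface as a compile error; in this Python sketch it surfaces as a `TypeError` when the new caller runs.

```python
# Hypothetical repo contents, modeled as {filename: source}.
MAIN = {
    "pricing.py": "def format_price(amount):\n    return f'${amount:.2f}'\n",
}

# feature1 adds a new file that calls format_price(amount).
FEATURE1_CHANGES = {
    "cart.py": "def cart_label(amount):\n    return 'Total: ' + format_price(amount)\n",
}

# feature2 changes the signature: currency is now a required argument.
FEATURE2_CHANGES = {
    "pricing.py": "def format_price(amount, currency):\n    return f'{amount:.2f} {currency}'\n",
}


def merge(base, *changes):
    """Apply file-level changes on top of base, later changes winning per file."""
    result = dict(base)
    for change in changes:
        result.update(change)
    return result


def ci_passes(repo):
    """Stand-in for CI: load every file into one namespace, then call cart_label if it exists."""
    namespace = {}
    try:
        for source in repo.values():
            exec(source, namespace)
        if "cart_label" in namespace:
            namespace["cart_label"](9.99)
        return True
    except TypeError:
        return False


feature1_alone = ci_passes(merge(MAIN, FEATURE1_CHANGES))              # True
feature2_alone = ci_passes(merge(MAIN, FEATURE2_CHANGES))              # True
both_merged = ci_passes(merge(MAIN, FEATURE1_CHANGES, FEATURE2_CHANGES))  # False
print(feature1_alone, feature2_alone, both_merged)
```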
Both of these pull requests could be valid when looked at individually, but when applied together, one would cause the other to fail to compile. This is called a semantic merge conflict, and it’s something that’s very common in larger teams and codebases. These conflicts are often caused by the natural difficulty of coordination between the different, distributed feature teams that make up a large org: while one team is adding some code that depends on a given system, another team might be simultaneously changing the behavior of said system. This situation is unlikely to cause issues for small teams, as their local branches are more easily kept up-to-date, but for large teams with a high volume and frequency of pull requests, semantic merge conflicts can wreak havoc. Each feature team believes they're in possession of something that works, until pull requests from multiple teams land and the combination of their code breaks the main branch.
Semantic conflicts are really difficult to resolve after the fact, and their consequences are felt throughout the entire development lifecycle. Not only are further feature contributions stalled and pull requests blocked due to the main branch's broken state, but the release cadence of the app might also be at risk. Delaying a scheduled release due to the inability to compile the latest version of the main branch is a nightmare, but a quick fix by a harried developer under pressure to make the branch "green" might introduce even more problems, possibly leading to severe issues slipping into the release.
Part of the blame for semantic conflicts lies with the tools involved. Git (or any other version control system) can tell you whether or not your branch conflicts with another branch, code-wise, but it cannot determine if the resulting compiled app will actually work, or if you’ll even be able to successfully compile the code in the first place. On the CI side, although the general problem of "check if X works" can be solved for a specific branch or revision, CI isn’t able to pre-emptively perform those same checks on a hypothetical future state of code that results after multiple PRs are merged; CI lacks awareness of the bigger picture. So, in order to give CI that awareness of the fact that parallel branches and simultaneous PRs might semantically conflict with each other, we need to implement a new tool: a merge queue.
Merge queues operate on a rather simple premise: neither Git nor CI was wrong in saying that our feature1 and feature2 branches (from the example above) were, at one point in time, "green" in relation to main. But those “green” states are transient and entirely dependent on the state of main just before merging each respective PR. Therefore, any time the main branch changes, we must invalidate and reevaluate the compatibility and functionality of every outstanding PR.
GitHub and most other source control platforms do have a setting that prevents out-of-date pull requests from being merged, but for large teams moving lots of code, this alone is not enough. If you have many contributors opening a whole bunch of PRs at the same time, everyone will end up wasting hours and hours in a losing battle trying to keep their branches up to date as the main branch changes. And that’s where merge queues come in: to make this whole process work for large, fast-moving teams, we need a tool that can automate the task of keeping many converging workstreams updated and validated.
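One way to picture this premise is to treat a "green" result as valid only for the exact revision of main it was tested against. The sketch below is an illustration under that assumption; the `CheckResult` type, the `is_still_green` helper and the commit identifiers are hypothetical, not part of any real platform's API.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    pr: str
    tested_against: str  # revision of main that CI ran against (hypothetical ID)
    passed: bool


def is_still_green(result: CheckResult, current_main: str) -> bool:
    """A passing result only counts while main is still at the tested revision."""
    return result.passed and result.tested_against == current_main


feature2_result = CheckResult(pr="feature2", tested_against="abc123", passed=True)

# While main is unchanged, the result can be trusted.
print(is_still_green(feature2_result, current_main="abc123"))  # True

# Once another PR lands and main moves on, the old result is stale.
print(is_still_green(feature2_result, current_main="def456"))  # False
```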
Essentially, a merge queue replaces the regular “merge” button on a PR and instead requires you to submit your pull request to a serial queue of "things that are ready to be merged" (other PRs). Whenever a PR enters the merge queue, the tool:
- Automatically updates the PR’s branch with the state of the main branch that would result from merging all PRs ahead of this one in the merge queue.
- Re-runs all required CI checks, regardless of whether they were already in a “green” state before the PR was submitted to the merge queue.
If all checks pass on the fully updated version of the PR’s branch, then the PR can be merged when it reaches the front of the queue. If the checks don't pass, then the author of the PR in question is notified of the semantic conflict and required to update their branch with fixes for the conflicts. This troublesome PR would hold up the rest of the merge queue behind it, so usually the PR is actually kicked out of the queue. It would need to be resubmitted to the merge queue after it is fixed. And so, because the merge queue automatically prevents outdated or incompatible branches from being merged, we can guarantee that the main branch will always be “green” even with a huge volume of changes being directed at it.
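The queue behavior described above can be sketched in a few lines. This is a deliberately simplified, strictly serial model, and real merge queues typically run checks for several queued PRs in parallel. The `process_queue` function, the representation of main as a list of change names, and the `run_checks` rule are all assumptions made for illustration.

```python
from collections import deque


def process_queue(main, queue, run_checks):
    """Serially process queued PRs against main plus everything merged ahead of them.

    main:       list of change names already on the main branch
    queue:      PR names in the order they were submitted
    run_checks: callable taking a list of changes and returning True if CI passes
    Returns (new_main, merged, ejected).
    """
    pending = deque(queue)
    merged, ejected = [], []
    while pending:
        pr = pending.popleft()
        candidate = main + [pr]  # the PR rebuilt on top of the up-to-date main
        if run_checks(candidate):
            main = candidate     # checks passed on the updated state: merge
            merged.append(pr)
        else:
            ejected.append(pr)   # kicked out; the author must fix and resubmit
    return main, merged, ejected


def run_checks(changes):
    # Hypothetical rule: the build breaks once feature1 and feature2 are both present.
    return not {"feature1", "feature2"} <= set(changes)


new_main, merged, ejected = process_queue(
    ["base"], ["feature1", "feature2", "feature3"], run_checks
)
print(new_main)  # ['base', 'feature1', 'feature3']
print(merged)    # ['feature1', 'feature3']
print(ejected)   # ['feature2']
```

Because feature2 is only ever tested on top of a main that already contains feature1, the conflict is caught before it lands, and feature3 is not held up behind it.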
When it comes to actually implementing a merge queue, different options exist, generally dependent on which source control platform you're using. Here are some of the common options:
GitHub merge queues
GitHub offers a native merge queue solution, but at the time of writing it's only available in a limited public beta. This is potentially the best solution for GitHub users as it's implemented under the amazing GitHub Actions umbrella, but for now you need to be one of the lucky ones selected for the beta.
GitLab “merge trains”
Like GitHub, GitLab also has its own native merge queue solution, which they call merge trains. Unlike GitHub's, it's already publicly available (but only for Premium users). If you're already a Premium GitLab user, this is a very easy way to get started with merge queues, as no additional tooling is required.
Bors
If your team isn’t down with cloud tools and would rather host everything yourselves, then you might be interested in the open-source Bors project. Bors is a self-hosted, GitHub-compatible merge queue tool. Users can communicate with the tool through a bot, and you have full control over its features and capabilities. I have used Bors myself in the past and had a good experience – just keep in mind that, given it’s OSS, you might be on your own if you face any serious issues.
In-house merge queues
It’s not too uncommon for teams at scale to roll their own merge queue implementation! This can be an attractive approach for teams who want total control over exactly how the queue functions and is interacted with, and it often allows for tighter coupling with specific inputs and parameters that govern the pass/fail checks in the queue. Shopify and Strava have written about their experiences building out in-house merge queue tools and might serve as inspiration if you’re considering the same.
As your team grows, your project becomes increasingly complicated, and you look to develop and deploy even more continuously, it’s natural to run up against the limits of what basic Git and CI setups were ever intended to handle. Hopefully, leveraging advanced tools like merge queues can keep your development process running smoothly, even with an ever-expanding number of features and contributors. Have you come across other ways to de-risk development and release cycles for complex products? Get in touch and let us know!