Build vs Buy: A retrospective on Etsy's in-house 'Ship' release platform
with Sasha Friedenberg, Ship co-lead
In 2016, Etsy did something radical. They took a step back, acknowledged that their existing mobile app release process was inefficient and costing the business, and decided to devote headcount and resources to building something called Ship — a homegrown solution to coordinate scheduling, deploy tools, and communication of status and tasks related to releasing their native apps.
Etsy was a pioneer in recognizing a problem area that companies have often been slow to acknowledge or tackle well. Like many teams (even plenty of smaller ones), Etsy had already invested considerable time and resources in setting up and maintaining complex CI/CD pipelines and other partial solutions. And yet, their mobile releases were still an “event,” each and every time (especially in comparison to the relatively ‘solved’ web platform). With the company newly committed to investing in apps as a huge growth opportunity, various members of the engineering org found themselves questioning and challenging an existing process that was reaching a breaking point.
Ship ended up getting built over the course of a couple of years with buy-in from leadership and the dedicated time and attention of several key players in the engineering org. It remains in use today and has been seen as a clear win by the mobile team and the broader organization, even from its earliest, MVP stages.
But how has the Ship project aged? Would Etsy set aside the time and resources and build it all over again? We talked to Sasha Friedenberg, a key driver of the Ship initiative during its creation, to hear his thoughts on the specific environment that allowed for the genesis of Ship, on how Ship was built, and on the broader challenges of mobile releases in 2022.
“It was chaotic, and we had a clear need for something better. There was this huge contrast with how stuff would ship on web, which was a piece of cake.”
Before starting on the Ship initiative, Etsy’s mobile release process was similar to what you’d find at most companies of a similar size and maturity. At the time, Sasha served as Release Manager for Etsy’s buyer apps (another release manager, Jen, was responsible for Etsy’s Sell apps), and for each biweekly release he would find himself juggling several responsibilities, including:
- Sending out reminder emails to particular people at key points
- Organizing testing sessions
- Running various bash scripts that could vary depending on the type of release
- Manually submitting to the app stores
- Communicating status to the larger product and engineering org
- Acting as mediator when responsibilities were unclear or opinions clashed
- Serving as final decision maker for everything
The rest of the team was feeling some of this pain as well; confusion and noise reigned. Everyone received constant emails about every step of the release process, whether they were an engineering manager who only wanted to stay roughly apprised of overall progress or an individual contributor on the development team who needed timely reminders to perform specific tasks. With so much noise, it was understandably difficult for anyone to pick out the callouts that were actually relevant to them.
In late 2015, a number of factors came together to create a perfect storm of unsustainability. As a company, Etsy knew that its mobile apps were positioned to become a huge driver of growth once their development teams were empowered to iterate quickly (and with a roadmap newly decoupled from product strategy on web). A new CTO, John Allspaw, had recently been promoted, bringing a DevOps background with him, and the engineering org was embracing his war on alert numbness and fatigue. And finally, Jen, the other apps’ release manager, left Etsy for a new role.
Suddenly, Sasha was juggling release responsibilities for two apps on two platforms each, and the problem became even more acute. Increasing headcount wasn’t a preferred solution at the time. The team briefly explored commercial solutions, but the answer there was pretty simple: nothing existed to coordinate mobile app teams and cater to the unique challenges that native app releases presented. They also reached out to other large engineering orgs to see how others were handling the same issues, and found that no one had tried to put together a real solution.
So, Etsy decided to tackle the space themselves. Perhaps the final (and key) factor that allowed for such a daunting, resource-intensive effort was the unique nature of Etsy as a company and a business. Of course, the company is well-known for building a $10bn marketplace that prizes and celebrates those with a do-it-yourself mentality — but, surprisingly, this philosophy extends to the organization as a whole, and particularly to the engineering culture. Etsy was uniquely motivated to try to build the solution themselves. And so they did.
Engineering the solution
“We sketched out the release state diagram. And I remember hearing an engineer say, ‘That looks pretty complicated’. And I was like, ‘Yep, it’s complicated. This is what I’ve been dealing with for the last year…’ There was a lot of validation in seeing the complexity of what I had had to manage in my mind.”
A group naturally came together to start working on Ship, consisting of Sasha and a few other interested engineers, including one who had worked on an existing mobile deploy tool that was essentially a wrapper for several bash scripts. Mapping out the scope of the problem made it clear to everyone that this went far beyond the purview of a traditional CI/CD pipeline, or even a creative one. There had to be a single source of truth, and it had to be built from the ground up to make cross-team communication as efficient as possible. So a formal App Delivery Engineering team was commissioned with these key objectives in mind.
Over the course of the next year, Ship went from a first proof-of-concept (reading GitHub, no concept of states) to a second (state machine, sending emails to the Ship team) to an MVP (sending out wider notifications and combining with existing delivery automation). The Ship team knew they had something valuable on their hands when they started getting feature requests and bug reports from people outside of their circle; teams had started to consider Ship a critical part of their workflow.
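To make the "state machine plus notifications" idea concrete, here is a minimal sketch of what such a release tracker could look like. It is not Ship's actual implementation: the state names, the transition table, and the `Release` class are all hypothetical, and a plain callback stands in for email delivery.

```python
from enum import Enum, auto


class ReleaseState(Enum):
    # Hypothetical states; the article does not list Ship's actual ones.
    BRANCH_CUT = auto()
    TESTING = auto()
    SUBMITTED = auto()
    IN_REVIEW = auto()
    RELEASED = auto()
    REJECTED = auto()


# Allowed transitions: each state maps to the states it may move to.
TRANSITIONS = {
    ReleaseState.BRANCH_CUT: {ReleaseState.TESTING},
    ReleaseState.TESTING: {ReleaseState.SUBMITTED},
    ReleaseState.SUBMITTED: {ReleaseState.IN_REVIEW},
    ReleaseState.IN_REVIEW: {ReleaseState.RELEASED, ReleaseState.REJECTED},
    ReleaseState.REJECTED: {ReleaseState.TESTING},
    ReleaseState.RELEASED: set(),
}


class Release:
    def __init__(self, version, notify):
        self.version = version
        self.state = ReleaseState.BRANCH_CUT
        self._notify = notify  # callback standing in for email delivery

    def advance(self, new_state):
        """Move to new_state if the transition is allowed, then notify."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.version}: cannot go from {self.state.name} to {new_state.name}"
            )
        old_state, self.state = self.state, new_state
        self._notify(f"{self.version}: {old_state.name} -> {new_state.name}")


# Usage: collect notifications in a list instead of sending real emails.
sent = []
release = Release("5.1.0", sent.append)
release.advance(ReleaseState.TESTING)
release.advance(ReleaseState.SUBMITTED)
```

Keeping the allowed transitions in one table means the release has a single source of truth for where it is and what can happen next, and every status message is generated from a state change rather than written by hand.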
The second year of Ship’s development added the concept of rotating release drivers (an IC selected from a pool to monitor a given release), templated emails with detailed instructions for tasks, and branching and further automation, all important features that brought even more value to the platform. But there were still pieces of functionality that always needed manual attention, and there was an increasingly complex state machine to maintain. And, gradually, there were fewer and fewer people and hours available to devote to Ship.
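The rotating release drivers and templated task emails mentioned above could be sketched roughly as follows. This is an illustration only: the article does not say how Ship picked a driver, so the round-robin rule is an assumption, and the pool names, the template wording, and the `pick_driver` and `render_task_email` helpers are hypothetical.

```python
from string import Template

# Hypothetical pool of individual contributors eligible to drive a release.
DRIVER_POOL = ["alex", "priya", "sam"]

# Hypothetical email template; the task wording is illustrative only.
TASK_EMAIL = Template(
    "Hi $driver,\n"
    "You are the release driver for $app $version.\n"
    "Next task: $task\n"
)


def pick_driver(release_number, pool=DRIVER_POOL):
    """Round-robin (an assumption): release N goes to the N-th person, wrapping around."""
    return pool[release_number % len(pool)]


def render_task_email(release_number, app, version, task):
    """Fill the template for whoever is driving this release."""
    driver = pick_driver(release_number)
    return TASK_EMAIL.substitute(driver=driver, app=app, version=version, task=task)


print(render_task_email(4, "Buyer app", "5.2.0", "Confirm the test session is scheduled"))
```

Spreading the driver role across a pool and putting the instructions in the email itself is one way to move release knowledge out of a single release manager's head.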
Looking back at Ship
Etsy’s ‘do it yourself’ engineering culture had produced a win here, and Ship was seen as a success and value-add by the larger engineering org. Yet, even for a company culture that was determined to build something helpful in-house, there was a consistent question about whether it made sense to continue allocating engineering and product resources to an internal tool.
To be sure, Sasha looks back at the Ship project and team as something he’s immensely proud to have been a part of, and as something only possible thanks to the ethos that had existed at Etsy as a whole and within their engineering org in particular. He credits Ship on a personal level for having allowed him to transition into an engineering role and take his career to the next level.
But at an organizational level, a company’s decision to invest more than two years of time and resources into something like Ship feels like more of a gray area to Sasha today:
“If this had been another engineering org and there had been a tool available, I think the right decision would have been to use that tool… to not try to solve this problem yourself and instead take advantage of folks who really specialize in solving it.”
The Build vs Buy decision in 2022
Today, lots of companies face a decision point similar to the one that confronted Etsy pre-Ship. Mobile has followed (and arguably surpassed) web in establishing itself as a platform through which quick and confident iteration leads to huge business advantages. But the steps companies have traditionally taken to invest in web platform development (more resources, better tooling) don’t translate seamlessly to the mobile domain: you can’t execute true continuous delivery, mobile tends to require a more cross-functional execution effort, and the inherent costs of coordinating tasks among different team members and multiple necessary parts of the mobile process add up each and every time you release.
Engineering orgs understand that with enough time and resources, you can build a solution to almost any problem. But with limited resources, the obvious challenge becomes allocating them efficiently to meet core business goals. Time and people devoted to release process and tooling are time and people not devoted to your product, or more broadly to the activities that most directly drive value. (This tradeoff actually confronts teams much sooner than they realize, well before Etsy scale.)
Building out tooling and systems in-house also creates a design dilemma for teams and the DevOps engineers involved, resulting in an ongoing conceptual tug of war. Should you build in a way that streamlines the team’s existing, bespoke process (including some of its idiosyncrasies)? Or should you take a clean-slate approach, offering an improved workflow that draws from tried-and-tested industry best practices and evolves along with them? Ultimately, the Ship approach became a bit of a hybrid, which wasn’t always ideal.
All of the above is part of the compelling challenge in front of teams like Sasha’s, and Sasha is grateful to have had the opportunity to work on a solution, one that lives on at Etsy today and remains core to how they release each app, every time. But he notes that, ultimately, when you’re part of an engineering and product team that dedicates much of its time to thinking about clever solutions around process, there’s an opportunity cost: that time could have gone to clever solutions to business challenges instead.
“I think what a mobile app team is trying to do is ship a product. I don’t think they’re trying to figure out how to ship a product. Let someone else help you do that rather than trying to solve that yourself.”
At Runway, we too believe it makes more sense for engineering and product teams to be experts at solving business challenges, rather than experts at process. As a company, do you want to devote precious headcount to hiring people to think about your mobile release process full-time: keeping track of shifting requirements from Apple and Google, managing each release, maintaining tooling and automation, and manually overseeing task management and collaboration? Or you could rely on your existing engineering team to be responsible for those tasks, too. But which team member should drop their work to manage a given release or process change, or to fix a bug in release automation or tooling? What could they have been working on instead? And how would getting that feature out faster affect your business?
Six years later, there are real lessons to be learned from the Ship experience. But the landscape has changed, and teams today have more choices. We’ve seen that teams that use Runway are able to more fully devote time and resources to real value-creating work for their businesses, and waste less time with the hassles, big and small, that come with managing app releases. And now, crucially, they can do this without taking on the enormous time, expense, and ongoing maintenance that building significant automation and tooling in-house entails — let alone a fully-featured release platform on the order of Ship. In a nutshell, this is exactly why we set out to build Runway.