Avoiding release anxiety, part I

Testing, automation, and damage control

I’ve written many pieces about the (mobile) release process and product management from a technical point of view but, in my experience, there is a strong emotional component that is often less talked about. This more human side of things can be what makes or breaks a person's interest in and excitement about working on a product, so I figured it was time to write specifically about this other side. I'd like to share a few thoughts on something we’re probably all very familiar with — how anxiety and stress can emerge in a tech org, especially when you’re working so frenetically to deliver something that you end up burning yourself out for weeks.

It's hard to do everything completely ‘by the book’ when working on a fast-paced project, especially when it comes to product quality. We know from experience that these projects always require concessions in order to deliver the product: technical compromises such as not using the ideal architecture, not testing things the way you should, not compiling a proper changelog as you go, not reviewing pull requests as meticulously as you would have liked, and so on. Eventually, you’ll find yourself wondering: how many compromises are too many? Sure, you can take a few shortcuts to deliver a project, but the more best practices you abandon, the less you can guarantee that what you delivered actually works. So where do you draw the line?

To make matters worse, if you're not keeping these concessions in check, you might enter into an infinite cycle of disaster. The more features you add without properly controlling the quality and processes of your project, the worse your codebase will become, which means you'll probably have to constantly push hotfixes to production. And those hotfixes were probably pushed after hours, under constant stress and with little quality control, leading to even more issues which will lead to an even worse codebase – and so on. This endless loop can easily contribute to a sense of despair that ends up lasting for the project's entire lifetime.

Cycle of disaster

I call this Release Anxiety, and in my experience, it can completely destroy an engineer's interest in a product. Engineering can and should be fun, not soul-crushing. So I'd like to offer up a few ideas on how you might break the disaster cycle, reducing release anxiety, increasing trust in your project and team, and hopefully leading to less stressful days.

Effective testing mechanisms

Now, you might think "of course testing makes your life less stressful", and that's true. But what I want to talk about is not the act of testing itself, but how you decide to test things, and more importantly, what you should be testing.

I think testing is something that a lot of people tend to get wrong in the beginning of their careers. I see, time and time again, engineers choosing to focus on maximizing control metrics such as code coverage without ever asking themselves: Am I testing the things that actually need to be tested in this product?

The truth is that “code coverage” doesn't mean anything on its own, and there's no single metric that can accurately tell you if your code is being tested properly. It seems that we as a tech community have created an expectation that you should unit test your code as much as possible, but what I've found with experience is that each form of testing has its own specific purpose. You don't need to unit test everything; the most efficient approach is to combine unit testing with other forms of testing to build a more complete test suite.

UI testing

I see UI Testing (sometimes referred to as end-to-end testing) as one of the most undervalued tools in the mobile world. When my mother orders food through an app, she doesn't care if the margin of the title label is wrong or if the RGB color of the buy button is slightly off; the only thing that matters is whether she can order her food. She needs to be able to select a restaurant, choose a dish and pay. So, what better way to confirm that the app works than a faithful representation of what the user will actually experience as they use the app?

Although UI tests don’t completely replace other forms of testing, they're extremely useful as a complement to unit tests, to confirm that the user can navigate through the product's most important flows without it blowing up in their face. I tested this theory for the first time several years ago, with a now-extinct Brazilian food delivery app called Rapiddo, by gathering all the flows that we deemed important in the iOS app (basically everything involving purchases) and covering those flows with several forms of testing, but mainly UI testing. The result was that, while the Android version of the app required at least one hotfix every couple of weeks, the iOS platform required only two hotfixes in two years. Of course the iOS app did have the occasional UI glitch or minor functionality problem, but nobody ever called our customer support to report that the app was outright not working.

What made the difference was not the mere presence of the tests themselves, but our deliberation in thinking through exactly what should be tested. One pattern I see with unit tests is people tend to over-test: they test the full functionality of a lower-level component, but they also fully test the functionality of a higher-level component that is also leveraging this lower component. If this lower component can’t be swapped for another, then it's not necessary to test it individually – the functionality of the higher-level one guarantees that the lower one is also working. That's why I find UI Testing to be so useful – if you can navigate through your critical flows, then it's likely that the core components attached to it are working just fine.

Block-based testing

With time our approach evolved, and today my personal preference is to apply one of the more advanced techniques we came up with that I call block-based testing.

In a basic UI test, we create test methods where we instruct the app to go through a flow and perform checks during this process to see if everything is in place. Although this worked just fine, I wasn't really confident in the coverage these tests gave us. It can be hard to test alternative flows without duplicating steps, and the test itself can be quite hard to read, which can introduce other problems.

A hard-to-read test

In a "block-based" test, the dynamic is slightly different. In this concept, before writing the test itself, I try to write small blocks of tests that represent the exploration of a specific screen. These methods assume that the user is ready to enter the screen, and after doing so, the method checks that the screen works correctly and ends the exploration by going back to the previous one. If one screen can navigate deeply into another one, we recursively design the test so that each block calls the blocks from each screen it could possibly navigate to. Since the "blocks" end by returning to their previous screen, we could construct different testing scenarios by simply queueing these blocks together like LEGO bricks.

A block-based test

This was also helpful for testing alternative flows. If I wanted to ensure that a user being logged out wouldn’t affect the functionality of a given screen, I could add a launch argument to the test and run the same blocks I was already running for the "logged in" variation of the same test.

A block-based test accommodating a launch argument

In our experience, this type of test is very cheap to develop and maintain, because even if you have to write a completely new test, chances are that you'll only need to code the block of whatever new screen you're testing. The test itself will only queue up different testing blocks – most of which have likely already been written.
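To make the idea concrete, here is a minimal sketch of how blocks can compose. It is written in Python against a made-up `FakeApp` driver purely for illustration; a real suite would drive the actual app through a UI testing framework, and the screen names, the `LOGGED_OUT` launch argument and all function names here are assumptions, not the code we used.

```python
from dataclasses import dataclass, field


@dataclass
class FakeApp:
    """Hypothetical stand-in for a UI test driver with a navigation stack."""
    launch_arguments: set = field(default_factory=set)
    stack: list = field(default_factory=lambda: ["home"])
    visited: list = field(default_factory=list)

    def open(self, screen):
        self.stack.append(screen)
        self.visited.append(screen)

    def back(self):
        self.stack.pop()

    @property
    def current(self):
        return self.stack[-1]


def explore_dish(app):
    """Block: enter the dish screen, check it, go back to where we came from."""
    app.open("dish")
    assert app.current == "dish"
    app.back()


def explore_restaurant(app):
    """Block: explore a restaurant, calling the blocks of screens it leads to."""
    app.open("restaurant")
    assert app.current == "restaurant"
    explore_dish(app)  # nested block; it returns us to the restaurant screen
    assert app.current == "restaurant"
    app.back()


def explore_profile(app):
    """Block: explore the profile screen and return."""
    app.open("profile")
    assert app.current == "profile"
    app.back()


def run_scenario(blocks, launch_arguments=()):
    """Queue blocks together; each one must end on the screen it started from."""
    app = FakeApp(launch_arguments=set(launch_arguments))
    for block in blocks:
        block(app)
        assert app.current == "home"
    return app.visited


CORE_FLOW = [explore_restaurant, explore_profile]
logged_in = run_scenario(CORE_FLOW)
# Same blocks, different launch argument: the "logged out" variation.
logged_out = run_scenario(CORE_FLOW, launch_arguments=["LOGGED_OUT"])
```

The important property is that every block leaves the app where it found it, which is what lets a new scenario be little more than a new list of existing blocks.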

But if you want to have more confidence in your project, you also need to structure it in a way that allows your tests to be efficient. I’ve been in situations where we eventually needed to remove tests because they were adding more noise than value (see: over-testing), something that can be mitigated by avoiding the types of tests that tend to have issues. For example, in the case of UI tests, try to run them against production environments rather than mocked data. In many instances I did use mocks and/or a debug environment for end-to-end tests, and more often than not we faced situations where the tests passed but the actual production build was failing because our mocks didn't match the production backend's behavior. Ideally, I believe the test needs to completely match what the user will experience.

This type of testing does have a significant downside though — it’s slow as hell. Since it can be quite time-consuming to run UI tests for every pull request, it is usually implemented as part of a "smoke testing" suite that runs post-merge, nightly, or on-demand.

Still, when compared to other forms of testing, my feeling is that UI testing is one of the best ways to reduce release-related anxiety. If you cover core flows that are critical for your app, unless the tests themselves are poorly conceived or incorrectly implemented, it's unlikely that a user will encounter problems in these flows that are the app's fault (and not the backend's, for example) as the UI test itself is already a good representation of a user running through the app.

Automate all the things

But reducing the anxiety associated with a project and increasing your trust in the development process is not something solved only within the project – there are many things that can be done regarding how the project itself is organized. The idea is that the better your project is organized, the fewer chances you have of making a mistake. One thing that has worked very well for me in this sense is to generate and automate as much as possible, especially tasks that are famously problematic, like generating .xcodeproj files in iOS.

Understanding just how many of these regular, scriptable tasks were error-prone when done manually is what prompted me to finally take extensive automation seriously. One simple example — the number of times I uploaded an app to the store but forgot to change the version, causing the store to reject the build. It's a small, harmless mistake, but think about it: that's about 15 to 20 minutes of time that you'll waste packaging and uploading another build. Why stress over it? Instead, let's build a script that handles versioning so we never have to worry about it again. In an ideal world, you shouldn't have to worry about anything besides the project's code itself. Everything else can be automated and controlled.
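As a rough idea of what such a versioning script could look like, here is a small Python sketch. It assumes a MAJOR.MINOR.PATCH version and an integer build number, and the function names are hypothetical; reading the last uploaded values from the store and writing the new ones into the project is left out, since that depends on your tooling.

```python
import re


def bump_version(version, part="patch"):
    """Return the next MAJOR.MINOR.PATCH version for the given part."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def next_release(last_uploaded_version, last_build_number, part="patch"):
    """Return the (version, build number) pair to use for the next upload."""
    return bump_version(last_uploaded_version, part), last_build_number + 1
```

Calling `next_release` from the packaging step, rather than editing the version by hand, means a build can never go out with a version the store has already seen.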

I like to apply this to everything that I can. Project files? Generated from an easy-to-read configuration file. Some components require boilerplate setup? Generate them. Localization/image map files? Generated. XML files describing UI? Absolutely not – make everything type-safe code. Testing/packaging/store uploading? I don't ever want to do this by hand! Release rollouts? Entirely executed and monitored in an automated fashion.

Hell, I became so used to the relief that automating annoying tasks gave me that I even started to apply it to real life. Making my own coffee used to make me uneasy – once I bought a machine, I had one less thing to worry about in the morning!

This might require some scripting from your side, but many open-source projects exist for more complicated tasks (like xcodeproj file generation in iOS). These automations might not make a big difference individually, but when combined, the chances of introducing small mistakes that will eventually require hotfixes are drastically reduced. This, in turn, allows you to focus on more interesting things and not stress about minor issues. While some of these improvements can be more complex to implement, the long-term benefits tend to be worth every second invested.

As an example, the project generation setup at Spotify is composed of a large number of Rake scripts that are executed serially, dealing with hundreds of small requirements the project has in order to be able to run properly. This is all done automatically in a way that it's not even necessary for the general developer to know what's going on – the perfect scenario.

Damage control

Alongside testing and automation, reducing release-related anxiety means creating failsafes in case things truly go wrong. Having good ways of testing changes and having good organizational practices reduce those types of issues, but they don't cover all of them. Automation can also be leveraged here, as a stress-free project doesn't use CI just for the usual building and packaging tests – it also relies on automated workflows to detect issues that might become problems in the future, by running a series of static analysis scripts to detect problems that could slip past general code reviews.

There are several tools that can do this, but my favorite for small projects is Danger. This is essentially a pull request pre-processing tool that allows you to code a series of requirements for your pull request, and while it comes with built-in common use cases like running linters, you can effectively code whatever you want and even install plugins to do the dirty work for you.

Danger PR comment example #1
Danger PR comment example #2

I used to add as many things as I could to Danger, especially project metrics. Build times, testing performance, app size, affected libraries, linter errors – all the things I could think of that could silently go wrong and potentially cause bigger issues down the line. Organization rules like additions to changelogs and descriptive commit messages were useful, too.
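The kind of rules I mean can be sketched as a plain function. This is Python for illustration only and deliberately does not use Danger's own DSL; the `PullRequest` shape, the `Sources/` prefix, the title-length rule and the 5 MB limit are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical summary of a PR: its title and changed files (path -> bytes)."""
    title: str
    changed_files: dict


MAX_FILE_BYTES = 5 * 1024 * 1024  # assumed per-file limit for this sketch


def check_pull_request(pr):
    """Return a list of warnings to post on the pull request."""
    warnings = []
    touches_source = any(path.startswith("Sources/") for path in pr.changed_files)
    if touches_source and "CHANGELOG.md" not in pr.changed_files:
        warnings.append("Source changed without a CHANGELOG.md entry.")
    if len(pr.title.strip()) < 10:
        warnings.append("PR title is too short to be descriptive.")
    for path, size in sorted(pr.changed_files.items()):
        if size > MAX_FILE_BYTES:
            megabytes = size / (1024 * 1024)
            warnings.append(f"{path} is {megabytes:.1f} MB; is it meant to be committed?")
    return warnings


risky = PullRequest("Fix", {"Sources/Cart.swift": 2_000, "Assets/video.mp4": 60 * 1024 * 1024})
clean = PullRequest("Fix crash when the cart is empty", {"Sources/Cart.swift": 2_000, "CHANGELOG.md": 300})
```

Here `check_pull_request(risky)` produces three warnings and `check_pull_request(clean)` produces none; in a real setup the CI tool would turn each warning into a PR comment.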

Linters in general are also extremely effective at preventing mistakes in your project. We normally consider them to be a styling tool, but most (if not all) linters allow you to create custom rules for just about anything that can be conveyed as a regex. I've had success using them to enforce other forms of project requirements, such as disallowing the usage of certain APIs, or forcing every model in the app to contain a mocked variant in the testing bundle.
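A regex-based custom rule boils down to something like the following Python sketch. It is not the configuration format of any particular linter; the two rules and the `SettingsStore` wrapper they mention are hypothetical examples of "disallow this API" requirements.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    message: str


# Hypothetical project rules: each one bans an API via a regex.
RULES = [
    Rule("no_print", re.compile(r"\bprint\("),
         "Use the project logger instead of print()."),
    Rule("no_user_defaults", re.compile(r"\bUserDefaults\.standard\b"),
         "Go through the SettingsStore wrapper instead."),
]


def lint(source, rules=RULES):
    """Return (line number, rule name, message) for every violation found."""
    violations = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for rule in rules:
            if rule.pattern.search(line):
                violations.append((line_number, rule.name, rule.message))
    return violations


sample = "\n".join([
    'let flag = UserDefaults.standard.bool(forKey: "x")',
    'logger.info("ok")',
    "print(flag)",
])
violations = lint(sample)
```

On the sample above, lines 1 and 3 are flagged and line 2 passes. Regexes are blunt (they will also match inside comments and strings), which is usually an acceptable trade-off for rules this simple.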

Metric-taking is also an extremely valuable tool. This is already a standard for large tech companies so it might not come as a surprise, but it can also be useful for smaller-scale projects. I developed a small Swift metadata-extracting tool called SwiftInfo that I implemented for a smaller team. I designed it to fetch as much information as I could from the project and upload it to our team's Slack, and the idea was to automatically run it for every release build. I would then set up a small script to store the results and create a graph showing the evolution of the project, giving us the ability to immediately surface any potentially destructive things slipping into a release.

Graph showing number of warnings over time, from data gathered by SwiftInfo

A real-life example of this happening was when I worked on the famous Brazilian food delivery app iFood. On a beautiful release Friday where everything seemed fine, SwiftInfo warned us that the app binary grew by over 200 megabytes:

Slack message from SwiftInfo

This happened because one pull request was uploading assets of over 50 megabytes, and because GitHub assumes you know what you're doing, nobody noticed it. If we didn't have an automated metric-taking tool in place, everyone would have had a really stressful weekend.
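The check behind that kind of warning is simple to sketch. The Python below is not SwiftInfo's API, just an illustration of the idea: compare the metrics of the current build against the previous one and alert on anything that grew more than an allowed amount. The metric names, numbers and limits are made up for the example.

```python
def compare_metrics(previous, current, max_growth):
    """Return alerts for metrics that grew past their allowed absolute delta."""
    alerts = []
    for name, limit in max_growth.items():
        if name not in previous or name not in current:
            continue  # nothing to compare against for this metric
        delta = current[name] - previous[name]
        if delta > limit:
            alerts.append(f"{name} grew by {delta} (limit {limit})")
    return alerts


# Illustrative values only.
previous_build = {"binary_size_mb": 80, "warnings": 120}
current_build = {"binary_size_mb": 285, "warnings": 118}
limits = {"binary_size_mb": 5, "warnings": 10}

alerts = compare_metrics(previous_build, current_build, limits)
```

Run on every release build, a check like this turns a silent regression into a message the team sees before the build ships.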

Hence the term ‘Damage Control’: while it might seem like overkill to jump through a number of artificial hoops to merge a PR, people make mistakes, and these mistakes can easily become disasters that will make you lose sleep. By protecting yourself with a suite of static checks, you can take a big weight off your back and let yourself and your team focus on things that make your day happier.

It's also worth mentioning feature flagging and A/B testing as perhaps the most useful and popular damage control tools, though they’re well understood enough that I won't go into detail here.

Alright! I added some efficient tests, revamped all of our CI, installed Danger and opened a PR that passes all checks. Can I merge it now with the certainty that it will not cause me stress?

No! There's still one thing to worry about: merging. You see, there's a little first-world problem that you only face when you have a lot of developers working on the same project, a natural flaw in how git works. This is coincidentally something we already talked about: merge queues are an important tool to prevent your main branch from breaking. Make sure to check that article out if you haven't had the chance!

So far, we’ve looked at some of the more tangible tools you and your team can leverage to reduce release anxiety. But there’s a whole other dimension to successfully navigating these waters. Take a quick breather and we’ll soon continue with Part II, looking at the other half of avoiding release anxiety: how to communicate and collaborate with your team in a way that won’t cause misunderstandings and extra stress.
