
Calm deploys: the rollback you will never need

On the deployment culture we have settled on after five years and roughly four hundred shipped releases: small, slow, boring, easy to undo. A short field report.


I have shipped code in three different working cultures: a large product company where deployments were major events scheduled weeks in advance, a Stockholm fintech where deployments were small but stressful, and BuildLogicStudio, where deployments are small and dull. The dullness is the point of this essay.

A calm deployment is one that nobody notices, including the person doing it. The engineer pushes to the main branch on a Tuesday afternoon, watches the check on the pull request turn green, and goes to make coffee. There is no Slack channel announcement, no all-hands warning, no ceremonial code freeze beforehand, no retrospective ritual afterwards. The deployment is small enough and reversible enough that the engineer trusts it more than the long change-control conversation that would otherwise replace it.

The first thing that makes a deployment calm is that it is small. The studio rule is that no pull request lives for more than three working days. If a feature is too big to ship in three days it is too big to be one feature. It gets split, the parts that can ship in three days ship in three days, and the parts that cannot are written down in a comment for the next iteration. After enough years of this the engineers stop wanting to write big pull requests.
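For readers who want to enforce the three-day rule mechanically, here is a minimal sketch of the arithmetic. It assumes "working days" means Monday to Friday and ignores public holidays; the function names and the idea of running it as a check are illustrative, not a description of the studio's actual tooling.

```python
from datetime import date, timedelta

MAX_WORKING_DAYS = 3  # the three-working-day limit described above


def working_days_between(opened: date, today: date) -> int:
    """Count Monday-to-Friday days after `opened`, up to and including `today`.

    Assumption: weekends are the only non-working days (no holiday calendar).
    """
    days = 0
    current = opened
    while current < today:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
    return days


def is_overdue(opened: date, today: date) -> bool:
    """True when a pull request has outlived the working-day limit."""
    return working_days_between(opened, today) > MAX_WORKING_DAYS
```

A pull request opened on a Monday is still within the limit on Thursday and overdue on Friday; one opened on a Friday has only used two working days by the following Tuesday.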

The second thing that makes a deployment calm is that it is reversible. Every site we operate has a one-command rollback. We have rolled back maybe a dozen times in five years and never under serious pressure, because the small-and-frequent deploys mean that the bugs we introduce are small. But the cheap reversibility is what lets the engineer ship on a Tuesday afternoon rather than waiting for the team to be around. The rollback is the safety net you almost never use, and the reason you can move calmly is that you know it is there.
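What a one-command rollback looks like depends entirely on the hosting setup, which this essay does not specify. As a sketch of the underlying idea only: keep an ordered history of releases and a pointer to the live one, so that rolling back is just moving the pointer. The `ReleaseHistory` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseHistory:
    """Hypothetical model: an ordered list of release ids plus a 'live' pointer."""

    releases: list[str] = field(default_factory=list)
    current: int = -1  # index of the live release; -1 means nothing deployed yet

    def deploy(self, release_id: str) -> str:
        # Discard any releases that were rolled back from, so the next
        # rollback always lands on the release that was live before this one.
        self.releases = self.releases[: self.current + 1]
        self.releases.append(release_id)
        self.current = len(self.releases) - 1
        return release_id

    def rollback(self) -> str:
        # The "one command": step the pointer back to the previous release.
        if self.current <= 0:
            raise RuntimeError("no earlier release to roll back to")
        self.current -= 1
        return self.releases[self.current]

    @property
    def live(self) -> str:
        return self.releases[self.current]
```

Because each deploy is small, the previous release is always close to the current one, which is part of why stepping back is cheap.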

The third thing — and this is the unromantic part — is that you have to write down what to do when something does go wrong. Every site has a runbook. The runbook is a markdown file in the repository. It says, in order: how to deploy, how to roll back, how to look at the production logs, who to call when the database goes sideways, what the third-party services are and how to log into each of them. The runbook is a deliverable on every Atelier Build because the runbook is what makes us replaceable, and being replaceable is the only honest way to operate this kind of studio.
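Since the runbook is a markdown file with a fixed order of topics, its shape can be checked automatically. The sketch below assumes each topic is a markdown heading; the heading names in `REQUIRED_SECTIONS` are invented for illustration and are not the studio's actual wording.

```python
# Hypothetical heading names, in the order the runbook is meant to follow.
REQUIRED_SECTIONS = [
    "Deploy",
    "Roll back",
    "Production logs",
    "Database contacts",
    "Third-party services",
]


def runbook_problems(markdown: str) -> list[str]:
    """Return a list of missing or out-of-order sections (empty if all is well)."""
    headings = [
        line.lstrip("#").strip().lower()
        for line in markdown.splitlines()
        if line.startswith("#")
    ]
    problems = []
    last_index = -1
    for section in REQUIRED_SECTIONS:
        key = section.lower()
        if key not in headings:
            problems.append(f"missing: {section}")
            continue
        index = headings.index(key)
        if index < last_index:
            # Appears earlier in the file than a section it should follow.
            problems.append(f"out of order: {section}")
        else:
            last_index = index
    return problems
```

Run against a repository's runbook, an empty result means every expected section is present and in order; it says nothing about whether the content of each section is any good.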

There is one thing we used to do and have stopped doing: the late-night ship before a launch. Every studio has stories about heroic three-in-the-morning deployments that saved the launch. Those stories are real. Those stories are also a sign that the team has built a system that requires three-in-the-morning deployments, and the next launch will also require one, and eventually one of those deployments will go wrong and the entire team will spend a Friday cleaning it up. We aim for ship-on-Tuesday, leave-at-five-thirty. So far it has held.
