When a decision is hard, that means it’s easy. The very thing that slows you down—having multiple options that are very close in quality—is actually a signal that you can go fast, because this tells you that whichever option you choose, you can’t possibly be that wrong, since both options have similar upside and downside potential. (From “How to Decide” by Annie Duke)
The only way to ‘solve’ an intractable problem is to reject its assumptions. Alexander the Great, faced with the impossible task of untangling the Gordian knot, pulled out his sword and chopped it in half. (From “Optionality” by Richard Meadows)
A survey of how companies deal with incidents today, and a peek into the best practices of the future.
If you already have incident review hygiene in place, challenge yourself to make this better. Do you give “soak time” to teams – the time to process what happened – before doing a review? Are your reviews truly blameless, making people feel good about having gone through an incident, rather than being discreet finger-pointing exercises? Do you have reviews where the goal is not to get to action items, but to discuss learnings?
Focus on learning over action items, if you want to take incident handling to the next level. As a warning, you should learn to jog before you run. Don’t jump to this step if your team has not gone through all the previous stages. But do look at companies like Honeycomb or people like John Allspaw as inspiration for how you can focus more on learning, and less on the processes.
Are you making meaningful changes after incidents? Analyzing why an incident happened is important, but without taking action and making changes, this analysis is not worth much. Often you’ll find the changes you need to make are complex and time-consuming. Are you following through with this work, and making systemic changes to improve the reliability of your systems?
There is an open incident database full of examples of how GitHub, Spotify and others have handled their incidents.
Thinking splits roughly into two modes: pattern matching against our experiences, and reasoning from first principles.
The author’s belief is that it isn’t enough to have one or the other; you really need to have both.
Whether you do first principles thinking or perform some form of pattern matching really depends on the problem you’re trying to solve, the domain you’re working in, and all sorts of context-dependent things that you can’t generalize away. The art of good thinking lies in finding a balance between the two modes.
You can sort of squint at the author’s posts and see the shadow of that question lurking in the background: in “Much Ado About The OODA Loop”, the author wrote about John Boyd’s ideas, and in particular Boyd’s belief that good strategic thinking depends on accurate sensemaking, which in turn depends on repeatedly creating and then destroying mental models of the world.
But there’s a more pernicious form of failure, which occurs when you reason from the wrong set of true principles.