Life is a balance - blow all your money now on a good time and you will have nothing for retirement. Defer your good times to work hard now and you might find an unexpected twist changes those dream plans. Striking this balance is a continual juggle, and one that I’ve been thinking about recently in relation to projects and idea generation.
Spend too much time fighting the current fires, and not enough experimenting, broadening your horizons or learning new skills, and you will eventually have no new ideas to draw on. However, ignore those current fires at your peril: without some form of short-term stability or success, you may not have a future to apply those new ideas to.
A classic conundrum in business is the explore vs exploit trade-off. Given limited resources you can do one, the other, or a bit of each. All of your future cash flows and business breakthroughs will come from exploring new ground and experimenting - but this work often has a low short-term payoff. Given the short-term focus of shareholders and the ever-present need to pay the bills, you will often exploit your current cash cows to their full extent. That is good while it lasts, but what have you got in development behind them? What are you doing to avoid your business being disrupted and becoming the next Blockbuster? Too much exploring will limit your current cash flow; too little will hurt your future cash flows.
Mathematically this problem has received much attention - most famously as the “multi-armed bandit” problem, which investigates the best strategy for playing a row of pokies (slot machines) whose payout rates you don't know in advance.
Naturally, you’re interested in maximizing your total winnings… it’s clear that this is going to involve some combination of pulling the arms on different machines to test them out (exploring), and favoring the most promising machines you’ve found (exploiting). (Algorithms to Live By)
More recently this trade-off has become important in the machine learning field of reinforcement learning, where an algorithm explores the problem space, looking for solutions that maximise value. Of course, the algorithm might settle on a local maximum when a better solution exists elsewhere. For this reason the algorithms are programmed to occasionally pick a random direction so that they can stumble on new solutions. (Pedro Domingos - The Master Algorithm)
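To make the "occasionally select a random direction" idea concrete, here is a minimal sketch of one common way to do it, usually called epsilon-greedy. It is my own illustration rather than code from either book: the function name `epsilon_greedy`, the payout rates and the choice to score untried machines as 0.0 are all assumptions.

```python
import random


def epsilon_greedy(true_rates, pulls=10_000, epsilon=0.1, seed=0):
    """Play a row of simulated machines; return (total winnings, pulls per machine)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = [0] * len(true_rates)
    plays = [0] * len(true_rates)

    def observed_rate(arm):
        # Machines we have never tried are scored 0.0 (an assumption of this sketch).
        return wins[arm] / plays[arm] if plays[arm] else 0.0

    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_rates))  # explore: pick any machine at random
        else:
            arm = max(range(len(true_rates)), key=observed_rate)  # exploit: best so far
        plays[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1

    return sum(wins), plays


rates = [0.2, 0.5, 0.8]  # hypothetical payout probabilities, unknown to the player
print(epsilon_greedy(rates, epsilon=0.0))  # never explores
print(epsilon_greedy(rates, epsilon=0.1))  # explores on roughly 10% of pulls
```

With `epsilon=0.0` the player never leaves the first machine it tries, which is the local maximum trap in miniature. A small amount of random exploration is enough to find the better machine.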
An important consideration in deciding your strategy is the time frame over which you are investing:
Early on, when there’s much to learn, it makes sense to explore a lot. Once you know the territory, it’s best to concentrate on exploiting it. That’s what humans do over their lifetimes: children explore, and adults exploit. (The Master Algorithm)
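One simple way to picture "explore early, exploit later" is an exploration rate that shrinks as experience accumulates. The sketch below is only an illustration: the name `exploration_rate`, the hyperbolic decay and the `scale` of 50 are assumptions I've picked for readability, not something prescribed by the book.

```python
def exploration_rate(step, scale=50):
    """Chance of trying a random option at a given step (counting from 0)."""
    # Always explore for the first `scale` steps, then less and less often.
    return min(1.0, scale / (step + 1))


for step in (0, 99, 999, 9_999):
    print(step, exploration_rate(step))
```

In a bandit-style loop you would compare a random draw against this rate on each step, so that random choices dominate at the start and become rare once you know the territory.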
Jeff Bezos has an interesting way to frame this decision with his “regret minimisation framework”:
So I wanted to project myself forward to age 80 and say, “Okay, now I’m looking back on my life. I want to have minimized the number of regrets I have.” I knew that when I was 80 I was not going to regret having tried this. I was not going to regret trying to participate in this thing called the Internet that I thought was going to be a really big deal. I knew that if I failed I wouldn’t regret that. (Algorithms to Live By)
A fascinating outcome of the multi-armed bandit problem is that its solution (the Gittins index) shows that if you are in doubt, you should bias your decision towards exploring:
something you have no experience with whatsoever is more attractive than a machine that you know pays out seven times out of ten! (The Master Algorithm)
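For the curious, the number behind that claim can be approximated in a few lines. The sketch below estimates the index (usually called the Gittins index) for a win-or-lose machine by asking: how good would a machine with a known payout rate need to be before I'd stop bothering with the untested one? It rests on assumptions I've chosen for illustration: a uniform prior over the unknown payout rate, a payoff next pull being worth 90% of a payoff now, and a look-ahead cut off after 200 pulls. The function names are my own.

```python
def gittins_index(wins, losses, discount=0.9, horizon=200, tol=1e-4):
    """Approximate Gittins index of a win/lose machine with the given record."""
    # Uniform prior (an assumption): start from one pseudo-win and one pseudo-loss.
    a0, b0 = wins + 1, losses + 1

    def prefers_unknown(known_rate):
        # Value of retiring to the known machine and playing it forever.
        safe = known_rate / (1 - discount)
        # At the look-ahead limit, commit to whichever option looks better.
        n_end = a0 + b0 + horizon
        values = [max(safe, (a0 + w) / n_end / (1 - discount))
                  for w in range(horizon + 1)]
        # Work backwards: t pulls taken so far, w of them wins.
        for t in range(horizon - 1, -1, -1):
            n = a0 + b0 + t
            risky = [
                (a0 + w) / n * (1 + discount * values[w + 1])
                + (b0 + t - w) / n * discount * values[w]
                for w in range(t + 1)
            ]
            values = [max(safe, r) for r in risky]
        return risky[0] > safe  # is one more pull of the unknown machine worth it?

    # Binary search for the known payout rate at which we are indifferent.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers_unknown(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(gittins_index(0, 0))  # a machine with no track record at all
```

Under these assumptions the untested machine has an expected payout of only 50%, yet its index comes out above 0.7. That gap is the value of what you might learn by trying it.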
Practically applying this concept in your work is explored very neatly by Jeff Patton in his article on Dual Track development.
Jeff describes how it is important for teams to work on both “Discovery” and “Development” workflows in parallel - the Discovery stream focused on maximising learning velocity and the Development stream on maximising release velocity to get your ideas shipped and into the world. Astro Teller and the crew from Basecamp also look at similar concepts.