The goal of establishing best practices is to:
1. Deliver high-quality data software faster
2. Deliver high-quality data into the software faster
In short: delivering value through the data software faster. Delivering software slower but data faster, or the other way around, will not help us generate more value.
From the work of Jez Humble et al. we know that we can identify the practices that achieve this by looking at the typical manufacturing/DevOps metrics:
1. Throughput (for software: deployment frequency, for data: data throughput)
2. Lead Time (for software: lead time for changes; for data: time from data sourcing => data usage)
3. Mean Time to Restore (for software & data alike)
4. Change Failure Rate (for software & data alike)
This makes 8 metrics in total we want to watch out for with our best practices.
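To make this concrete, here is a minimal sketch of how the software-side metrics could be computed from a deployment log. The log entries and their values are entirely hypothetical; real tooling would pull these from your CI/CD system.

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: (commit time, deploy time, deploy failed?)
deployments = [
    (datetime(2023, 5, 1, 9),  datetime(2023, 5, 1, 14), False),
    (datetime(2023, 5, 2, 10), datetime(2023, 5, 3, 9),  True),
    (datetime(2023, 5, 3, 11), datetime(2023, 5, 3, 15), False),
    (datetime(2023, 5, 4, 8),  datetime(2023, 5, 4, 12), False),
]

# Throughput: deployments per observed day
days = (deployments[-1][1] - deployments[0][1]).days or 1
throughput = len(deployments) / days

# Lead Time: average commit -> deploy duration
lead_time = sum((d - c for c, d, _ in deployments), timedelta()) / len(deployments)

# Change Failure Rate: share of deployments that failed
failure_rate = sum(1 for *_, failed in deployments if failed) / len(deployments)

print(f"throughput: {throughput:.2f} deploys/day")   # 2.00 deploys/day
print(f"lead time: {lead_time}")                     # 9:00:00
print(f"change failure rate: {failure_rate:.0%}")    # 25%
```

The data-side counterparts work the same way, just with batch arrival and usage timestamps instead of commit and deploy times.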
What does that mean? It means that testing incoming data before pushing it into the production system lowers the Change Failure Rate, but increases the Lead Time. So it only makes sense if we are able to run the tests really quickly.
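Such a gate can be as simple as a fast validation pass that rejects a bad batch before it ever reaches production. A minimal sketch, assuming a hypothetical `incoming` batch of dictionaries and hand-picked checks:

```python
def validate_batch(records):
    """Run fast sanity checks on an incoming batch; return a list of errors."""
    errors = []
    for i, row in enumerate(records):
        if row.get("user_id") is None:
            errors.append(f"row {i}: missing user_id")
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            errors.append(f"row {i}: invalid amount")
    return errors

# Hypothetical incoming batch with two bad rows
incoming = [
    {"user_id": 1, "amount": 9.99},
    {"user_id": None, "amount": 5.00},
    {"user_id": 3, "amount": -2.50},
]

errors = validate_batch(incoming)
if errors:
    # Reject the whole batch instead of pushing bad data to production.
    print("batch rejected:", errors)
else:
    print("batch accepted")
```

Because checks like these run in milliseconds per batch, the Lead Time cost stays small relative to the Change Failure Rate gain.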
It also means that adding tests to the data applications is important, because it reduces the Change Failure Rate for the software. But if testing is really hard in your current setup, it increases the lead time for changes and lowers the throughput, both of which are bad and let the gains of testing go to waste.
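The cheapest version of this is a plain unit test on a transformation function. The function below is a made-up example, but it shows the kind of fast, setup-free test that cuts the Change Failure Rate without adding noticeable lead time:

```python
def normalize_email(raw: str) -> str:
    """Hypothetical transformation: trim whitespace and lowercase an email."""
    return raw.strip().lower()

# Tests like these run in milliseconds, so they improve the Change
# Failure Rate without hurting Lead Time or throughput.
assert normalize_email("  Alice@Example.COM ") == "alice@example.com"
assert normalize_email("bob@example.com") == "bob@example.com"
print("all tests passed")
```

If even this kind of test is hard to write in your setup, that friction itself is the first thing worth fixing.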
Just as with software, we’re aiming to find best practices that allow us to both decrease the Mean Time to Restore AND decrease the Lead Time.
That still doesn’t leave us with best practices, but I hope it serves as a good direction to finding them.
Now it’s your turn: reply if you have any good answers, tools, whatever you’ve got. I’d love to hear about them!
And of course, leave feedback if you have a strong opinion about the newsletter!