What: This is a very interesting talk by Christopher Bergh, one of the key people in the DataOps space and CEO of datakitchen.io.
My perspective: I love his introduction “deliver the wrong data to a 5,000 sales team and get the call — do this again, and you’re out…”. That’s the life of most data people, and exactly what DataOps is there to solve.
“Chris, I thought this should take 2 hours, not 2 weeks” is exactly the feeling I always have: “it’s just adding this one attribute, how can that be weeks’ worth of work?”
Chris uses a river metaphor to explain that data people are mostly fixated on solving downstream problems but rarely go upstream, where the real question is “how do I get more flow out of my team?”
I remember when I first turned to DevOps and learned about everything we do to treat infrastructure as code. So the obvious question is: we’re doing this for software and for infrastructure, so why not in the data space? And why not for the data itself?
A few more highlights:
“Analytics customers often don’t know what they want. If you give them something they can feedback, you get a better work product in the end.”
“A developer should be able to tell from his desktop whether a change he’s made is good or not. Whether it’s going to break production.”
However, there is also one image I do not like. Chris keeps using the car assembly line as a metaphor. He points out that in DataOps we are really dealing with two assembly lines: one for the new data coming in, and one for our new innovations. But I think he fails to make the point that, currently, only one of them, the “innovation pipeline”, behaves like a car assembly line. The other one is much closer to a “bulk process”, like cleaning and sorting sand, where you focus on statistics rather than on deterministic steps.
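To make that distinction concrete, here is a minimal sketch of how I would test the two lines differently. It is my own illustration, not something from the talk: the function names, the `amounts` column and all thresholds are hypothetical. The innovation pipeline gets an exact, deterministic test that a developer can run on their desktop; the data pipeline gets a statistical check on the whole batch.

```python
from statistics import mean


def add_full_name(row: dict) -> dict:
    """Hypothetical transformation: the kind of code change the innovation pipeline ships."""
    return {**row, "full_name": f"{row['first']} {row['last']}"}


def test_add_full_name() -> None:
    # Deterministic, car-assembly-line style: same input, same output, every time.
    row = {"first": "Ada", "last": "Lovelace"}
    assert add_full_name(row)["full_name"] == "Ada Lovelace"


def check_batch(amounts, max_null_rate=0.05, expected_mean=100.0, tolerance=0.2):
    """Statistical, bulk-process style: judge the batch as a whole, not each grain.

    The thresholds are made-up assumptions; in practice they would come from
    the history of earlier batches.
    """
    if not amounts:
        return ["batch is empty"]

    problems = []

    # Share of missing values in the batch.
    null_rate = sum(a is None for a in amounts) / len(amounts)
    if null_rate > max_null_rate:
        problems.append(f"null rate {null_rate:.0%} exceeds {max_null_rate:.0%}")

    # Mean of the values that are present, compared with the expected mean.
    present = [a for a in amounts if a is not None]
    if present:
        batch_mean = mean(present)
        if abs(batch_mean - expected_mean) > tolerance * expected_mean:
            problems.append(
                f"mean {batch_mean:.1f} is more than {tolerance:.0%} "
                f"away from {expected_mean:.1f}"
            )

    return problems
```

An empty list from `check_batch` means the batch looks normal; any entries are reasons to stop the batch before it reaches the customers. No single row is “wrong” here, only the distribution is.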
I still recommend you watch it. I really like his style and the experience he backs it up with.