Yahoo might be the world’s slowest sinking ship, but don’t blame the engineers. They’re still doing everything they can to keep it afloat. Or, at least, to maintain whatever reputation it has left as a home of innovation and useful products.
On Monday, for example, Yahoo Engineering open sourced a deep learning framework called TensorFlowOnSpark. For organizations with a big data pipeline already in place, its utility is hard to beat. It lets users train deep learning models with Google’s popular TensorFlow framework, running on the even more popular Apache Spark as the compute layer and the even more popular Hadoop Distributed File System as a unified storage layer.
Never forget: Yahoo is the reason we have Hadoop, which created a whole ecosystem of new big data technologies (including Spark) and helped companies such as Facebook scale into giants.
TensorFlowOnSpark is not Hadoop, but that doesn’t mean it can’t also be useful. It’s the kind of creation only possible at a company with robust enough traffic to identify some edge cases, but also with enough legacy infrastructure in place (and, let’s assume, a constrained enough budget) that building out an entirely new system seems like overkill. On the latter point especially, there are a lot of those companies around, thanks in part to Hadoop’s considerable footprint.
Yahoo will probably never again be anywhere near Google, Facebook or even Twitter, but you have to hand it to the company’s engineering team. They’re still building, still committed to open source and, possibly, making it easier for some mere mortal companies to keep up with ceaseless infrastructure innovation.