On Friday, Facebook published a blog post detailing a new open source system called Beringei. It’s a time-series storage engine that the company uses to serve real-time data about system health to both the humans and automated systems tasked with keeping Facebook online. Because of the scale and real-time nature of Facebook’s operations ("Beringei currently stores up to 10 billion unique time series and serves 18 million queries per minute"), the company had to replace its previous HBase data store for this workload.
Beringei signifies yet another step in the evolution of how the company is using Hadoop. And where Facebook goes, the industry tends to follow.
Since then, Presto has amassed an impressive list of users. What’s more, Hadoop vendors Cloudera, Hortonworks and MapR (plus any number of startups, and even users such as Salesforce) have developed their own low-latency Hive alternatives.
Recently (and somewhat ironically), Facebook jumped onto the Apache Spark bandwagon in a big way, after putting Spark through its paces to ensure it could handle Facebook’s giant batch-processing workloads. Spark was created, and became hugely popular, as a simpler, faster alternative to Hadoop MapReduce.
But Beringei is different from previous Facebook creations such as Hive, Presto or Corona because it doesn’t require Hadoop at all. Its use case is narrow enough that Beringei alone likely won’t have a material effect on HBase usage, and certainly not on the overall market for Hadoop software. Still, it might play a small role in a future scenario where Hadoop just isn’t on the radar for many companies.
Already, startups not yet dealing with big data (at least by today’s definition) can likely afford to bypass Hadoop and its relative complexity altogether, opting instead to build around more modern, right-sized technologies from the start. Beringei and Spark are two examples among a sea of open source databases, data stores and real-time processing engines now available.
And as engineers leave Facebook and other early adopters to start their own companies or join others, we might expect them to become software Johnny Appleseeds, spreading the seeds of these technologies and practices across new lands. You frequently hear about ex-Google engineers missing its famous Borg system once they leave, for example, or about engineers having to rebuild the same thing over and over as they move from job to job. But the beautiful thing about the open source era is that it’s much easier now to take your favorite tools with you.