There were a handful of interesting items coming out of Redmond on Tuesday, mostly related to Microsoft Azure and the data centers in which it runs. Here they are:
Microsoft says 40 percent of all VMs in Azure now are running Linux (ZDNet):
This is up from about a year ago, when the company claimed nearly a third of all Azure VMs were running Linux. Something tells me this percentage is only going to rise, especially as Microsoft ramps up its Kubernetes efforts in an attempt to lure in more developers.
Microsoft is getting hungry for fuel cells (Bloomberg):
This is an attempt to increase energy efficiency, possibly even doubling it. The problem right now is that fuel cells are expensive – Microsoft is looking to spend about $45 million developing a 10-megawatt system that could be deployed in a few years. The story ends with a great quote, which I suspect resonates with most webscale companies building their own stuff (including Facebook, which I believe is partnering with Intel on designing AI chips so it doesn’t have to build its own):
“Essentially what we want to do is get out of R&D as soon as possible and actually use them to power the data center, we hope to do that as soon as possible. Unfortunately we have a supply chain issue. I need hundreds of megawatts of these things.”
Microsoft Azure and Microsoft Research take giant step towards eliminating network downtime (Microsoft Research):
So, Microsoft is minimizing network downtime by building an exact replica of the Azure network architecture via an emulator called CrystalNet. Before changes are pushed live, they’re tested on CrystalNet to ensure there won’t be any hidden bugs or other issues. What’s even cooler is that the blog post suggests Microsoft might commercialize the technology by selling it directly to customers, or perhaps by making it available to networking vendors to test their products.
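To make the idea concrete, here's a toy sketch of that "emulate before you deploy" workflow. CrystalNet itself isn't public, so the `Emulator` class, its methods, and the `validate_change` helper below are all invented for illustration – the real system emulates actual router software and firmware, not a simple graph.

```python
class Emulator:
    """Toy stand-in for a network emulator: routers as a graph of links."""

    def __init__(self, topology):
        # topology: dict mapping router name -> set of neighbor router names
        self.topology = {router: set(nbrs) for router, nbrs in topology.items()}

    def apply_change(self, change):
        # change: ("remove_link", a, b) or ("add_link", a, b)
        op, a, b = change
        if op == "remove_link":
            self.topology[a].discard(b)
            self.topology[b].discard(a)
        elif op == "add_link":
            self.topology[a].add(b)
            self.topology[b].add(a)

    def reachable(self, src, dst):
        # Breadth-first search: can traffic still get from src to dst?
        seen, frontier = {src}, [src]
        while frontier:
            node = frontier.pop()
            if node == dst:
                return True
            for nbr in self.topology[node] - seen:
                seen.add(nbr)
                frontier.append(nbr)
        return False


def validate_change(topology, change, critical_pairs):
    """Apply the proposed change in emulation only, then verify that
    every critical (source, destination) pair is still reachable."""
    emu = Emulator(topology)
    emu.apply_change(change)
    return all(emu.reachable(s, d) for s, d in critical_pairs)


# Usage: in a triangle a-b-c, removing the a-b link is safe because
# traffic can still route a -> c -> b.
topo = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
print(validate_change(topo, ("remove_link", "a", "b"), [("a", "b")]))  # True
```

The point of the pattern is that the validation runs against a replica, so a change that would partition the real network gets rejected before it ever touches production.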
In other news, check these three items from the world of data science and artificial intelligence:
The state of data science and machine learning, 2017 (Kaggle):
This is a pretty insightful (and interactive) survey by Kaggle, which includes responses from more than 16,000 community members. A couple of high-level findings include the popularity of logistic regression and other “traditional” techniques over neural networks in work environments, and the importance of education in data science. Of over 15,000 respondents, the vast majority have at least a bachelor’s degree, and more than 41 percent have a master’s degree.
Turbocharging Analytics at Uber with our data science workbench (Uber Engineering):
This seems like a fairly progressive approach to bringing data science tasks onto a unified platform, rather than leaving them distributed across a collection of various tools. While Uber’s data stack is varied and complex enough that building its own platform makes sense, I have to imagine there are off-the-shelf platforms out there for smaller companies, or for companies that don’t want to invest heavily in building their own tooling.
Copyright law makes artificial intelligence bias worse (Motherboard):
This is a fascinating discussion about the legal issues surrounding what datasets are available for training machine learning models. I’m not certain how big a deal it is for specific use cases such as search or facial recognition, where companies like Google and Facebook can generate more than enough of their own proprietary data, but you can see where reliance on open datasets could create problems for smaller companies, or for applications trying to operate beyond the scope of their training data.