Data Machina

By alg01

A weekly digest of machine learning curiosities, data science geekery, and other data amenities.

Curated by @ds_ldn in the middle of the night.

PLEASE NOTE: Data Machina is no longer published here. Go to to get the latest Data Machina


Data Machina - A personal note & important news

When I started Data Machina I thought: Well... maybe a few people will read it, but I’ll try my best. It turns out that 145 issues and 2½ years later more than 7,950 people read Data Machina every week. I’ve received amazing feedback, kind donations, awesome …


Data Machina - Issue #144

I love fractals. I still remember the first time I generated a fractal. I really enjoyed reading My Journey into 3D Fractals. I felt like I ran a knowledge marathon after attending Spark+AI Summit London. You can watch all 126 sessions here.


Data Machina - Issue #143

This is really cool: The GAN Lab: Play with Generative Adversarial Networks (GANs) in Your Browser. There's a lot going on in Open Source Modern Data Engineering. Check out: Facebook's LogDevice: A Distributed, High-availability Storage for Sequential Data and…


Data Machina - Issue #142

The amazing power of GANs and adversarial learning... Given a video of an expert dancer and another video of an amateur dancer, train the amateur video to dance like an expert... Everybody Dance Now - Watch the video and read the paper here. When confronted wit…


Data Machina - Issue #141

I really enjoyed this long read by Stephen Wolfram, Learning about the Future from 2001: A Space Odyssey, Fifty Years Later. If you're fascinated by Lego bricks like me, you may like reading Duplo Lego Layouts and Computational Train Tracks. If you enjoy Data Mac…


Data Machina - Issue #140

Yesterday, more than 100,000 people watched live as OpenAI Five (a team of 5 neural networks) won a best-of-3 against a team of top DotA (Defense of the Ancients) human professional players. If you're terribly exhausted by the London heat wave and really bored here's a…


Data Machina - Issue #139

At the MIT Brains, Minds, and Machines Symposium, the famous linguist Chomsky replied to Steven Pinker's question about the success of probabilistic models trained with statistical methods. Then the ever great Peter Norvig (Director @Google Research) wrote th…


Data Machina - Issue #138

Why is Uber AI doing this? What's the point? In Autopsy of a Deep Learning Paper, Filip exposes the shallowness of the current deep learning research masked by the ludicrous amount of compute. Good read. If you are in London and the torrid weather hasn't fri…


Data Machina - Issue #136 (copy)

How to train a language model using an LSTM neural net? OK so “I grew up in Spain … I speak fluent [ ]”... You have to predict the last word in that sentence... You can do that with ml5.js Friendly Machine Learning for The Web and an LSTM. Now that the World …


Data Machina - Issue #136

Back in 1966, Joseph Weizenbaum coded Eliza - an NLP chatbot for human-machine communication. Afaik chatbots haven't evolved "in a hugely significant" way since then. This is a wonderful story: The Genealogy of Eliza. I successfully survived the scorching hot Lo…


Data Machina - Issue #135

Belief Networks, a.k.a. Bayesian Networks (BNs), are coming back. If you have well-known priors and medium-sized data, like most businesses in real life, you can learn from data under uncertainty with BNs. Several years ago I met John and I invited him to give a …


Data Machina - Issue #134

A big issue in every company is: How do we automate the ML lifecycle? Many startups and enterprises are now building ML platforms, but that is not easy. In fact imo many of those initiatives will fail miserably. If you're enduring the pain of building an ML …


Data Machina - Issue #133

Here's a new, important thing: TD = DP + MC Temporal Difference Learning: Combining Dynamic Programming and Monte Carlo. AI Nationalism. I'm trying to learn Chinese and planning to visit Beijing again soon. My friends in China keep telling me about the fast-pac…


Data Machina - Issue #132

Crucially, there's a renaissance of tools for visualizing large-scale, high-dimensional data. Uber just open-sourced Kepler - an awesome geospatial analysis tool for large-scale datasets. Days ago, Google open-sourced a TensorFlow.js library for real-time int…


Data Machina - Issue #131

Ages ago I organised a hack on ML & social graphs with Ferenc. We spoke about graphs and causality. Clearly I was totally overwhelmed by his massive knowledge of the subject. I just read Ferenc's latest essay: ML beyond Curve Fitting: An Intro to Causal …


Data Machina - Issue #130

Because not everything is AI & Deep Learning, here is an Intro to Topological Data Science. I recall some intense discussions with some of you on Topological Data Analysis (TDA): Is TDA the same as Clustering? Here Mr TDA Guru explains in detail Why Top…


Data Machina - Issue #129

Last week @ICLR2018, Facebook AI Research open-sourced lots of new, state-of-the-art AI tools and libraries. So many ML tools... Michael's KAML-D integrates TensorFlow, JupyterHub, PrestoDB, Elasticsearch and Kubernetes. A perfect example of an Open, Modern ML…


Data Machina - Issue #128

Machine Learning model interpretability vs model performance... What's the right balance? Here's a great presentation on applying Bayesian inference to generate human-interpretable decision sets: Human in the Loop: Bayesian Rules to Enable Explainable AI (pdf…


Data Machina - Issue #127

There's so much hype and bs in AI... This is a great essay by Berkeley's Professor Michael Jordan in which he explains why AI is an intellectual wildcard and what we need is AI as a New Engineering Discipline. You can watch Michael's latest talk at SysML18 deb…


Data Machina - Issue #126

This is fascinating stuff: These guys trained an intelligent dog agent using dog-centric view videos: Who Let the Dogs Out? Modeling Dog Behaviour from Visual Data. And here's an excellent, tour de force post on how difficult and time-consuming it is to reproduce a…