Interesting profile of NumerAi, a fund which makes investment decisions based on an ensemble of models submitted by data scientists. Think Kaggle meets Mechanical Turk meets Renaissance Technologies. Contributors are anonymous and paid in Bitcoin.
Crowdsourcing has long been a favourite tool of data scientists, as it is one way of acquiring ground-truth data. In NumerAi’s case, investment strategies are crowdsourced from scientists contributing their ingenuity. Several years ago, PeerIndex used Amazon Mechanical Turk to build some of our ground-truth datasets.
Last week EV touched on how the cost of training AI systems was getting out of reach for many researchers and, indeed, small firms. This week, Andrej Karpathy from OpenAI took sharing one stage further by releasing an image classification network (model and calculated weights) into the public domain. The particular model took more than five days of training on an 8 GPU rig, unobtainable to most. It’s an interesting case for how we can democratise access to this technology. (h/t @guillaumeallain). Another example is RASA NLP, an open-source API released this week.
EV reader Auren Hoffman makes a similar argument for widening access to training data: “If we want to massively accelerate artificial intelligence and improve human lives, we need to democratize access to data.”
[it is] pretty clear that algorithms were going to be used in a way that could affect individuals’ options in life
As an aside, Dwork also co-invented differential privacy (now used by Apple to protect personal information).
Every company should have in place guidelines that govern the ethical management of its operations … [a]nd those same standards of ethical business conduct should guide the development of AI systems.