OpenAI announced GPT-3, a new natural language processing (NLP) model with 175 billion
parameters and impressive capabilities. We talked about its predecessor, GPT-2, before (here), which could already (sort of) play chess, write poetry and do arithmetic. The new version has made progress on at least the latter two of these; Scott Alexander gives some striking examples here. (If you read one piece on GPT-3, read that.)
OpenAI acknowledges that there’s no fundamental breakthrough here. What’s new is sheer scale and the performance boosts it seems to bring, depending on whose analysis you follow. There’s a debate on the future of artificial intelligence that (crudely) divides those who think we need radically new techniques to crack “artificial general intelligence” (AGI) from those who think we just need to scale existing techniques. We talked about this last year, and about the “bitter lesson” that “building in how we think does not work in the long run”.
If you’re in the latter camp, the case for worrying about GPT-3 is best expressed by Gwern’s excellent essay, or in a single tweet, here: scale seems to improve performance a lot, and GPT-3 shows no sign of an upper limit on this. Not everyone agrees: there’s a case for viewing GPT-3 as a let-down, as argued in this post. But, as Gwern says, these capabilities would have seemed astonishing, even to leading researchers, just five years ago. As one of the best essays on AI safety has argued, there’s no fire alarm for AGI.