Why didn't DeepMind build GPT3?
In three short years OpenAI has released GPT3, DALL-E 2 and ChatGPT — a stunning set of products that have reframed what many believe is possible with machine learning. ChatGPT has persuaded a large fraction of the world that Artificial General Intelligence (AGI) is a matter of when, not whether - an extraordinary accomplishment.
You’d be forgiven for having déjà vu here - in early 2020 you may have read a similar introduction, only describing DeepMind and the extraordinary successes of AlphaGo, AlphaZero, AlphaStar and AlphaFold. What happened? It’s not so obvious. DeepMind and OpenAI’s similarities are stronger than you might think - both are funded by Big Tech motherships, are of a more comparable size than many appreciate, and have charismatic, formidable, techno-utopian leaders.
Trying to answer the question “Why didn’t DeepMind initiate and deliver GPT3?”[0] is one way of shedding light on this puzzle. I say GPT3 specifically because that was the significant innovation — we’ve been following a playbook since then, and OpenAI’s perceived advantage stems primarily from how fast they ship, and their appetite for shipping, not from the pace of discovery.
As someone professionally interested in how you build extraordinary scientific teams, I find three things about GPT3 quite striking.
The first is that there is no real evaluation metric or target for GPT3. Nothing was “solved” when GPT3 was released, in the way that Go or protein folding was “solved”. Nobody knew in advance how long you’d have to train GPT3 before it would be good enough to matter, and the eerie experience of interacting with GPT3 is not in any way captured by question answering benchmarks. This lack of easily quantifiable measurement is a striking departure from previous grand challenges in AI.
The second is that comparatively few people on the GPT3 author list have traditional elite academic machine learning backgrounds (PhDs in machine learning, highly cited first- or last-author papers) - an organisational departure from the prevailing wisdom on how to build teams pursuing AGI.
The third is the scale of organisational-level risk taking involved in building GPT3. It seems obvious now, but it was in no way clear in 2019 that reducing the language modelling loss on the whole of the internet would lead to the amazing properties we see in large language models. There was a real risk it wouldn’t work out, and the costs to OpenAI - financial and opportunity alike - would have been significant.
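For concreteness, “reducing the language modelling loss” means nothing more exotic than training a model to predict the next token and minimising cross-entropy over an enormous corpus. The sketch below is a minimal illustration of that objective, assuming a generic autoregressive model that maps token ids to next-token logits; it is not OpenAI’s code, and the function name is mine.

```python
import torch.nn.functional as F

def language_modelling_loss(model, token_ids):
    """Next-token cross-entropy: a sketch of the sole training objective behind GPT3-style models.

    token_ids: (batch, seq_len) tensor of integer token ids drawn from the corpus.
    model:     any autoregressive network returning (batch, seq_len - 1, vocab_size) logits
               (an assumption for illustration, not a specific library API).
    """
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]  # shift by one position
    logits = model(inputs)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (N, vocab_size)
        targets.reshape(-1),                  # flatten to (N,)
    )
```

The organisational bet was that scaling this single objective - more data, more parameters, more compute - would yield broadly useful capabilities, and in 2019 that was far from a consensus view.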
These points are related. They stem from strong organisational, almost philosophical, differences. OpenAI is an exceptionally engineering-focused research company, concerned first and foremost with how to build systems that appear to have intelligence when interacted with. This stands in stark contrast to most academic machine learning, which is focused more on algorithmic understanding than system performance. Engineering-focused papers often have a hard time getting into conferences, with reviewers saying “clear reject” because of “lack of novelty” — it’s “just engineering” after all.
This focus on building systems, not discovering algorithms, leads to technology that looks more like distributed systems software than traditional machine learning. Consequently, it is built and led by a team that reflects the different skills and attitudes that requires. It’s very unlikely that an organisation whose core cultural metric is papers in elite academic venues would invest significant resources into a project where the route to publication in Nature/Science etc. may be unclear.
OpenAI has realised that the race for perceived leadership in AGI is not run in the most highly cited academic journals in the world. It’s run in the subjective experiences of the users of AI — the use of AI as a product. Nothing has done more to persuade the world that AGI is round the corner than ChatGPT, and it is no coincidence that OpenAI is predominantly led by people whose professional careers have been focused on shipping products. But the strategy of shipping fast and frequently has its downsides - it’s a little chilling to see the extreme financial pressure to release powerful technology on the world prematurely, as in the rushed integration of ChatGPT-like technology into Bing.
Finally, in 2019 OpenAI had something to prove. They were commonly viewed as a company without clear focus. Now the shoe is on the other foot: DeepMind (and Google) have to respond.
EDIT: An earlier version of this article had an incorrect statement about GPT3 getting into major ML conferences.
[0] In attempting to answer this question, I’m not primarily interested in whether DeepMind had the technical ability or resources to build and serve large language models - clearly they did and still do. My former DeepMind colleagues are extraordinarily talented and not to be underestimated. The race isn’t won yet.