AGI fantasy is a blocker to actual engineering
Reading Empire of AI by Karen Hao, I was struck by how sincerely people associated with OpenAI believe in AGI. They really do think someone, perhaps them, will build AGI, and that it will lead to either the flourishing or destruction of humanity.
Elon Musk founded OpenAI because he thought Demis Hassabis was an evil genius who would build AGI first:
…Musk would regularly characterise Hassabis as a supervillain who needed to be stopped. Musk would make unequivocally clear that OpenAI was the good to DeepMind’s evil. … “He literally made a video game where an evil genius tries to create AI to take over the world,” Musk shouted [at an OpenAI off-site], referring to Hassabis’s 2004 title Evil Genius, “and fucking people don’t see it. Fucking people don’t see it! And Larry [Page]? Larry thinks he controls Demis but he’s too busy fucking windsurfing to realize that Demis is gathering the power.”
OpenAI’s co-founder and chief scientist Ilya Sutskever regularly told audiences and employees to “feel the AGI”. At a company off-site in Yosemite in September 2022, employees gathered around a firepit:
In the pit, [Sutskever] had placed a wooden effigy that he’d commissioned from a local artist, and began a dramatic performance. This effigy, he explained, represented a good, aligned AGI that OpenAI had built, only to discover it was actually lying and deceitful. OpenAI’s duty, he said, was to destroy it. … Sutskever doused the effigy in lighter fluid and lit it on fire.
I think it’s remarkable that what was until recently sci-fi fantasy has become a mainstream view in Silicon Valley.
Hao writes that GPT-2 was a bet on the “pure language” hypothesis, which asserts that since we communicate through language, AGI should emerge from training a model solely on language. This is in contrast to the “grounding” hypothesis, which asserts that an AGI needs to perceive the world. Successfully scaling GPT to GPT-2 convinced enough people at OpenAI that the pure language hypothesis was valid. They just needed more data, more model parameters, and more compute.
So the belief in AGI, plus the recent results from LLMs, necessitates scaling. It justifies building data centres that consume hundreds of litres of water a second, that run on polluting gas generators because the grid can’t supply the power (and that might use as much power as entire cities), driving up CO2 emissions from the manufacture and operation of new hardware. It also justifies exploiting and traumatising data workers to make sure ChatGPT doesn’t generate outputs like child sexual abuse material and hate speech, or encourage users to self-harm. (The thirst for data is so great that they stopped curating training data and instead consume the internet, warts and all, managing the model output using RLHF.)
And this is all fine, because they’re going to make AGI and the expected value (EV) of it will be huge! (Briefly, the argument goes that if there is a 0.001% chance of AGI delivering an extremely large amount of value, and a 99.999% chance of much less or zero value, then the EV is still extremely large, because (0.001% * very_large_value) + (99.999% * small_value) ≈ very_large_value.)
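To make the arithmetic concrete, here is a toy sketch in Python. The probabilities and payoffs are invented for illustration, which, as it happens, is also true of the real arguments:

```python
def expected_value(p_agi: float, value_agi: float, value_otherwise: float) -> float:
    """EV over two outcomes: AGI delivers enormous value, or it doesn't."""
    return p_agi * value_agi + (1 - p_agi) * value_otherwise

# Made-up inputs, just like in the arguments themselves.
p_agi = 0.00001          # a 0.001% chance of AGI
value_agi = 1e18         # "extremely large" value if it works
value_otherwise = 1e6    # modest value otherwise

print(expected_value(p_agi, value_agi, value_otherwise))  # ~1e13: still enormous

# The conclusion is driven entirely by the assumed payoff: shrink it by a few
# orders of magnitude and the EV collapses to something unremarkable.
print(expected_value(p_agi, 1e9, value_otherwise))  # ~1.01e6
```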
But AGI arguments based on EV are nonsensical because the values and probabilities are made up and unfalsifiable. They also ignore externalities like environmental damage, which, in contrast to AGI, have known negative value and certain probability: costs borne by everyone else, right now.
As a technologist I want to solve problems effectively (by bringing about the desired, correct result), efficiently (with minimal waste) and without harm (to people or the environment).
LLMs-as-AGI fail on all three fronts. The computational profligacy of LLMs-as-AGI is dissatisfying, and the exploitation of data workers and the environment is unacceptable. Instead, if we drop the AGI fantasy, we can evaluate LLMs and other generative models as solutions for specific problems, rather than all problems, with proper cost-benefit analysis. For example, we might use smaller purpose-built generative models, or even discriminative (non-generative) models. In other words, make trade-offs and actually do engineering.
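As a sketch of what that trade-off can look like, here is a hypothetical example: a spam-filtering task solved with a small discriminative model in scikit-learn instead of a general-purpose LLM. The task and data are assumptions for illustration; the point is that a model built for one job is cheap, fast, and inspectable:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative dataset; a real one would have thousands of labelled examples.
texts = [
    "Win a free prize, click now",
    "Lowest price pills, limited offer",
    "Meeting moved to 3pm tomorrow",
    "Here are the minutes from today's standup",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

# A purpose-built discriminative model: TF-IDF features + logistic regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["Claim your free offer now"]))  # likely [1]
```

A model like this trains and runs on a laptop CPU in milliseconds, which makes the cost-benefit comparison with a multi-billion-parameter generative model explicit rather than hand-waved.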