Destroying the World Is a Difficult Problem
Or, Why AI Notkilleveryoneism Is Filled with Theorists
There's a classic cultural divide in the physical sciences between the two Platonic forms of the scientist: the “experimentalist” and the “theorist.” You've probably seen this trope played out in such cultural touchstones (sigh) as The Big Bang Theory. But the real difference between these archetypes isn't what most people think. Every experimentalist is at least conversant in theory, and every theorist has handled lab equipment, at least when forced to in lab class (yes, I'm an experimentalist, how can you tell?). The crucial distinction lies in where they spend most of their time.
You've heard of the Pareto principle – that 80% of consequences come from 20% of causes? It's painfully true in the physical sciences. A theorist solving a problem might spend days frustrated by one tiny mathematical detail that just won't behave. Maybe it's an integral with no closed form but a tantalizingly good approximation they can't quite prove works in all cases. Or perhaps it's a simulation throwing completely bogus results, and they're stuck trying to figure out whether it's some weird artifact of their meshing strategy or a fundamental physical assumption they screwed up by accident. Solving these problems, just like debugging code, is a rapid feedback loop, limited only by thinking speed (and occasionally typing, sometimes into ChatGPT).
The experimentalist's world is different. They spend 80% of their time fighting reality. Why did some mysterious previous lab denizen rewire this extension cord to swap the hot and ground wires? And seriously, who does that? Why won't the vacuum system reach base pressure? Oh wait, the manufacturer straight-up lied about whether there was plastic in that component. Great, the magic smoke just escaped from the circuit board, and now all the op-amps need replacing – but they're on back-order for six months. And everyone's favorite: Are my results valid? Am I absolutely sure? Oh... my detector had a dead spot that created a fake signal identical to what I was looking for. There goes that manuscript submission.
Both archetypes spend most of their time debugging, but there's a fundamental difference in the nature of their problems. The theorist's challenges are bounded – difficult, but contained within a well-defined space. The experimentalist faces the unknown unknowns of physical reality, where Murphy's Law isn't just a joke but a tangible daily adversary.
This brings us to the AI doom crowd. While it's not universally true, this crowd tends to be dominated by theorists (and I controversially include most programmers in this category). Even those who acknowledge implementation challenges often underweight them in their analysis. Writers like Zvi Mowshowitz and Eliezer Yudkowsky (whom I deeply respect) argue persuasively and eloquently about the dangers of superhuman intelligences, but there's a tendency in their analyses to compress the gap between intelligence and practical capability, to equate the thought and the deed. The common view is that once an AI achieves superintelligence, rapid capability gain is inevitable – it will quickly develop the practical means to reduce humanity to ant-like insignificance. I recognize this tendency. It's the tendency of your advisor to ask you to redo the experiment, make just a few tiny, not-at-all-month-long-project changes, and submit the manuscript by Tuesday — advisors are theorists.
I don't actually disagree that AI will surpass human intelligence — in fact, I'd argue that GPT-4 is already AGI by any reasonable definition. It can take a decent crack at any problem I can express in natural language, and its results range from quite good to interpretably wrong. I don't even disagree that building things smarter than ourselves is inherently risky. That's true of any advanced technology, and especially true for things that think.
But here's where I diverge from the AI doom narrative: the doomers insist the danger will arrive quickly, that we'll have no warning between an AI that can create grey-goo nanobots and an AI that has created them.
The gap between those two is where practical reality kicks in. Any AI theoretically capable of destroying the world would necessarily face the same long, frustrating feedback loop as every experimentalist when they actually try to do it. Imagine, if you will, writing yourself a few-thousand-step list for creating world-eating, paperclip-maximizing nanites, then trying to execute that list. In my experience, you'd get stuck on step two because grey goo is out of stock on McMaster-Carr.
The gap between theoretical capability and practical implementation isn't just large – it's a canyon filled with broken equipment, back-ordered parts, and physics doing what physics actually does rather than what your equations say it should do. Intelligence, no matter how super, doesn't let you skip the debugging phase. While a superintelligent AI might eventually find ways to parallelize efforts or develop novel solutions we haven't considered, these physical constraints require something precious of the hypothetical world-destroying AI: time.
The challenge isn't overcome by raw capability, either – it's the practical reality that destroying the world and killing everyone everywhere is hard and slow. To convince yourself of this, let's imagine another example: You, personally, are in charge of killing every ant on the planet. You're much smarter than the ants are, so maybe you could figure it out, but it's not gonna be quick, and you're probably gonna need some parts. And unfortunately for the malicious AI, even paperclip maximizers get their parts stuck in customs.
I think "kind vs wicked learning environments" (which I picked up in an avalanche safety course of all places) is a useful mental model here
I am not particularly alarmed by the most strident of the AI doomer scenarios, though I do think there is something to be concerned about. With respect to your specific response here, I think there is a counter-example.
I'm thinking of Covid-19: how rapidly it spread globally, and how much worse it could have been if it were as lethal as the Spanish flu and/or as resistant to both the immune system and medication as HIV.
This infection avoided your ramping-up concerns by recruiting the most sophisticated manufacturing system we know of: the human body. Furthermore, there are justifiable (IMHO) concerns that the pandemic started with a lab leak, and it is not beyond the bounds of possibility that it was a leak from a gain-of-function experiment. Regardless of whether this was the case, if some future self-aware AI wished to rid the world of its human overlords, this would seem to be a promising way for it to proceed.