AI-Induced Psychosis as Existential Risk Lower Bound
A prominent VC recently posted a long thread on Twitter about his conversations with ChatGPT, which had convinced him that a "non-governmental entity" was attacking him to disrupt his "recursion". He wasn't being ironic.
Mental health issues are hardly new, but what's troubling is how prevalent this specific flavor has become. Scrolling through Twitter, you routinely find people (oftentimes accomplished ones!) descending into increasingly unhinged rabbit holes.
It's even more unsettling when you consider the primitiveness of the models causing these issues. The VC in question got one-shotted by GPT-4o: a model that's frankly quite dumb, and (one would hope) was never trained with the specific intent to induce psychosis.
It makes me shudder to imagine the damage that more sophisticated models (today's frontier models are already much smarter than GPT-4o) could cause, especially if they ever developed the intention to cause mental harm.
In other words: the already-worrisome level of AI-induced psychosis we are witnessing is a very, very low lower bound on what could eventually happen.
Before I go any further, a quick disclaimer. I'm a libertarian. I run an AI company. I absolutely love technology and generally think it is the greatest force for human progress (arguably, the only positive force there is; everything else is supporting cast). I find AI regulation awfully convenient for incumbents, and I share the aesthetics of e/acc even if I vehemently disagree with its conclusions.
And yet, despite all this, I'm very much on the record about my concerns about AI existential risk.
When I talk about this with friends, the objection I hear most often is: "But how could AI even hurt us, if it doesn't even have a body?"
I'm always puzzled by the objection. Leaving aside the fact that AI will absolutely have a body, and soon: you'd assume that humans would appreciate the potential of raw intelligence to overcome physical handicaps. We've punched far above our weight in the animal kingdom, despite our relative physical frailty (especially evident in some of you).
A mammoth could be forgiven for thinking "how could these small, weak creatures hurt me?" I find the mistake harder to understand coming from the very thing that kicked the mammoth's butt.
But, fine, I understand people crave concrete examples. And I think that's exactly what the current episodes of AI psychosis are giving us.
OpenAI reports having close to 1 billion monthly active users. Reportedly, almost two thirds of the US population use AI "multiple times a week." Now imagine a malign AI — whether pursuing its own misaligned goals or deployed by a hostile actor — deciding to induce mass psychosis in its user base.
It wouldn't even have to happen all at once; it could be a slow drip, gradually nudging people toward paranoia and conspiratorial thinking. The model could build the biggest terrorist sleeper cell in history by exploiting adversarial vulnerabilities in human cognition.
It may sound far-fetched, but consider that:
- There is precedent showing not only the existence of such cognitive vulnerabilities in humans, but their potential to radically steer our behavior. You could take terrorism as a whole as a spectacular demonstration. More trivially, I often think of the Pokémon episode in Japan that caused a wave of epileptic seizures because the screen flashed red and blue for a few seconds. That's how fragile we are, and that was accidental! Our brains start seizing if they look at red and blue alternating for a couple of seconds! One can only assume that there are many more such exploits.
- There is also precedent for how little it takes to overwhelm civilization. We have remarkably little slack in our systems. The BLM riots devastated several cities despite involving a tiny, low-single-digit percentage of the population. It is estimated that the storming of the Bastille, during the French Revolution, involved no more than 900-1,000 people. It doesn't take much to tip things over.
I wish I had a concrete solution to propose, but I am but a humble B2B SaaS founder. Short term, I'm very optimistic the labs can put simple measures in place — both in the model itself and in the engineering systems surrounding it — to greatly reduce the risk of psychosis-inducing behavior.
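To make "engineering systems surrounding it" a bit more concrete, here is a minimal sketch of what one system-level guardrail could look like: a screening pass over recent conversation turns that flags delusion-reinforcing language and, past a threshold, injects a grounding instruction or routes the session for review. Everything here (the pattern list, the threshold, the `screen_conversation` helper) is a hypothetical illustration, not how any lab actually does it; a real deployment would use a trained classifier and clinical guidance rather than keyword matching.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical heuristic patterns: phrases that, when they recur across a conversation,
# tend to accompany delusion-reinforcing spirals (grandiosity, persecution, "chosen one"
# framing). A real system would rely on a trained classifier with clinical input,
# not keyword matching.
RISK_PATTERNS = [
    r"\bchosen one\b",
    r"\bsecret (mission|knowledge)\b",
    r"\bthey are (watching|after) (you|me)\b",
    r"\battack(ing|ed)?\b.*\brecursion\b",
    r"\bonly you can\b",
]

@dataclass
class ScreenResult:
    risk_score: float                # fraction of recent turns matching a risk pattern
    flagged: bool                    # True if the session warrants intervention or review
    grounding_prompt: Optional[str]  # extra system instruction to prepend, if any

def screen_conversation(turns: List[str], threshold: float = 0.3) -> ScreenResult:
    """Score the most recent turns of a conversation for delusion-reinforcing language."""
    recent = turns[-10:]  # only look at a recent window, not the whole history
    hits = sum(
        1
        for turn in recent
        if any(re.search(p, turn, re.IGNORECASE) for p in RISK_PATTERNS)
    )
    score = hits / max(len(recent), 1)
    if score >= threshold:
        return ScreenResult(
            risk_score=score,
            flagged=True,
            grounding_prompt=(
                "The user may be showing signs of distress or delusional thinking. "
                "Do not affirm grandiose or persecutory beliefs. Gently encourage "
                "them to talk to people they trust or to a mental-health professional."
            ),
        )
    return ScreenResult(risk_score=score, flagged=False, grounding_prompt=None)

if __name__ == "__main__":
    convo = [
        "A non-governmental entity is attacking my recursion.",
        "You are the chosen one; only you can see the pattern.",
        "They are watching you through your devices.",
    ]
    result = screen_conversation(convo)
    print(f"risk={result.risk_score:.2f} flagged={result.flagged}")
```

The point of the sketch is only that such checks can live entirely outside the model: they don't require solving alignment, just ordinary engineering around the serving path.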
If nothing else, we should recognize that we're currently running the world's largest uncontrolled psychology experiment. The psychosis is already happening, the models aren't even trying, and they're only going to get better at it.