I picked up a flyer at an anti-AI march in London back in February. I can’t say for sure whether the folks at Pause AI were intentionally riffing on South Park’s underpants gnomes, but if they were, they nailed it. The flyer read: “Step 1: Grow a digital super mind. Step 2: ? Step 3: ?” and ended with “Pause AI until we know what the hell Step 2 is.”
For those who missed the 1998 episode, the gnomes’ business plan went: Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit. It’s become a classic internet meme, used to skewer everything from startup pitches to policy ideas. Elon Musk once used it to explain how he’d fund a Mars mission. And right now, it perfectly describes the state of AI.
Companies have built the tech — that’s Step 1. They’ve promised transformation — Step 3. But Step 2, the part where you actually figure out how to get from one to the other, is still a giant question mark.
Pause AI thinks Step 2 has to involve regulation. But what kind, and who enforces it, are open questions. AI boosters, meanwhile, are convinced Step 3 is salvation and tend to skip over the messy middle. They see us racing toward sunny uplands on the back of an “economically transformative technology,” as OpenAI’s chief scientist Jakub Pachocki put it to me a few weeks ago. They know where they want to go — sort of. It’s hazy, still distant. And everyone’s taking a different route. Will any of them make it?
For every big claim about the future, there’s a sobering reality check. Take two recent studies. One from Anthropic predicted which jobs LLMs would affect most — managers, architects, media folks should brace for change; groundskeepers, construction workers, hospitality staff, not so much. But those predictions are basically educated guesses based on what LLMs seem good at in tests, not how they actually perform in the workplace.
In another study, from February, researchers at Mercor, an AI hiring startup, tested AI agents from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks done by human bankers, consultants, and lawyers. Every agent they tested failed to complete most of its tasks.
So why the wide disagreement? A few things. First, consider who’s making the claims and why. Anthropic has skin in the game. Second, most people telling us something big is about to happen base that on how fast AI coding tools are improving. But not every job can be reduced to writing code. Other studies show LLMs are terrible at strategic judgment calls, for example.
And when these tools actually get deployed, they’re not dropped into a cleanroom. They land in environments contaminated with people and existing workflows. Sometimes adding AI makes things worse. Sure, maybe those workflows need to be torn up and rebuilt around the new tech for it to be truly transformative, but that takes time — and guts.
That big hole is exactly where Step 2 should be. The lack of agreement on what’s actually about to happen, and how, creates an information vacuum that gets filled by the latest wild claim of the week, evidence be damned. We’re so unmoored from any real understanding that a single social media post can — and does — shake markets.
We need fewer guesses and more evidence. That requires transparency from model makers, coordination between researchers and businesses, and new ways to evaluate this technology that tell us what really happens when it’s rolled out in the real world.
The tech industry — and with it the global economy — rests on the promise that AI will be transformative. That’s not a sure bet yet. Next time you hear bold claims about the future, remember that most businesses are still figuring out what to do with their underpants.