Reasoning LLMs Deliver Value Today, So AGI Hype Doesn't Matter
Simon Willison, commenting on the recent paper from Apple researchers which found that state-of-the-art large language models suffer complete performance collapse beyond certain complexity thresholds:

I thought this paper got way more attention than it warranted -- the title "The Illusion of Thinking" captured the attention of the "LLMs are over-hyped junk" crowd. I saw enough well-reasoned rebuttals that I didn't feel it worth digging into. And now, notable LLM skeptic Gary Marcus has saved me some time by aggregating the best of those rebuttals together in one place! [...]

And therein lies my disagreement. I'm not interested in whether or not LLMs are the "road to AGI". I continue to care only about whether they have useful applications today, once you've understood their limitations.

Reasoning LLMs are a relatively new and interesting twist on the genre. They are demonstrably able to solve a whole bunch of problems that previous LLMs were unable to handle, hence why we've seen a rush of new models from OpenAI and Anthropic and Gemini and DeepSeek and Qwen and Mistral. They get even more interesting when you combine them with tools.

They're already useful to me today, whether or not they can reliably solve the Tower of Hanoi or River Crossing puzzles.
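For context on why Tower of Hanoi makes a convenient complexity dial: the textbook recursive solution fits in a few lines, but the number of moves it requires grows as 2^n - 1 in the number of disks n, which is roughly the axis along which the paper scales difficulty. A minimal Python sketch of that classic solution (illustrative only, not code from the paper):

```python
# Classic recursive Tower of Hanoi solver. The point: a correct
# solution is trivial to state, yet its length grows exponentially,
# so raising the disk count n is a clean way to dial up complexity.

def hanoi(n, source="A", target="C", spare="B"):
    """Yield the sequence of moves that transfers n disks from source to target."""
    if n == 0:
        return
    yield from hanoi(n - 1, source, spare, target)  # move n-1 disks out of the way
    yield (source, target)                          # move the largest disk
    yield from hanoi(n - 1, spare, target, source)  # stack the n-1 disks back on top

moves = list(hanoi(10))
print(len(moves))  # 1023, i.e. 2**10 - 1
```

Ten disks already demand over a thousand perfectly ordered moves, so a model that merely has to emit the move list faces an output that doubles with each added disk.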

Read more of this story at Slashdot.