Why Critics Remain Skeptical

The Limits of LLM-Based Systems

December 20, 2025

Critics of LLMs acknowledge that tool use and agent networks will enable increasingly complex 'bureaucracies' that can perform more sophisticated, well-defined tasks. A network of LLM agents with proper verification mechanisms might reliably process insurance claims, or generate code from detailed architecture specifications.

But they argue that these improvements don't address the core limitation: LLMs remain gravitationally pulled toward the common patterns in their training data. Together with the Exponential Error Accumulation Problem, this means that any system built on top of LLMs is doomed to fail when an LLM has to step outside its known territory and work on novel research tasks.
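The error-accumulation point is, at its core, simple compounding: if each step in a multi-step agent pipeline succeeds independently with probability p, a chain of n steps succeeds with probability p^n, which decays exponentially in n. Here is a minimal sketch of that toy model (it deliberately ignores verification and retries, and the function names are ours, not from any particular framework):

```python
import math

# Toy model of error accumulation in a multi-step agent pipeline:
# if each step succeeds independently with probability p, the whole
# chain of n steps succeeds with probability p ** n.

def chain_success(p: float, n: int) -> float:
    """Probability that all n independent steps succeed."""
    return p ** n

def steps_until_unreliable(p: float, threshold: float = 0.5) -> int:
    """Smallest n at which the chain's success probability drops below threshold."""
    return math.ceil(math.log(threshold) / math.log(p))
```

Even a per-step success rate of 99% gives a chain that is more likely to fail than succeed after a few dozen steps, which is why long, unverified chains of LLM calls degrade so quickly.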

Pattern Completion versus Pattern Creation

The question of what LLM-based systems will be capable of challenges us to refine our understanding of the relationship between language and intelligence. Human cognition and communication are deeply linguistic. We structure our thinking into words and language when we communicate. Yet LLMs demonstrate that linguistic competence alone is not equivalent to understanding or reasoning.

LLMs are constrained by the linguistic and conceptual frameworks present in their training data. They can recombine and interpolate within that space with impressive fluency, and, given a piece of context, manipulate language in accordance with their training data to describe or reason their way through problems.

But the intelligence required to work on novel problems may require something more: the ability to recognize when the current linguistic framework is inadequate for finding a solution, together with the ability to construct an entirely new conceptual framework.

Example from Mathematics

To put this in everyday terms, consider the difference between two capabilities. Given a fixed set of rules (an axiomatic system), together with a large set of examples showing how statements can be manipulated within those rules, an LLM might well be able to reason its way up from the rules and work out proofs of larger statements.

But an LLM would struggle to come up with new axiomatic systems and reason its way through their implications.

LLMs excel at the first. The second may require a different kind of system entirely. The way LLMs work might well prevent them from making the kind of paradigm shift required to step outside existing language games and create the new conceptual framework needed to understand a pattern never seen before.
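The contrast can be made concrete with Hofstadter's well-known MIU string-rewriting puzzle standing in for an axiomatic system. Within the fixed rules, finding a derivation is mechanical search; the rules themselves, however, lie outside the search space. This is only an illustrative sketch (the bounded breadth-first search is our own choice, not a claim about how LLMs operate internally):

```python
from collections import deque

# Toy axiomatic system: Hofstadter's MIU puzzle.
# Axiom: "MI". Rewrite rules (applicable anywhere they match):
#   1. xI  -> xIU   (append U if the string ends in I)
#   2. Mx  -> Mxx   (double everything after the leading M)
#   3. III -> U     (replace any occurrence of III with U)
#   4. UU  ->       (delete any occurrence of UU)

def successors(s):
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                    # rule 1
    if s.startswith("M"):
        out.add("M" + s[1:] * 2)            # rule 2
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])  # rule 3
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])        # rule 4
    return out

def derivable(target, axiom="MI", max_steps=6):
    """Breadth-first search for a derivation of `target` within the fixed rules."""
    frontier, seen = deque([(axiom, 0)]), {axiom}
    while frontier:
        s, depth = frontier.popleft()
        if s == target:
            return True
        if depth < max_steps:
            for nxt in successors(s):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return False
```

Working within the system, derivations such as MI → MII → MIIII → MUI are found mechanically. Seeing that 'MU' can never be derived, by noticing that the number of I's is never divisible by three, requires reasoning about the system from outside it, which is exactly the kind of move the critique says pattern completion does not supply.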

Example from the History of Science

To further illustrate what we mean, consider how our language for describing the world has changed over the past centuries. The philosopher of science Thomas Kuhn introduced the concept of paradigm shifts in the history of science. He observed that scientific progress is more than just an accumulation of facts. According to Kuhn, the great 'scientific revolutions' also involve the development of new conceptual frameworks, new languages, that describe reality.

Consider how our language-based description of the solar system evolved over time:

  1. "The planets move through the sky" (Observation)
  2. "The planets circle around the sun" (Heliocentric model)
  3. "The planets are held in orbit by gravity" (Newtonian physics)
  4. "The planets move in a straight line through spacetime curved by mass" (General Relativity)

Each statement doesn't merely describe different facts; it employs a fundamentally different conceptual language. Each transition required recognizing the inadequacy of the previous framework and constructing an entirely new one. This is precisely what LLMs, by their nature, struggle with. They can operate masterfully within existing language games but may be fundamentally unable to create new ones.

To what extent it is possible to gain this ability with systems built on top of LLMs remains an open question.

Tool execution and agent networks can help LLMs work more reliably within well-known domains. But when solving a novel research problem requires genuine conceptual innovation, recognizing that the current framework is inadequate and constructing a new one, these architectural improvements may not be enough. An LLM might inherently be stuck within the known patterns of its training data, trying to rephrase the problem in terms of the most common patterns it knows. In that sense, the training data might define the boundaries of the possible.

How Humans Learn

LLMs learn only by 'reading written text'. But when we look at how humans learn, reading text is a relatively small, and late, step in our education. Before we read, we spend years observing the world as it is, and from that we derive a temporal and relational understanding that we later in life learn to express through language. Manipulating language, then, might not be all there is to our intelligence.

This is why critics argue that achieving AGI may require moving beyond LLM-based systems entirely over the coming decades. To supply this 'missing' piece, we might indeed need to focus on world models that can learn to construct a novel linguistic description of the world they have been observing, rather than LLMs that merely complete the linguistic descriptions they have already been trained on.