The Limits of LLM Based Systems

Conclusions

December 20, 2025

The exponential error problem and the LLM scalability paradox reflect most people's attitudes toward, and experiences with, LLM agents.

On the one hand, LLM agents achieve remarkable feats of sustained coherence when performing simple tasks, generating or analyzing documents and computer code, and executing increasingly complex workflows. The genuine capabilities that these systems exhibit when operating within well-established domains remain truly impressive.

On the other hand, the exponential error accumulation problem is real and observable in practice. Anyone who has worked extensively with these systems knows they produce torrents of "slop": plausible-sounding but ultimately unusable output that requires constant human curation, revealing the limits of competence without comprehension.
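To make the compounding concrete, here is a minimal sketch under a deliberately simplified assumption: every step of an agent's workflow succeeds independently with the same probability. The function name and the 99% figure are illustrative choices, not measurements of any real system.

```python
def chain_success_probability(step_accuracy: float, num_steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes (hypothetically) that steps fail independently and share
    the same per-step accuracy, so the probabilities multiply.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return step_accuracy ** num_steps


if __name__ == "__main__":
    # Illustrative per-step accuracy of 99%; not an empirical value.
    for steps in (1, 10, 50, 100):
        print(steps, round(chain_success_probability(0.99, steps), 3))
```

Under these assumptions, an agent that is right 99% of the time per step completes a 100-step task without any error only about 37% of the time. Real workflows are messier, since errors can be correlated or caught and corrected along the way, but the sketch shows why small per-step error rates matter so much at scale.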

A clearer understanding of these limitations will help us decide how best to deploy these technologies. Should we expect a future with autonomous LLM agents that work unsupervised on cognitive tasks? Or should we expect a future where LLM agents become a new kind of interface to computers and data, with humans providing the judgment, architectural vision, and creative leaps that pattern-matching alone cannot achieve?

Rather than autonomous engineers, these agents may be better understood as a new kind of interface that helps engineers translate their intention into executable code faster than typing. But in their current form, LLM agents still require substantial human oversight and curation. The question remains: is this a temporary limitation that will be overcome with better models, or does it reveal something fundamental about the nature of autoregressive language models?

The next decade may prove the pessimists right, or reveal that we're in the earliest days of a genuine revolution. The answer will determine not just the future of AI, but the future of human expertise itself.

It is not thinking tokens alone that will bring us to AGI, but the combination of many improvements that amplify each other.
