Limitations of LLMs and agents

As developers and architects, we often treat large language models (LLMs) as if they are the “brains” of modern AI systems. In reality, they are just one part of a much larger, interconnected ecosystem. Understanding where LLMs excel — and where they fall short — is essential for designing intelligent systems that not only generate accurate responses but can also reason, act, and evolve dynamically across tools, data, and services.
Limitations of LLMs
LLMs are incredibly powerful instruments for language processing and structured reasoning. They're trained on massive text corpora and excel at linguistic and analytical tasks such as summarization, translation, and drafting code. However, they exist in isolation — disconnected from the external world and unable to retain information beyond a single prompt.
This lack of persistence is intentional. In an agentic ecosystem, an LLM serves as a vital but limited component — a reasoning engine rather than an autonomous system. Its knowledge is static, confined to what it learned during training. You can’t simply “update” an LLM’s understanding by feeding it new answers; retraining requires enormous computational resources, new data validation, and time-consuming fine-tuning cycles.
Instead, we inject context at runtime using prompt engineering, retrieval-augmented generation (RAG), or contextual fine-tuning — supplying relevant data dynamically. Yet even with these strategies, hallucinations can occur. A hallucination happens when the model fabricates facts or produces content that seems accurate but isn't real. In creative domains, such as generating images or stories, these inaccuracies are often harmless. But in high-stakes environments — legal, financial, or medical — fabricated information can have serious consequences.
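To make runtime context injection concrete, here is a minimal RAG sketch. Everything in it is an illustrative assumption: the tiny in-memory corpus, the keyword-overlap retriever standing in for embedding search, and the prompt layout. A production system would use a vector database and a real model client.

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str
    text: str

# Hypothetical corpus; in practice this lives in a vector database.
CORPUS = [
    Document("policy.md", "Refunds are issued within 14 days of purchase."),
    Document("faq.md", "Support is available Monday through Friday."),
]

def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Toy keyword-overlap retriever standing in for embedding similarity search."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str, docs: list[Document]) -> str:
    """Inject retrieved passages so the model reasons over them, not its weights."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in docs)
    return (
        "Answer using ONLY the context below. "
        "If the answer is not present, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

question = "How long do refunds take?"
prompt = build_prompt(question, retrieve(question, CORPUS))
# `prompt` is then sent to whatever LLM client you use.
print(prompt)
```

Note the "answer only from context" instruction: grounding the model in retrieved passages is also the main practical defense against the hallucinations described above.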
Enhancement
To overcome these limits, we must design ecosystems that extend an LLM's capabilities with structured context and modular logic. The goal is to build systems that supply on-demand context, domain-specific data, and dynamic prompt variations to help the LLM reason more effectively.
Imagine using an LLM for legal analysis. Instead of prompting it with just a question, you feed it structured information — similar cases, relevant statutes, prior rulings, and court dockets. The model then reasons within that domain-specific context. This process can be iterative: filter and add only what’s relevant, preserve earlier outputs as memory, and reuse them as contextual input for subsequent questions.
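A sketch of that iterative loop, assuming a generic text-in/text-out model client: the function, the prompt layout, and the placeholder `fake_llm` are all hypothetical, but they show how prior answers are preserved as memory and fed back as context for the next question.

```python
from typing import Callable

def ask_with_memory(
    llm: Callable[[str], str],   # any text-in/text-out model client
    question: str,
    domain_context: list[str],   # statutes, similar cases, dockets...
    memory: list[str],           # prior Q&A turns carried forward
) -> str:
    """One enhancement turn: earlier outputs become context for later questions."""
    prompt = "\n\n".join(
        ["Relevant materials:"] + domain_context
        + ["Earlier findings:"] + memory
        + [f"Question: {question}"]
    )
    answer = llm(prompt)
    # Preserve this turn so subsequent questions can build on it.
    memory.append(f"Q: {question}\nA: {answer}")
    return answer

# Usage with a placeholder model; swap in a real client in practice.
fake_llm = lambda p: f"(model answer grounded in {len(p)} chars of context)"
memory: list[str] = []
ask_with_memory(fake_llm, "Which statute governs this dispute?",
                ["Statute 12-3 (hypothetical)..."], memory)
ask_with_memory(fake_llm, "Do any prior rulings conflict?",
                ["Case A v. B (hypothetical)..."], memory)
```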
By doing this, the system transforms the LLM from a generic reasoning engine into a domain-aware assistant — one that evolves with each interaction. This approach doesn’t make the model “smarter” by itself but makes its outputs contextually grounded and operationally useful.
Orchestration
Enhancement alone isn't enough. Even with context management, fine-tuned prompts, and external data, an LLM doesn't inherently know what to do next. It can produce intelligent responses but can't autonomously decide which steps to take to achieve a broader goal.
This is where orchestration comes in — a layer of control that coordinates multiple systems, tools, and agents. Orchestration is the difference between receiving an answer and executing a process.
Enter the concept of the Agentic Workflow — a system architecture where AI can both reason and act. In such workflows, the LLM is embedded in an environment capable of retrieving data from APIs or databases, performing web searches, maintaining long-term context, and switching between multiple related or unrelated tasks. It can decide which information sources are relevant and when to use them.
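The core of such a workflow is a plan-act loop in which the model chooses the next step. The sketch below is a bare-bones version under stated assumptions: the tool registry, the JSON action contract, and the `plan` prompt are illustrative, and real systems add schema validation, retries, and structured function-calling.

```python
import json
from typing import Callable

# Hypothetical tool registry; each tool is a plain function the agent may call.
TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": lambda q: f"(search results for {q!r})",
    "database_lookup": lambda q: f"(rows matching {q!r})",
}

def plan(llm: Callable[[str], str], goal: str, history: list[str]) -> dict:
    """Ask the model for the next step as JSON: a tool call or a final answer."""
    prompt = (
        f"Goal: {goal}\nSteps so far: {history}\n"
        f"Available tools: {list(TOOLS)}\n"
        'Reply as JSON: {"action": "<tool name or finish>", "input": "..."}'
    )
    return json.loads(llm(prompt))

def run_agent(llm: Callable[[str], str], goal: str, max_steps: int = 5) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        step = plan(llm, goal, history)
        if step["action"] == "finish":                 # the model decides it's done
            return step["input"]
        result = TOOLS[step["action"]](step["input"])  # execute, don't just answer
        history.append(f"{step['action']} -> {result}")
    return "Step budget exhausted."
```

The key design point is that the loop executes the model's chosen tool and feeds the result back into the next planning step, rather than returning the model's text directly.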
This is the critical distinction between a chatbot and an agentic AI system. The former responds; the latter decides and executes. The challenge, however, is determining how these systems know which data to pull, which tool to use, and how to maintain consistency across diverse workflows.
MCP Integrations
To make orchestration practical, we need more than memory and context — we need seamless integration between tools, services, and agents. Yet, traditional methods like REST, RPC, or SOAP create significant overhead. Developers often spend more time building wrappers, managing authentication, or adapting APIs than designing intelligence.
The solution lies in MCP (Model Context Protocol) — a new approach that standardizes communication between agents and services. MCP provides a universal layer of connectivity, allowing AI agents to discover, authenticate, and interact with external systems without custom integration each time.
Think of MCP as the CAN bus of the AI world — similar to how the automotive industry unified communication between electronic modules using just a few wires. MCP offers that same simplicity and scalability for AI ecosystems, enabling agents to connect to any MCP-compatible service with minimal configuration.
With MCP, once a server is discoverable, the agent can automatically explore its available capabilities (a client-side sketch follows this list):
Resources — data sources like documents, APIs, or vector databases.
Prompts — predefined or dynamically generated instruction templates optimized for context.
Tools — executable components that trigger logic, workflows, or external actions.
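Here is a minimal client sketch using the official MCP Python SDK (the `mcp` package). The server command, the tool name, and its arguments are placeholders; the discovery calls mirror the three primitives above.

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Hypothetical server launched over stdio; the script path is a placeholder.
server = StdioServerParameters(command="python", args=["my_mcp_server.py"])

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # protocol handshake and capability exchange

            # Discover the three primitives the server exposes.
            resources = await session.list_resources()
            prompts = await session.list_prompts()
            tools = await session.list_tools()
            print(len(resources.resources), "resources,",
                  len(prompts.prompts), "prompts,",
                  [t.name for t in tools.tools])

            # Call a tool by name (name and arguments are illustrative).
            result = await session.call_tool("search_docs",
                                             arguments={"query": "refunds"})
            print(result)

asyncio.run(main())
```

Because discovery is part of the protocol, the same client code works against any MCP-compatible server without per-service integration work.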
This structure allows developers to build modular, reusable, and interoperable AI systems where LLMs, agents, and services operate in sync. Instead of reinventing integration pipelines for every task, MCP enables fluid collaboration between AI components — laying the groundwork for scalable, context-aware, and secure agentic ecosystems.
