Avoid MCP Server Overload

MCP Overload
Reading Time: 3 minutes

The continuous evolution of artificial intelligence has led to the emergence of the Model Context Protocol (MCP). This concept emerged to enable Large Language Models (LLMs) to interact with external tools and services, extending their capabilities beyond their inherent linguistic functions. Essentially, MCP provides a standardized way for LLMs to use tools to accomplish tasks beyond their native reach, such as fetching real-time data, performing calculations, or interacting with other software systems.

When designing an MCP server, it’s crucial to consider more than just functional requirements. A common pitfall is to overload MCP servers with every conceivable tool.

Studies and practical experience suggest that the performance of LLMs can degrade significantly when faced with too many tool choices. For example, some models can become confused with more than 40 tools, while smaller or quantized models may struggle with far fewer (as low as 12-16 tools). This isn’t necessarily due to context window limitations, but rather the LLM getting tool names and definitions mixed up, hallucinating tools, or failing to follow instructions.

Therefore, it is paramount to be careful, deliberate, and measured in selecting tools for your MCP server. This careful curation directly impacts the cost-effectiveness and overall performance of your AI agent. Rather than giving an LLM access to every available tool, limit the selection to only those most relevant to the task at hand.
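As an illustration of this kind of curation, the sketch below (plain Python, with hypothetical tool names and task categories, not any particular MCP SDK) exposes only a small, task-relevant “loadout” of tool definitions instead of the full catalog:

```python
# Hypothetical example: expose only a curated subset of tools per task,
# rather than registering every available tool with the LLM.

# All tools the server *could* offer (names and descriptions are illustrative).
ALL_TOOLS = {
    "get_weather":   {"description": "Fetch current weather for a city."},
    "run_sql":       {"description": "Execute a read-only SQL query."},
    "send_email":    {"description": "Send an email to a recipient."},
    "convert_units": {"description": "Convert between measurement units."},
    "search_docs":   {"description": "Search internal documentation."},
}

# Curated "loadouts": only the tools relevant to each kind of task.
TASK_LOADOUTS = {
    "reporting":  ["run_sql", "search_docs"],
    "operations": ["get_weather", "convert_units"],
}

def tools_for_task(task: str) -> dict:
    """Return only the tool definitions needed for the given task."""
    names = TASK_LOADOUTS.get(task, [])
    return {name: ALL_TOOLS[name] for name in names}

print(sorted(tools_for_task("reporting")))  # ['run_sql', 'search_docs']
```

The key design choice is that the mapping from task to tools is decided up front by a human, not left to the model to sort out from a list of dozens of definitions.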

Strategies to mitigate the issue of too many tools include:

  • Tool Loadout/Limiting Tools Per Conversation: Activating only the tools known to be needed for a specific conversation or task. Some clients allow for this selective activation at the conversation level, or application-wide.
  • Curating MCP Servers: Directly editing your client’s configuration to disable unnecessary servers or tools.
  • Tool Filtering with Groups and Tags: Employing in-protocol methods to filter tools based on groups or tags.
  • “Proxy” Layers: Using an MCP proxy to manage connections to multiple MCP servers and intelligently select appropriate tools.
  • Scoped Servers or Namespacing/Tagging: Organizing tools into scoped servers or using namespaces/tags to maintain relevance per project.
  • Sub-Agents for Tool Selection: Implementing a “librarian” sub-agent that can find and provide appropriate tools for specific tasks. This is already baked into some systems where the system itself searches for the right tools rather than adding all of them upfront.
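Several of the strategies above boil down to the same mechanic: filtering a large catalog down to a relevant subset. A minimal sketch of tag-based filtering, using hypothetical tool names and tags, might look like this:

```python
# Hypothetical sketch of tag-based tool filtering: each tool carries tags,
# and the client requests only the tags relevant to the current conversation.

TOOLS = [
    {"name": "create_invoice", "tags": {"finance", "write"}},
    {"name": "list_invoices",  "tags": {"finance", "read"}},
    {"name": "deploy_service", "tags": {"devops", "write"}},
    {"name": "tail_logs",      "tags": {"devops", "read"}},
]

def filter_tools(tools, required_tags):
    """Keep only tools whose tags include every required tag."""
    required = set(required_tags)
    return [t["name"] for t in tools if required <= t["tags"]]

print(filter_tools(TOOLS, {"devops"}))           # ['deploy_service', 'tail_logs']
print(filter_tools(TOOLS, {"finance", "read"}))  # ['list_invoices']
```

A proxy layer or “librarian” sub-agent would apply the same idea dynamically, choosing the required tags (or doing a semantic search over tool descriptions) based on the user’s request before any tool definitions reach the main model.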

Ultimately, context management is key to effective LLM use. LLMs are stateless, and every interaction requires feeding them the necessary information, including tool schemas and instructions. If tools consume a significant portion of the context window, it limits the available “memory” for the conversation itself. By carefully managing the tools provided, you can optimize LLM performance and ensure the most valuable tools are leveraged.
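To see why this matters, here is a rough back-of-the-envelope sketch. It assumes the common heuristic of roughly four characters per token, an illustrative tool schema, and an illustrative context size; real tokenizers and models will differ:

```python
import json

# Rough estimate of how much of the context window tool schemas consume
# before the conversation even starts. Schema and numbers are illustrative.

TOOL_SCHEMAS = [
    {"name": "get_weather",
     "description": "Fetch current weather for a city.",
     "inputSchema": {"type": "object",
                     "properties": {"city": {"type": "string"}},
                     "required": ["city"]}},
    # ...imagine dozens more entries like this one...
]

def approx_tokens(obj) -> int:
    """Very rough token estimate: serialized length divided by 4."""
    return len(json.dumps(obj)) // 4

schema_tokens = sum(approx_tokens(t) for t in TOOL_SCHEMAS)
context_window = 8192  # illustrative model context size
remaining = context_window - schema_tokens
print(f"~{schema_tokens} tokens of schemas, ~{remaining} left for conversation")
```

With dozens of tools, each carrying a description and a JSON schema, this overhead is paid on every single request, which is why trimming the tool list pays off in both cost and quality.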

To build your MCP server to leverage this capability with AI agents, you can use TIBCO Flogo® Connector for Model Context Protocol (MCP) – Developer Preview.

Author:
JenVay Chong

JenVay Chong is a Senior Principal Solutions Architect on the Product Strategy and Adoption Team at TIBCO, with a focus on the TIBCO Platform and Artificial Intelligence. He has 29+ years of hands-on experience managing, leading, architecting, and developing a diverse portfolio of technology projects across many vertical industries. He is a well-rounded architect with a passion for going deep, down to the level of coding with the latest technologies, while also thinking outside the box at the business level, and he holds an MBA. His current passion is everything Artificial Intelligence, and he is constantly testing and pushing the boundary of what Artificial Intelligence can do.