
Agentic systems, currently a significant trend in technology, fundamentally involve leveraging a Large Language Model (LLM) to handle the core reasoning tasks. These non-deterministic agents perform their work by connecting to external systems using protocols such as the Model Context Protocol (MCP), the Universal Tool Calling Protocol (UTCP), or even more traditional REST APIs.
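For instance, here is a minimal sketch of how a capability might be exposed to an agent using the common JSON-Schema-based function-calling convention; the tool name and parameters are hypothetical, and in practice the tool would proxy an MCP server or a REST endpoint:

```python
# Hypothetical tool definition an agent could discover and call.
get_order_status_tool = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the current status of an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The unique identifier of the order.",
                }
            },
            "required": ["order_id"],
        },
    },
}
```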
A key characteristic of these agents is their non-deterministic nature, which is rooted in the probability distribution the underlying neural network samples from when producing its output. This can be seen as a strength or a weakness, depending on how you look at it.
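To make this concrete, the following sketch shows the mechanism in miniature: the model's next token is drawn at random from a temperature-scaled softmax over logits. The toy logits are invented for illustration.

```python
import math
import random

def sample_next_token(logits: dict, temperature: float = 1.0) -> str:
    """Sample one token from temperature-scaled softmax probabilities."""
    # Scale logits by temperature: lower values sharpen the distribution.
    scaled = {tok: l / max(temperature, 1e-6) for tok, l in logits.items()}
    # Softmax (subtract the max for numerical stability).
    m = max(scaled.values())
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # The weighted random choice here is the source of non-determinism.
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Toy next-token distribution after the prompt "The ticket should be ..."
logits = {"closed": 2.0, "escalated": 1.5, "deleted": 0.5}
print([sample_next_token(logits, temperature=0.9) for _ in range(5)])
# Repeated runs yield different sequences, e.g. ['closed', 'escalated', ...]
```

Note that production inference stacks add further variability on top of this (floating-point and batching effects, for example), which is why even a temperature of 0 does not guarantee identical outputs.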
In use cases where the agent is used to discover new ideas, you will definitely want it to go off script, exploring and producing output that perhaps has never been thought of before. Even in this use case, though, a human component is almost inevitable: someone has to evaluate the output for usefulness.
However, in cases where the agent is expected to act within well-defined boundaries, this same trait is often frustrating. It is almost like building an airplane and expecting it to always travel on the ground. For example, consider using a non-deterministic agent to manage a system. The system expects specific processes and procedures to be executed, perhaps even in very specific sequences. Run enough iterative tests and you will almost certainly find the agent going off course, and perhaps even failing to perform the required request. Worse still, it might perform a task that is totally undesirable and corrupts crucial records in the system. So, what solutions exist for cases like these?
Due to this inherent unpredictability, which draws a parallel to human free will and decision-making, the traditional testing and observability cycle is inadequate. It is simply not humanly possible to account for 100% of all scenarios. However, there are several controls and mechanisms that can be implemented:
- Endpoint Tightening: The endpoints exposed to the agent absolutely need to implement the strictest possible checks on the input and its dependencies prior to execution, and every rejection needs to return a clear explanation the agent can act on (a minimal sketch follows this list).
- Prompting: Utilizing system prompts and supplemental prompts to guide the agent’s behavior (an illustrative prompt follows this list).
- Parameter Adjustments: Use settings like temperature to influence the output’s variability, as in the sampling sketch above. A low temperature close to 0 makes the output more focused and predictable, while a high temperature close to 1 increases creativity and randomness. It is important to note that setting the temperature to 0 does not guarantee absolute predictability.
- Guardrails: Establishing boundaries and rules to prevent undesirable or harmful outputs, for example by allow-listing the actions an agent may execute (see the sketch after this list).
- Checks and Evaluations: Implementing input/output checks at the agent level and utilizing evaluations, which may involve another agent or a human-in-the-loop, to ensure quality and adherence to goals (a gating sketch follows this list).
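Here is a minimal sketch of endpoint tightening, assuming a hypothetical order-management handler: every input and dependency is validated before execution, and each rejection carries an explanation the agent can act on.

```python
from dataclasses import dataclass

VALID_STATUSES = {"open", "closed", "escalated"}

@dataclass
class Response:
    ok: bool
    message: str

def update_order_status(order_id: str, new_status: str, orders: dict) -> Response:
    """Hypothetical endpoint handler with strict pre-execution checks."""
    # Validate the input itself before touching any state.
    if not order_id or not order_id.isalnum():
        return Response(False, "Rejected: order_id must be a non-empty alphanumeric string.")
    if new_status not in VALID_STATUSES:
        return Response(False, f"Rejected: status must be one of {sorted(VALID_STATUSES)}.")
    # Validate the dependency: the order must already exist.
    if order_id not in orders:
        return Response(False, f"Rejected: no order found with id '{order_id}'.")
    orders[order_id]["status"] = new_status
    return Response(True, f"Order {order_id} set to {new_status}.")
```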
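The prompting control is the simplest to sketch: a system prompt that pins the agent to its defined role. The wording here is illustrative, not a recommended recipe.

```python
# Illustrative system prompt pinning the agent to a narrow, defined role.
SYSTEM_PROMPT = """You are an order-management assistant.
You may ONLY call the tools provided to you; never improvise an action.
If a request falls outside order management, refuse and explain why.
Ask for clarification whenever a required parameter is missing."""
```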
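For guardrails, one common pattern is to interpose an allow-list between the agent’s proposed action and the system, so anything outside the defined boundary is blocked before it runs. The action names below are hypothetical.

```python
# Hypothetical allow-list guardrail: only these actions may ever reach the
# system, regardless of what the agent proposes.
ALLOWED_ACTIONS = {"get_order_status", "update_order_status"}

def execute_with_guardrail(action: str, args: dict, registry: dict) -> dict:
    """Run an agent-proposed action only if it sits inside the boundary."""
    if action not in ALLOWED_ACTIONS:
        # Block and explain rather than failing silently.
        return {"ok": False, "message": f"Guardrail: action '{action}' is not permitted."}
    return {"ok": True, "result": registry[action](**args)}

# Example: the agent hallucinates a destructive action; the guardrail blocks it.
registry = {"get_order_status": lambda order_id: "open"}
print(execute_with_guardrail("delete_all_orders", {}, registry))
```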
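Finally, a sketch of an output check with human-in-the-loop escalation; the `judge` function below is a toy stand-in for a second evaluating agent.

```python
def judge(task: str, output: str) -> float:
    """Toy stand-in for an evaluator, returning a 0-1 quality score. In a real
    system this would be a second agent (an LLM prompted to grade the output)
    or a human reviewer."""
    # Hypothetical heuristic: an empty answer scores zero; everything else
    # gets a middling score that a real judge would refine.
    return 0.0 if not output.strip() else 0.6

def gate_output(task: str, output: str, threshold: float = 0.8) -> str:
    """Accept high-scoring outputs; escalate the rest to a human-in-the-loop."""
    return "accept" if judge(task, output) >= threshold else "escalate_to_human"
```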
It is important to note that even with all of the above controls in place, there is still a chance the agent will provide an unexpected response. So, for use cases that demand 100% predictable, highly deterministic outcomes, the suitability of an LLM-based agentic system should be reconsidered.
Author:
JenVay Chong is a Senior Principal Solutions Architect and is part of the Product Strategy and Adoption Team at TIBCO, with a focus on the TIBCO Platform and Artificial Intelligence. He has 29+ years of hands-on experience managing, leading, architecting, and developing a diverse portfolio of technology projects across many vertical industries. He is a well-rounded architect with a passion for getting really in-depth, down to the level of coding with the latest technologies, but at the same time he loves to think outside the box all the way up at the business level, with an MBA under his belt. His current passion is everything Artificial Intelligence, and he is constantly trying to test and push the boundary of what Artificial Intelligence can do.




