Agent verification

From Wikipedia, the free encyclopedia

Agent verification is the activity of gaining assurance that purposeful artificial constructs act in accordance with their specifications. While primitive forms of inorganic agents have been used in manufacturing for centuries, the study of artificial agents did not begin until the mid-20th century. Foundational work on such agents was closely bound up with the emergence of artificial intelligence as an academic discipline. Early agents deployed in industrial control systems and in computing, however, were often controlled by quite simple logic, not involving artificial intelligence as such. When deployed as part of a multi-agent system, even such simple agents could require special agent-oriented testing methods, as their collective behaviour was challenging to verify with traditional testing techniques. Difficulties in providing assurance that agents will not behave in dangerous ways became more prevalent after the introduction of LLM agents, especially after the rapid acceleration of their deployment in 2025.

The verification of agent behaviour can be conducted by formal or informal methods. Informal verification requires less mathematical skill, but when agents are part of systems where errors carry significant risks, such as danger to human life, environmental damage or major financial loss, formal verification is preferred. Both regulators and system designers favour formal verification, as it provides a high degree of mathematical certainty. It is not, however, always possible to formally verify all aspects of an agent-based system's behaviour, especially where newer LLM-based agents are concerned, due in part to their high degree of autonomy. Accordingly, agent verification for low-impact deployments might be carried out only with informal methods, while for high-impact deployments it may be performed with a mix of formal and informal techniques.

Terminology

In academia, the term agent verification is often defined to mean activity concerned with gaining assurance that an agent behaves in accordance with its specification, whether by processes such as testing or simulation. 'Verification' is typically contrasted with 'validation', the latter meaning activity concerned with checking that the specification itself meets user or real-world needs. Such definitions are not universally adhered to, however; in some workplaces and documents, the words 'verification' and 'validation' are used synonymously. Efforts to gain confidence in agents have intensified sharply since 2025 due to the rapid rollout of LLM agents, and different terms are sometimes used in the commercial sector. Here the term 'agent verification' can be used in the same sense as in academia, but the same activity is sometimes covered by more ambiguous and wider-ranging terms such as 'agent governance', 'agent observability' or 'AI agent policing'.[1][2][3]

History

Classical agents

A classical agent with simple if-then logic

The theoretical underpinnings for artificial (inorganic) agents emerged in the mid-20th century, with the establishment of cybernetics and artificial intelligence. Oliver Selfridge's 1958 paper Pandemonium: A Paradigm for Learning was an important early theoretical contribution to agent-oriented architecture. Practical implementations of agents for real-world applications began to become widespread in the 1990s, after the introduction of the belief–desire–intention (BDI) software model and agent-oriented programming. Purely digital agents were deployed in computer infrastructure for purposes such as monitoring, while agents connected to real-world sensors and actuators were increasingly used in industrial control systems.[1][4][5]

While the concept of artificial agents was interwoven with early artificial intelligence studies from the start, early agents lacked general-purpose reasoning capabilities, often having only simple if–then logic. Even a device as simple as a thermostat, which has a sensor and a means of acting, can be considered a proto-agent in this sense. Verifying the behaviour of a simple single-agent system is not generally difficult, but it can be a different matter when several simple agents coexist in the same system. Craig Reynolds's work on boids showed that relatively complex, "intelligent" behaviour can emerge from a number of such simple agents working together in a multi-agent system (MAS). By the 1990s, even the behaviour of a single-agent system could sometimes be quite complex; in accordance with the belief–desire–intention model, agents could have beliefs that might evolve over time. Agents controlled by quite large decision tree models were increasingly introduced, bringing new vulnerabilities to adversarial attack. It was becoming increasingly apparent that traditional software verification methods had limitations for testing such agents, or even for more primitive agents when deployed as part of a MAS.[4][3][1]
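The thermostat mentioned above can be sketched in a few lines of code. This is an illustrative example only (the class and method names are invented for this sketch), showing why such a proto-agent is easy to verify: its entire policy is a single if–then rule over one sensed value.

```python
class Thermostat:
    """A proto-agent: one sensor reading, one if-then rule, one actuator command."""

    def __init__(self, target_c: float, tolerance_c: float = 0.5):
        self.target_c = target_c
        self.tolerance_c = tolerance_c

    def act(self, sensed_temp_c: float) -> str:
        # The whole "policy" is one if-then rule, so its behaviour can be
        # checked exhaustively over the three possible regions of input.
        if sensed_temp_c < self.target_c - self.tolerance_c:
            return "heat_on"
        if sensed_temp_c > self.target_c + self.tolerance_c:
            return "heat_off"
        return "hold"


agent = Thermostat(target_c=20.0)
print(agent.act(18.0))  # heat_on
print(agent.act(21.0))  # heat_off
```

Verifying one such agent is trivial; as the article notes, the difficulty arises when many of them interact in a shared environment.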

It was the use of agents in industrial control systems, sometimes associated with robotics, that lent urgency to the practice of agent verification. Informal testing might be acceptable for digital agents used, say, to monitor whether each of an organisation's computers is properly licensed. But with an increasing potential for faulty agents to cause a failure such as a large fire at a chemical manufacturing plant, a botched medical operation or even a crashed aircraft, the need to develop reliable means of verifying the behaviour of such agents was considered urgent. The Foundation for Intelligent Physical Agents was established in 1996. In the late 1990s, a growing number of industry- and university-based scientists began working on the problem, with researchers publishing papers on the verification of both single- and multi-agent systems. Much of this work showed how formal verification techniques like model checking could be used to gain a high level of assurance that agent-based systems would conform with their specifications. A 2018 systematic review covering 231 studies found that model checking was the most common technique for agent verification, with theorem proving the second most commonly used formal method. In the first two decades of the 21st century, agents run by AI became more common, with Siri and Alexa being well-known examples; but such agents still lacked general reasoning capabilities and did not pose pressing new problems for agent verification.[6][4][3][1]

General purpose reasoning agents

The advent of LLMs created huge potential for further use of artificial agents, as agents based on them could have general-purpose cognitive abilities. Agents run by LLMs (and occasionally non-LLM foundation models) share the vulnerability to adversarial attack of those run by decision tree models. The wider scope of actions available to LLM agents has created new challenges for their verification, over and above those present for classical agents. For example, the LLM's neural network operates over effectively infinite input and state domains, a particular challenge for traditional formal verification techniques. Academics began to study the problems involved in verifying LLM agents from 2018. Deployment of such agents began to accelerate in late 2023 after OpenAI's "function-calling" API was made available, and especially after Anthropic's late-2024 introduction of the Model Context Protocol (MCP), a standardised way for LLM agents to gain contextual awareness and to act on the world by calling various external tools. The rapid rollout of LLM agents following MCP's release has seen the task of agent verification receive increased attention within academia and from the private sector. In 2024 and 2025, several startups focusing on LLM agent verification were founded in both Europe and the US to meet growing demand.[2][4][3][7]

Approaches

Formal verification

jEdit, a component of the Isabelle theorem prover's interface, one of many tools that can help with the formal verification of agents.

Formal verification involves proving the correctness of some or all aspects of a system using mathematical methods. Such methods range from manual formal proof to verification assisted by automated theorem provers like Isabelle. For agent verification, model checking is by far the most frequently used formal method; for pre-LLM models it was often complemented with techniques based on computation tree logic. Another common method is theorem proving. Formal verification provides a higher degree of confidence than informal methods, but it is not always used, even when it is possible. Sometimes a person or organisation developing software agents lacks the necessary skills, or may not consider the effort worthwhile if the agents cannot cause much harm even when they malfunction. When agents are deployed in systems where errors could have serious consequences, the mathematical certainty offered by formal verification tends to be strongly preferred by both regulators and designers. But even for high-impact systems, formal verification can be challenging.[3][1][6]
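The core idea of model checking, exhaustively exploring every reachable state of a system and checking a property in each, can be illustrated with a minimal explicit-state checker. This is a toy sketch, not any of the tools named in this article; the agent modelled (a battery-powered robot whose policy is to move only when its charge would stay positive) is invented for the example.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Breadth-first exploration of all reachable states; returns a
    counterexample path if the invariant fails anywhere, else None."""
    frontier = deque([(initial, (initial,))])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if not invariant(state):
            return path  # counterexample trace for the designer to inspect
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + (nxt,)))
    return None  # invariant holds in every reachable state


# Toy agent: state = battery charge 0..5. Actions: move (costs 1 unit,
# only taken if charge stays positive) or recharge (+2, capped at 5).
def successors(charge):
    options = []
    if charge > 1:
        options.append(charge - 1)   # move
    options.append(min(5, charge + 2))  # recharge
    return options


# Safety property: the agent never fully drains its battery.
print(check_invariant(5, successors, lambda c: c > 0))  # None -> property holds
```

Real model checkers such as MCMAS work on the same principle but use symbolic representations and richer logics to handle vastly larger state spaces.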

Developers can try to design agent-based systems in such a way as to make it easier to formally verify more of their properties. This can be achieved both by specially designing the agent scaffolding around the core cognitive model controlling the agent, and, in the case of LLMs or other foundation models, by altering the training. An example of the scaffolding-oriented approach is the introduction of governor or arbitrator agents, which can veto actions initiated by the more complex core decision-making model. Governor agents tend to be relatively simple and are therefore easier to formally verify. While this approach has been in use since at least 2009, its refinement after 2018 with the LLM-as-a-judge paradigm has seen it gain in importance. Yet even with these techniques, formal verification is not always achievable. The computation involved may have a prohibitively high cost (it might require whole datacentres' worth of computers running for several years). The necessary mathematics may be very hard even to formulate. For example, even for simple agents in a MAS, if an indeterminate number of agents can join the system at runtime, the problem is generally undecidable, though solutions are available for specific cases. Even more so in the case of LLM agents, aspects of their verification can be fully undecidable by formal mathematics, or at least NP-hard. Accordingly, formal verification is often complemented with informal methods.[1][7][8][3]
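The governor-agent pattern described above can be sketched as follows. All names here (GovernorAgent, core_model_propose, the allow-list) are hypothetical, and the complex core model is replaced by a stub; the point is that the governor's veto rule is small enough to be verified exhaustively, regardless of how opaque the core model is.

```python
# Actions the governor has been verified to consider safe.
SAFE_ACTIONS = {"read_file", "summarise", "search"}


class GovernorAgent:
    """Deliberately simple, so its veto logic can itself be formally verified."""

    def permits(self, action: str) -> bool:
        return action in SAFE_ACTIONS


def core_model_propose(task: str) -> str:
    # Stand-in for a complex core decision-making model (e.g. an LLM),
    # which may propose unsafe actions.
    return {"report": "summarise", "cleanup": "delete_files"}.get(task, "search")


def run(task: str, governor: GovernorAgent) -> str:
    proposed = core_model_propose(task)
    if governor.permits(proposed):
        return f"executed:{proposed}"
    return f"vetoed:{proposed}"


gov = GovernorAgent()
print(run("report", gov))   # executed:summarise
print(run("cleanup", gov))  # vetoed:delete_files
```

The design choice is to move the safety argument from the unverifiable core model onto a small, auditable component sitting between the model and its actuators.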

Identity verification

The rapid uptake of LLM agents has made confirming their identity a pressing problem, as it cannot always be done reliably and at scale with traditional methods. While not always considered part of formal verification, rigorous implementations of identity verification can provide a comparable level of certainty. The identity verification provider Vouched is an example of a company offering products to assist with LLM agent identity verification, entering this area in 2025.[9][10]

Informal verification

A variety of informal methods can be used to evaluate agents. Software inspection and heuristic evaluation are among the most informal, but often demand considerable skill on the part of the assessors to achieve a high degree of assurance, except in trivial cases where only a small amount of simple code needs to be evaluated. The discipline of software testing includes various other forms of testing, such as functional testing, that do not require the mathematical rigour of formal methods but are often carried out in a relatively structured way. For agent verification, though, a major challenge with traditional informal methods is the difficulty of writing test cases that cover all the scenarios that may arise. When different agents must interact to achieve the system's overall goals, the number of permutations can become extremely high, even if all agents are individually quite simple. Complementing traditional software verification methods with runtime verification[note 1] can further increase confidence that an agent-based system conforms with its specification.[11][1]
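Runtime verification can be illustrated with a minimal monitor. Rather than proving a property in advance, the monitor observes the agent's action stream as it executes and flags any trace that violates the property; the property checked here ("no write action may occur after permission has been revoked") and the event names are invented for this sketch.

```python
def monitor(trace):
    """Check the trace against the property: once 'revoke' is seen,
    no subsequent 'write' is permitted."""
    revoked = False
    for step, event in enumerate(trace):
        if event == "revoke":
            revoked = True
        elif event == "write" and revoked:
            return f"violation at step {step}"
    return "ok"


print(monitor(["write", "read", "revoke", "read"]))  # ok
print(monitor(["write", "revoke", "write"]))         # violation at step 2
```

In a deployed system such a monitor would run alongside the agent and could halt it or raise an alert at the first violating step, rather than checking a completed trace after the fact.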

Simulation

Agent-based simulations, commonly known as agent-based models (ABMs), have existed since the 1980s, though initially they were used largely for modelling macro phenomena such as the spread of disease or activity on financial markets, rather than to study the behaviour of agents themselves. This began to change in the 1990s, with the Java-based JAMES platform an early example of a simulation being used to verify the behaviour of mobile agents. Such use of simulations remained quite niche until foundation models appeared; the rapid rollout of LLM agents in the 2020s has increased it. While simulations do not provide the same level of mathematical certainty as a formal proof, they are well suited to meeting some of the challenges of verifying LLM agents that are impossible, or at least difficult, to address with formal or other informal verification methods. An example of a company offering a digital simulation platform for the verification of agents is Conscium, founded by Daniel Hulme and Calum Chace. Its VerifyAX platform allows agent behaviour to be verified against standard benchmarks along with other criteria, such as policy compliance, robustness under adversarial or ambiguous input, and avoidance of behavioural drift over time.[2][12][1]
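The methodology of simulation-based verification can be sketched in miniature: instead of proving a property over all possible behaviours, many randomised episodes of a toy multi-agent system are simulated, and the desired property is checked in every observed run. The scenario below (agents booking slots from a shared pool) is invented for illustration and bears no relation to any real platform.

```python
import random


def run_episode(n_agents, n_slots, rng):
    """Each agent inspects the slots in a random order and books the
    first free one it finds. Returns the slot -> agent bookings."""
    bookings = {}
    for agent in range(n_agents):
        for slot in rng.sample(range(n_slots), n_slots):
            if slot not in bookings:
                bookings[slot] = agent
                break
    return bookings


rng = random.Random(0)  # fixed seed so runs are reproducible
violations = 0
for _ in range(1000):
    b = run_episode(n_agents=4, n_slots=6, rng=rng)
    # Property under test: no slot is double-booked and every
    # agent obtained exactly one slot.
    if len(b) != 4:
        violations += 1
print("violations:", violations)
```

Observing zero violations across many episodes builds confidence, but, as the article notes, it is weaker than a proof: an untested ordering of events could still violate the property.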

Tools

  • MCMAS – a symbolic model checker for multi-agent systems supporting temporal, epistemic and strategic logics.[13]
  • MCMAS-OP – an extension of MCMAS for open multi-agent systems where agents may join or leave at runtime. (Open source)[8]
  • AJPF (Agent JPF) – an extension of Java Pathfinder for model checking agent programming languages.[14]
  • VerifyAX – a simulation-based verification platform for AI agents.[2]

Notes

  1. Runtime verification may fall under either formal or informal verification, depending on the specific implementation.

References
