The principles that guided the creation of robust and reliable APIs are finding new relevance in the design of Amazon Bedrock Agents. As we transition from traditional software libraries to AI-powered agents, the fundamental concepts of clear contracts, consistency, and orthogonality remain crucial. This blog explores the similarities between designing APIs and Bedrock Agents, highlighting why these time-tested principles are more important than ever in the age of AI.
Agentic is the Shiny Buzzword of the Day
In the roller coaster that is AI these days, buzzwords seem to have a lifetime of about a month. The most hyped buzzword right now is “agentic”, which implies a relationship with GenAI agents. Just as almost all software products currently claim to have AI components, most of those components now claim to be agentic. Even the just-announced Alexa Plus is described by Amazon as agentic. So what does that even mean?
How has Agentic Infrastructure Evolved?
The evolution of Large Language Model (LLM) interactions has seen rapid advancements in recent years. Initially, developers made direct calls to LLMs, providing prompts and receiving generated responses. This approach, while powerful, was limited by the model's training data cutoff. The introduction of Retrieval-Augmented Generation (RAG) marked a significant leap, allowing LLMs to access external, up-to-date information to enhance their responses. As RAG gained traction, the process was further streamlined with the development of managed knowledge bases and tools, automating the retrieval and integration of relevant information. The latest advancements have introduced sophisticated capabilities like Multi-Agent Orchestration, where multiple specialized AI agents collaborate to tackle complex tasks, and Inline Agents, which offer dynamic, runtime configuration of AI assistants. These developments have dramatically expanded the scope and flexibility of LLM applications, enabling more intelligent, context-aware, and adaptable AI systems.
An AI agent is essentially a technology system that can decide, act, and learn without constant human interaction; that is, it is semi- or fully autonomous. The system is composed of both ML/AI models and traditional software components. AI agents are typically built around a specific objective or set of objectives: completing particular tasks or workflows, or performing analysis and driving decisions that achieve business goals.
Agents can in turn use one or more knowledge bases to access additional, possibly specialized, knowledge, and one or more tools that provide access to external APIs.
What’s The Catch?
Whenever a new technology starts to take hold, there is a period in which people have to learn how to design and build with it. We're now in that learning period for agents.
Consider three pieces of functionality: booking a hotel, ordering room service and having a concierge book theatre tickets. These three functions are closely related and could be packaged in various ways.
- Option 1: a monolith of hand-coded Python and LLMs
- Option 2: an LLM calling an agent that has three tools
- Option 3: an LLM that can call one of three agents, each of which has a single tool
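The agent-based options above can be sketched as plain data structures. This is a minimal illustration only: the `Tool` and `Agent` classes, the `routing_choices` helper, and every name and description below are hypothetical and are not part of the Amazon Bedrock API.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """Hypothetical wrapper around one external API call."""
    name: str
    description: str


@dataclass
class Agent:
    """Hypothetical agent: a description plus the tools it may call."""
    name: str
    description: str
    tools: list[Tool] = field(default_factory=list)


book_hotel = Tool("book_hotel", "Reserve a hotel room for given dates")
room_service = Tool("order_room_service", "Order food or drinks to a room")
theatre = Tool("book_theatre_tickets", "Have the concierge book theatre tickets")

# Option 1 (the monolith) has no agent structure to model:
# the routing lives in hand-written Python.

# Option 2: one agent that owns all three tools.
option_2 = [
    Agent(
        "hospitality",
        "Handles hotel bookings, room service, and concierge requests",
        [book_hotel, room_service, theatre],
    )
]

# Option 3: three agents, each with a single tool.
option_3 = [
    Agent("reservations", "Books hotel rooms", [book_hotel]),
    Agent("dining", "Takes room service orders", [room_service]),
    Agent("concierge", "Books theatre tickets", [theatre]),
]


def routing_choices(agents: list[Agent]) -> tuple[int, int]:
    """Return (number of agents to pick from, largest tool set inside one agent)."""
    return len(agents), max(len(agent.tools) for agent in agents)
```

Under these assumptions, Option 2 pushes the selection work down to the tool level inside one agent, while Option 3 moves it up to the level of agent descriptions.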
Each of these options could produce a working and sustainable system. They would have different characteristics and different maintenance requirements, but each could work. So how do we decide which approach to take?
Learn From The Past
Here is where we can learn from the past by recalling how we used to create software libraries. The current generation of software developers takes for granted the existence of a broad range of full-featured libraries in the language of their choice, be it Java, Python, or even C/C++/C#. But someone had to create those libraries that we all now enjoy. How did they do that?
Software designers traditionally created function libraries as collections of pre-written code to solve common problems and reduce code duplication. These libraries typically contained a set of related functions or classes that could be reused across multiple projects. Creating a function library involved carefully designing and documenting each function so that it had a clear purpose and was easy to use. Designers aimed to make functions modular and reusable, following the principle of “don't repeat yourself” (DRY) to minimize boilerplate code. They would often break down complex problems into smaller, manageable pieces, with each function handling a specific task.
Socially, there was a hierarchy: some developers had the skill to create libraries, while others only had the skill to use them. These same patterns are repeating themselves in the agentic world.
One of the areas most in flux for agentic design is coupling and cohesion, two fundamental principles of software design that shape the structure and quality of code. Coupling is the degree of interdependence between modules or components; low coupling is desirable because it keeps a system flexible and maintainable. Cohesion measures how closely related the responsibilities within a module are; high cohesion produces focused, understandable code.
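As a small illustration of these two ideas in ordinary library code, the sketch below keeps each class focused on one responsibility (high cohesion) and lets classes talk only through a narrow public method (low coupling). The `Billing` and `RoomService` classes are hypothetical and exist only for this example.

```python
class Billing:
    """Owns the ledger; other classes never touch it directly."""

    def __init__(self) -> None:
        self._ledger: dict[str, list[float]] = {}

    def charge(self, room: str, amount: float) -> None:
        """Record a charge against a room."""
        self._ledger.setdefault(room, []).append(amount)

    def total(self, room: str) -> float:
        """Sum of all charges recorded for a room."""
        return sum(self._ledger.get(room, []))


class RoomService:
    """High cohesion: this class only deals with room service orders."""

    def __init__(self, billing: Billing) -> None:
        self._billing = billing

    def order(self, room: str, item: str, price: float) -> str:
        # Low coupling: uses Billing's public method instead of
        # reaching into its private ledger.
        self._billing.charge(room, price)
        return f"{item} ordered for room {room}"
```

A tightly coupled alternative would have `RoomService` append to `Billing._ledger` directly, so any change to how charges are stored would break the ordering code as well.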
In the agentic world this expresses itself in the three options described earlier. One can have an agent with multiple tools and knowledge bases, or multiple agents each with their own tool or knowledge base. One can even have multiple Supervisor Agents each with access to their own set of agents. As a community we have yet to agree on best practices for these decisions.
Naming
One aspect to keep in mind when comparing software libraries with agents is who the consumer is. A human examines the documentation of a software library and decides which functions to call. With agents, however, it is usually another agent (a Supervisor Agent) that scans the agent descriptions to decide which agents are appropriate for a given user request.
When using a software library function, a coder makes an explicit call to that function from within the library. They may get an assist from an AI code helper like Q Developer or Copilot, but it is still an explicit line of code calling a particular function.
When using a collection of agents, the Supervisor Agent searches the supplied agent descriptions to select which sub-agent to invoke. This selection is dynamic, and it is implicit rather than explicit. Designing the agent description is a new type of skill, but it is just an extension of naming. Donald Knuth's literate programming asks authors to choose names carefully so that code reads like prose; a common convention in that spirit is to name objects (variables) like nouns and functions (methods) like verbs. Good method names were readable, like “getWindSpeed” or “determineMaxPressure”. These names told you what the function did so that the coder could make an explicit selection.
Agent descriptions, on the other hand, are sentences like “Agent that generates creative social media content” or “Agent that predicts social media post performance”. These descriptions are input to an LLM that makes a similarity search for the best agent to call. Similarly, tool, action group, and knowledge base descriptions are inputs to the agent’s similarity search.
Design Patterns: Conflicting Pressures
A commonality in all types of software design is the notion of competing pressures. Software has various “ilities” (maintainability, debuggability, etc.), and it’s rare that one totally overrides another. This is the “art” part of software design: balancing competing pressures.
In both function and agent design, clear contracts and descriptions are critical. We used to say that functions should do one thing and do it well. Over ten years ago I wrote a blog post about this, and it's still relevant. In the agentic world this clarity is focused on the agent, tool, or knowledge base description, but clarity and simplicity are just as important.
Simplicity and clarity in function libraries exist to make function selection easier. Imagine a library with functions “process_data” and “handle_data”. A developer would have to look at the function details to determine which of these functions (if either) was appropriate for a given situation.
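The naming problem is easy to see in code. In the sketch below the function bodies are invented purely for illustration; the point is that the first pair of names hides what each function does, while the second pair states it.

```python
# Ambiguous: nothing in the names says what happens to the data.
def process_data(readings: list[float]) -> list[float]:
    return [r for r in readings if r >= 0]


def handle_data(readings: list[float]) -> float:
    return sum(readings) / len(readings)


# Clearer: the same (hypothetical) behavior, named for what it does.
def drop_negative_readings(readings: list[float]) -> list[float]:
    """Return only the readings that are zero or positive."""
    return [r for r in readings if r >= 0]


def mean_reading(readings: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of readings."""
    return sum(readings) / len(readings)
```

A call such as `mean_reading(drop_negative_readings(values))` can be understood without opening either function.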
This is also true in the agentic world, but there performance can be affected as well. If an agent’s description becomes too complex, it can both slow the Supervisor Agent’s selection process and introduce errors. Consider the following hypothetical agent code: