Getting Started with Agentic AI on AWS

Whether you're new to AI agents or looking to optimize your existing solutions, this blog provides valuable insights into everything from Retrieval-Augmented Generation (RAG) and knowledge bases to multi-agent orchestration and practical use cases, helping you make informed decisions about implementing AI agents in your organization.

This blog explores the various components and architectures that have emerged in this rapidly evolving domain, including Retrieval-Augmented Generation (RAG), knowledge bases, and tools. It delves into the intricacies of agent systems, discussing both single-agent and multi-agent approaches, and examines the delicate balance between comprehensive information gathering and token efficiency. The post also covers key features such as prompt routing, caching, and guardrails, providing insights into their implementation and benefits. By analyzing these developments, we aim to provide a comprehensive overview of the current state of LLM interactions and their potential future directions.

The various options are discussed in turn, and code samples for each are presented in the Appendix.

How has agentic infrastructure evolved?

The evolution of Large Language Model (LLM) interactions has seen rapid advancements in recent years. Initially, developers made direct calls to LLMs, providing prompts and receiving generated responses. This approach, while powerful, was limited by the model's training data cutoff. The introduction of Retrieval-Augmented Generation (RAG) marked a significant leap, allowing LLMs to access external, up-to-date information to enhance their responses. As RAG gained traction, the process was further streamlined with the development of managed knowledge bases and tools, automating the retrieval and integration of relevant information. The latest advancements have introduced sophisticated capabilities like Multi-Agent Orchestration, where multiple specialized AI agents collaborate to tackle complex tasks, and Inline Agents, which offer dynamic, runtime configuration of AI assistants. These developments have dramatically expanded the scope and flexibility of LLM applications, enabling more intelligent, context-aware, and adaptable AI systems.

As with most designs and architectures in our field, agents must balance two competing pressures. The first is the desire to gather the most complete information possible, so as to generate the most accurate and helpful results. The second is the desire to avoid token explosion (and its attendant higher cost and higher chance of hallucination) caused by providing too much information to the model. The evolution of agent and multi-agent approaches is a series of attempts to balance these design pressures while avoiding excess complexity in the actual software.

Depending on the business problem to solve, the agent system might be a single agent or multi-agent (sometimes abbreviated MA). In Multi-Agent Systems, a group of agents, each specialized in a specific task, works together to complete a full workflow.

Multi-Agent Orchestration, for example, can have multiple levels: an Orchestrator Agent selecting from multiple Supervisor Agents, each of which selects one or more worker Agents, each of which might use one or more knowledge bases and/or tools. This level of complexity can make observability and evaluation difficult.

What is an AI agent?

An AI agent is essentially a technology system that can decide, act, and learn without constant human interaction; i.e., it is semi- or fully autonomous. The system is composed of both ML/AI models and traditional software components. Typically, AI agents are used to complete specific tasks or workflows and/or to perform analysis and drive decisions that achieve business goals. AI agents are typically programmed around a specific objective or set of objectives.

Most Caylent customers (although not all) will likely build agents leveraging Amazon Bedrock Agents. An Amazon Bedrock Agent consists of several key components that work together to enable complex, multi-step task execution and intelligent interactions. The main components of an agent are:

Foundation Model (FM)

The agent uses a selected foundation model as its core reasoning engine. This FM is responsible for:

  • Interpreting user requests
  • Breaking down tasks into logical steps
  • Generating responses and follow-up actions

The FM is the component that most people associate with any GenAI system, even though it is actually only one component of such a system.

Prompts

Developers provide instructions that define the agent's purpose and guide its behavior. These instructions act as a prompt to the FM, describing what the agent is designed to do and how it should interact. These instructions can be either a System Prompt or a User Prompt. System prompts define the general behavior of the system, and are often lengthy and highly detailed, containing comprehensive instructions on how the AI should process information and generate responses. They may include rules for citation, formatting, and even personality traits. System Prompts are often not visible to the end user but can be modified when creating a client agent.

User prompts are typically shorter and more straightforward, ranging from simple questions to more complex requests for analysis or content creation. They can vary in complexity but are generally more focused on a specific task or topic.
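As a concrete illustration, here is a minimal sketch of how the two prompt types are supplied separately using the boto3 Converse API (the model ID and prompt text are placeholders):

    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")

    # System prompt: detailed rules governing the agent's overall behavior.
    system_prompt = [{"text": "You are a support assistant. Cite sources. Never give legal advice."}]

    # User prompt: short and focused on a single task.
    messages = [{"role": "user", "content": [{"text": "Summarize our return policy."}]}]

    response = bedrock_runtime.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
        system=system_prompt,
        messages=messages,
    )
    print(response["output"]["message"]["content"][0]["text"])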

Action Groups / Tools

An Action Group is a component that defines specific tasks an agent can perform to assist users. It serves as a bridge between the agent's natural language understanding capabilities and the execution of concrete actions or API calls. Action Groups are composed of a set of actions, typically defined using an OpenAPI schema, and an action executor, usually implemented as an AWS Lambda function.

See Action Groups / Tools for more details

Knowledge Bases

Knowledge bases provide additional context and information to supplement the agent's responses. They allow the agent to:

  • Access and query relevant data sources
  • Perform Retrieval Augmented Generation (RAG) to enhance accuracy
  • Augment responses with domain-specific knowledge

See Knowledge Base for more details

Memory

Agents have both short-term and long-term memory capabilities:

  • Short-term memory retains detailed information relevant to the current conversation
  • Long-term memory stores important facts and summaries from previous interactions

Prompt Templates

Customizable prompt templates allow developers to fine-tune the agent's behavior at different stages of its operation, including:

  • Pre-processing
  • Orchestration
  • Knowledge base response generation
  • Post-processing

By combining these components, Amazon Bedrock Agents can orchestrate complex workflows, interact with enterprise systems, and provide intelligent, context-aware responses to user queries.

Advanced applications for agents

Knowledge Base

An Amazon Bedrock Knowledge Base is a fully managed capability that enables the implementation of Retrieval Augmented Generation (RAG) workflows for generative AI applications. It serves as a crucial component in enhancing the responses of foundation models (FMs) by providing contextual information from an organization's private data sources. Knowledge Bases are essentially vector databases created from source documents, allowing specialized company-specific data to be made available to the Large Language Model (LLM).

To create a Knowledge Base in Amazon Bedrock, users first need to prepare their data source, which can be unstructured (such as documents in an S3 bucket) or structured (like databases in Amazon Redshift or AWS Glue Data Catalog). When setting up the Knowledge Base, users select an embedding model to convert their data into vector embeddings and choose a vector store to index these embeddings. Amazon Bedrock can automatically create and manage a vector store in Amazon OpenSearch Serverless, simplifying the setup process. The following databases may serve as the vector store for a knowledge base: Amazon OpenSearch Serverless, Amazon Aurora PostgreSQL, MongoDB Atlas, Pinecone and Redis Enterprise Cloud.

Once created, a Knowledge Base can be utilized through various operations provided by Amazon Bedrock. The Retrieve operation allows users to query the Knowledge Base and retrieve relevant information, while the RetrieveAndGenerate operation goes a step further by using the retrieved data to generate appropriate responses. For structured data sources, Amazon Bedrock Knowledge Bases can convert natural language queries into SQL queries, enabling seamless integration with existing data warehouses. This capability extends the reach of generative AI applications to include critical enterprise data without the need for extensive data migration or preprocessing.
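A minimal sketch of both operations (the knowledge base ID and model ARN are placeholders for resources created beforehand):

    import boto3

    agent_runtime = boto3.client("bedrock-agent-runtime")

    # Retrieve: return the raw chunks that match the query.
    chunks = agent_runtime.retrieve(
        knowledgeBaseId="KB12345678",  # placeholder
        retrievalQuery={"text": "What is our parental leave policy?"},
    )
    for result in chunks["retrievalResults"]:
        print(result["content"]["text"])

    # RetrieveAndGenerate: retrieve, then let the model answer from the chunks.
    answer = agent_runtime.retrieve_and_generate(
        input={"text": "What is our parental leave policy?"},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": "KB12345678",  # placeholder
                "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
            },
        },
    )
    print(answer["output"]["text"])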

The use of Knowledge Bases in Amazon Bedrock offers several advantages for building generative AI applications. It provides a fully managed RAG solution, automating the end-to-end workflow from data ingestion to retrieval and prompt augmentation. This approach not only improves the accuracy and relevance of AI-generated responses but also enhances transparency through source attribution, helping to minimize hallucinations. Furthermore, Knowledge Bases support multimodal data processing, allowing applications to analyze and leverage insights from both textual and visual data, thereby expanding the scope and capabilities of AI-powered solutions. 

See Knowledge Base Definition for a sample of creating a Knowledge Base.

Action Groups / Tools

Tools, in the context of AI and AI-agentic systems, are software components that allow the AI system to perform certain defined, deterministic tasks such as interacting with external systems or running scripts. They serve as extensions to the model's capabilities, allowing it to perform tasks such as querying databases, making API calls, or accessing real-time information. Developers provide JSON schemas describing each tool's functionality and input requirements. A tool is essentially the OpenAPI definition of a function call.

When creating an Action Group, developers specify the parameters that the agent needs to elicit from users to carry out actions. For example, an Action Group for a hotel booking system might include functions like "CreateBooking," "GetBooking," and "CancelBooking," each with its own set of required parameters such as check-in date, number of nights, or booking reference. The Bedrock agent uses this configuration to determine what information it needs to gather from the user through conversation. Once the necessary details are collected, the agent invokes the associated Lambda function, which contains the business logic to execute the action, such as interacting with backend systems or external APIs. This modular approach allows for flexible and extensible agent capabilities, enabling developers to create sophisticated AI assistants that can perform a wide range of tasks based on natural language inputs. 

See Action Group Definition for an example of defining an Action Group.

Prompt Flow

Prompt Flow is a graphical user interface for designing the orchestration of agents (similar to the Step Functions design tool). A prompt flow consists of a name and description, a set of permissions, a collection of nodes, and the connections between those nodes.

The following node types are available:

  • Input Node: Serves as the entry point for the flow, receiving initial data.
  • Output Node: Acts as the exit point, returning the final result of the flow.
  • Prompt Node: Defines a prompt to use in the flow, either from Prompt Management or inline.
  • Knowledge Base Node: Queries a knowledge base to retrieve relevant information.
  • Agent Node: Utilizes an AI agent to perform complex tasks.
  • Lambda Function Node: Executes custom logic or interacts with external systems.
  • S3 Storage Node: Interacts with Amazon S3 for data storage or retrieval.
  • Condition Node: Directs flow based on specified conditions.
  • Iterator Node: Applies subsequent nodes iteratively to array elements.
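As a sketch of how a flow can also be assembled programmatically, the following defines a minimal Input → Prompt → Output flow with boto3 (the role ARN and model ID are placeholders, and the definition shapes follow the create_flow request structure, which may evolve):

    import boto3

    bedrock_agent = boto3.client("bedrock-agent")

    # A minimal three-node flow: Input -> Prompt -> Output.
    flow = bedrock_agent.create_flow(
        name="summarize-flow",
        executionRoleArn="arn:aws:iam::123456789012:role/BedrockFlowRole",  # placeholder
        definition={
            "nodes": [
                {
                    "name": "FlowInput",
                    "type": "Input",
                    "configuration": {"input": {}},
                    "outputs": [{"name": "document", "type": "String"}],
                },
                {
                    "name": "Summarize",
                    "type": "Prompt",
                    "configuration": {"prompt": {"sourceConfiguration": {"inline": {
                        "modelId": "anthropic.claude-3-haiku-20240307-v1:0",  # placeholder
                        "templateType": "TEXT",
                        "templateConfiguration": {"text": {
                            "text": "Summarize the following: {{input}}",
                            "inputVariables": [{"name": "input"}],
                        }},
                    }}}},
                    "inputs": [{"name": "input", "type": "String", "expression": "$.data"}],
                    "outputs": [{"name": "modelCompletion", "type": "String"}],
                },
                {
                    "name": "FlowOutput",
                    "type": "Output",
                    "configuration": {"output": {}},
                    "inputs": [{"name": "document", "type": "String", "expression": "$.data"}],
                },
            ],
            "connections": [
                {"name": "InToPrompt", "source": "FlowInput", "target": "Summarize",
                 "type": "Data",
                 "configuration": {"data": {"sourceOutput": "document", "targetInput": "input"}}},
                {"name": "PromptToOut", "source": "Summarize", "target": "FlowOutput",
                 "type": "Data",
                 "configuration": {"data": {"sourceOutput": "modelCompletion", "targetInput": "document"}}},
            ],
        },
    )
    print(flow["id"])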

InlineAgent

Inline Agents are a capability to dynamically specify a set of knowledge bases and/or tools with which to respond to a request. Rather than hand-coding a workflow or creating a Prompt Flow, one can let an Inline Agent determine what to do.

The InvokeInlineAgent API in Amazon Bedrock determines which knowledge bases and tools to use through a dynamic and intelligent selection process based on the user's input and the agent's configuration. This process allows for flexible and context-aware responses. Here's how it works:

Dynamic Selection

1. Analysis of User Input: The inline agent analyzes the user's query to understand the context and requirements.

2. Configuration Evaluation: It evaluates the provided configuration in the API call, which includes:

  •  Action groups
  •  Knowledge bases
  •  Instructions

3. Relevance Matching: The agent matches the query against the available resources to determine which are most relevant.

Selection Criteria

  • Action Groups: The agent selects appropriate action groups based on the tasks required to fulfill the user's request.
  • Knowledge Bases: It chooses relevant knowledge bases that contain information pertinent to the query.
  • Instructions: The agent follows the provided instructions to guide its decision-making process.

Example Scenario

When a user asks about a specific topic:

  1. The agent analyzes the query.
  2. It selects the most appropriate action group (e.g., ClaimManagementActionGroup for a claim-related query).
  3. It chooses the relevant knowledge base (e.g., claims documentation).
  4. The agent configures itself on the fly with the selected tools and knowledge.

This dynamic approach allows the inline agent to:

  • Provide focused and relevant responses
  • Adapt to different types of queries within a single conversation
  • Efficiently use resources by only accessing necessary information

By intelligently selecting the right combination of knowledge bases and tools for each query, the InvokeInlineAgent call ensures optimized performance and accuracy in its responses.

Multi-Agent Orchestration

Multi-agent orchestration is an advanced approach to building complex AI systems that leverages multiple specialized agents working together to solve intricate problems and execute multi-step tasks. This collaborative framework enhances the capabilities of individual AI agents by combining their strengths and expertise.

Key Components

Supervisor Agent 

A central agent that coordinates the overall workflow by:

  • Breaking down complex tasks into manageable subtasks
  • Delegating work to specialized agents
  • Consolidating outputs from various agents

Specialist Agents 

Multiple AI agents with specific areas of expertise, designed to handle particular aspects of a given problem.

Inter-Agent Communication

A standardized protocol allowing agents to exchange information and coordinate their actions efficiently.

Benefits

  • Enhanced Problem-Solving: Tackles complex, multi-step tasks more effectively than single-agent systems
  • Improved Accuracy: Combines specialized knowledge from multiple agents
  • Increased Efficiency: Enables parallel processing of subtasks
  • Scalability: Allows for the addition of new specialized agents as needed

Related Features

Prompt Routing

Amazon Bedrock Intelligent Prompt Routing is a feature that optimizes the use of foundation models (FMs) within a model family to enhance response quality while managing costs. This capability, currently in preview, offers a single serverless endpoint for efficiently routing requests between different foundation models. The name is somewhat misleading: it sounds similar to Prompt Flow but is, by contrast, strictly a cost-optimization feature.

Key Features

  • Dynamic Model Selection: The system predicts the performance of each model for every incoming request, choosing the one that's likely to provide the best response at the lowest cost.
  • Model Family Support: During the preview phase, users can select from preconfigured routers for the Anthropic and Meta model families.
  • Cost Optimization: Intelligent Prompt Routing can reduce costs by up to 30% without compromising on accuracy.
  • Performance Improvement: By leveraging multiple models' strengths, it can enhance overall performance for various tasks.

How It Works

  • Model Family Selection: Users choose the model family they want to use (e.g., Anthropic's Claude or Meta's Llama).
  • Request Analysis: For each incoming request, the system predicts the performance of specified models within the chosen family.
  • Optimal Model Selection: Amazon Bedrock dynamically selects the model predicted to offer the best combination of response quality and cost.
  • Request Processing: The chosen model processes the request and returns the response.
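In practice, using a prompt router looks like any other Converse call; the only difference is that the modelId is the ARN of a router rather than a single model. A minimal sketch (the router ARN below is illustrative; during the preview, available routers can be listed with the bedrock client's list_prompt_routers call):

    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")

    # The modelId is the ARN of a preconfigured prompt router rather than
    # a single foundation model (illustrative ARN).
    response = bedrock_runtime.converse(
        modelId="arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/anthropic.claude:1",
        messages=[{"role": "user", "content": [{"text": "What is RAG?"}]}],
    )
    print(response["output"]["message"]["content"][0]["text"])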

Prompt Caching

As with all caching systems, this feature is based on the notion that some requests will be popular and made multiple times. Since LLM requests can be expensive and slow, caching is a method to reduce both cost and latency.
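Amazon Bedrock's prompt caching (in preview at the time of writing) lets you mark a stable prefix, such as a long system prompt, as cacheable across requests. A minimal sketch, assuming the preview cachePoint block shape, which may change:

    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")

    # Placeholder for a long, stable preamble (e.g., policy or reference text).
    LONG_REFERENCE_TEXT = "<several thousand tokens of reference material>"

    # The cachePoint block marks everything before it as cacheable, so
    # subsequent requests can reuse the processed prefix.
    response = bedrock_runtime.converse(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",  # placeholder model ID
        system=[
            {"text": LONG_REFERENCE_TEXT},
            {"cachePoint": {"type": "default"}},
        ],
        messages=[{"role": "user", "content": [{"text": "Summarize section 2."}]}],
    )
    print(response["output"]["message"]["content"][0]["text"])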

Guardrails and Eval/Feedback

Guardrails can be used in multiple ways to help safeguard generative AI applications. For example:

  • A chatbot application can use guardrails to help filter harmful user inputs and toxic model responses.
  • A banking application can use guardrails to help block user queries or model responses associated with seeking or providing investment advice.
  • A call center application to summarize conversation transcripts between users and agents can use guardrails to redact users’ personally identifiable information (PII) to protect user privacy.

Amazon Bedrock Guardrails supports the following policies:

  • Content filters – Adjust filter strengths to help block input prompts or model responses containing harmful content. Filtering is done based on detection of certain predefined harmful content categories - Hate, Insults, Sexual, Violence, Misconduct and Prompt Attack.
  • Denied topics – Define a set of topics that are undesirable in the context of your application. The filter will help block them if detected in user queries or model responses.
  • Word filters – Configure filters to help block undesirable words, phrases, and profanity (exact match). Such words can include offensive terms, competitor names, etc.
  • Sensitive information filters – Configure filters to help block or mask sensitive information, such as personally identifiable information (PII) or custom regexes, in user inputs and model responses. Blocking or masking is based on probabilistic detection of sensitive information in standard formats, such as Social Security numbers, dates of birth, and addresses. Regular-expression-based detection of custom identifier patterns can also be configured.
  • Contextual grounding check – Help detect and filter hallucinations in model responses based on grounding in a source and relevance to the user query.
  • Image content filter – Help detect and filter inappropriate or toxic image content. Users can set filters for specific categories and set filter strength.
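Guardrails can be attached to model invocations and agents, or evaluated directly with the ApplyGuardrail API. A minimal sketch of checking a user input against an existing guardrail (the guardrail ID and version are placeholders; see the Appendix for creating the guardrail itself):

    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")

    # Evaluate a user input against a guardrail before sending it to a model.
    result = bedrock_runtime.apply_guardrail(
        guardrailIdentifier="gr-example123",  # placeholder
        guardrailVersion="1",                 # placeholder
        source="INPUT",
        content=[{"text": {"text": "Which stocks should I buy this week?"}}],
    )
    if result["action"] == "GUARDRAIL_INTERVENED":
        print("Blocked:", result["outputs"])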

Feedback

In the context of Amazon Bedrock Agents, several feedback mechanisms can be employed to enhance the agent's performance and accuracy. These mechanisms allow for continuous improvement and adaptation of the agent's responses based on various inputs and evaluations.

Prompt Modification

One of the primary feedback mechanisms in Bedrock Agents is prompt modification. This technique involves adjusting the base prompt templates to fine-tune the agent's behavior and responses.

Base Prompt Templates

Bedrock Agents come with four default base prompt templates:

  1. Pre-processing
  2. Orchestration
  3. Knowledge base response generation
  4. Post-processing (disabled by default)

By modifying these templates, developers can enhance the agent's accuracy and tailor its behavior to specific use cases.
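A sketch of overriding one of these templates on an existing agent via promptOverrideConfiguration (the agent ID, role ARN, and template text are placeholders; in practice you would start from the agent's default template and edit it rather than write one from scratch):

    import boto3

    bedrock_agent = boto3.client("bedrock-agent")

    # Placeholder: an edited copy of the default orchestration template.
    CUSTOM_ORCHESTRATION_TEMPLATE = "<edited orchestration prompt template>"

    bedrock_agent.update_agent(
        agentId="AGENT123",  # placeholder
        agentName="support-agent",
        agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",  # placeholder
        foundationModel="anthropic.claude-3-sonnet-20240229-v1:0",
        promptOverrideConfiguration={
            "promptConfigurations": [
                {
                    "promptType": "ORCHESTRATION",
                    "promptCreationMode": "OVERRIDDEN",
                    "promptState": "ENABLED",
                    "basePromptTemplate": CUSTOM_ORCHESTRATION_TEMPLATE,
                    "inferenceConfiguration": {"temperature": 0.0, "topP": 1.0, "maximumLength": 2048},
                }
            ]
        },
    )
    bedrock_agent.prepare_agent(agentId="AGENT123")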

Use Case Discussion

Every use case is different, but we can generalize a few types. One way of thinking about use cases is to ask: does the use case require a step-by-step workflow?

An example of such a workflow might be:

  1. Look up my medical record id
  2. Get a list of my current prescriptions
  3. Refill one of the prescriptions

In this example there are several distinct steps, each of which requires a specialist agent/tool to complete and which need to be completed in a specific order.

Another use case might be a town information retrieval system where a person could ask for building code information, town committee meeting minutes, or the hours for the town dump. In this use case there might be three knowledge bases, and, given appropriate KB descriptions, an Inline Agent could determine which KB to query.

A combined use case might be the above information retrieval scenario followed by a request to apply for a building permit. This case might involve MAO, with one of the agents being an Inline Agent and another agent having a Tool to interface with the building-permit application API.

Another combined use case might be a user asking what was the most expensive AWS service they were using followed by a request for recommendations to reduce that cost. This might be an MAO calling a tool to get customer specific pricing information followed by an InlineAgent knowledge base search for service specific recommendations.

At a certain level there will need to be business logic someplace in the application. That logic can live in raw Python code, in the descriptions of your agents, knowledge bases, and tools, and/or in your design of an Orchestration/Supervisor/Agent hierarchy. It could even live in the creation of a number of distinct applications or APIs, i.e. the concept of a library of functions that a user or programmer stitches together.

A corollary of this point is that just as an organization needs to have clean data before implementing GenAI, it also needs to determine what its business logic is or should be.

How Caylent Can Help

Do you want to evaluate Agentic AI use cases for your organization? Caylent's experts can help you navigate the complexities of AI implementation, from selecting the right models and deploying scalable architecture to building custom solutions with Amazon Bedrock and AWS's AI suite. Contact us today to explore how you can deploy AI systems that deliver real business value with innovative new capabilities while maintaining cost efficiency.

Appendix A Code Samples

Simple Invocation
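A minimal sketch of a direct agent invocation (the agent and alias IDs are placeholders for resources created beforehand):

    import uuid
    import boto3

    agent_runtime = boto3.client("bedrock-agent-runtime")

    response = agent_runtime.invoke_agent(
        agentId="AGENT123",        # placeholder
        agentAliasId="ALIAS123",   # placeholder
        sessionId=str(uuid.uuid4()),
        inputText="What are your support hours?",
    )

    # The response is an event stream; concatenate the completion chunks.
    completion = ""
    for event in response["completion"]:
        if "chunk" in event:
            completion += event["chunk"]["bytes"].decode("utf-8")
    print(completion)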

Invoke an Agent With a Knowledge Base
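A sketch of associating an existing knowledge base with an agent and then invoking it; all IDs are placeholders. The knowledge base description matters, since the agent uses it to decide when to query the KB:

    import uuid
    import boto3

    bedrock_agent = boto3.client("bedrock-agent")
    agent_runtime = boto3.client("bedrock-agent-runtime")

    # Associate an existing knowledge base with the DRAFT version of the agent.
    bedrock_agent.associate_agent_knowledge_base(
        agentId="AGENT123",        # placeholder
        agentVersion="DRAFT",
        knowledgeBaseId="KB12345678",  # placeholder
        description="Company HR policies; use for any benefits or policy question.",
    )
    bedrock_agent.prepare_agent(agentId="AGENT123")

    # Invocation is unchanged; the agent now decides when to query the KB.
    response = agent_runtime.invoke_agent(
        agentId="AGENT123",
        agentAliasId="ALIAS123",   # placeholder
        sessionId=str(uuid.uuid4()),
        inputText="How many vacation days do new hires get?",
    )
    print("".join(
        e["chunk"]["bytes"].decode("utf-8") for e in response["completion"] if "chunk" in e
    ))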

Invoking the Inline Agent API - with external definition of tools

There are two Python files. The first is the main program, which invokes the InlineAgent API. The second defines the action groups and tools supplied to the API.
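The following is a sketch written against the InvokeInlineAgent request shape; all IDs, ARNs, file names, and function names are placeholders.

inline_tools.py:

    # Hypothetical module defining the action groups and knowledge bases
    # passed to InvokeInlineAgent (IDs and ARNs are placeholders).

    ACTION_GROUPS = [
        {
            "actionGroupName": "ClaimManagementActionGroup",
            "actionGroupExecutor": {
                "lambda": "arn:aws:lambda:us-east-1:123456789012:function:claims-handler"
            },
            "functionSchema": {
                "functions": [
                    {
                        "name": "get_claim_status",
                        "description": "Look up the status of an insurance claim.",
                        "parameters": {
                            "claim_id": {
                                "type": "string",
                                "description": "The claim identifier",
                                "required": True,
                            }
                        },
                    }
                ]
            },
        }
    ]

    KNOWLEDGE_BASES = [
        {
            "knowledgeBaseId": "KB12345678",  # placeholder
            "description": "Claims documentation: coverage rules and filing procedures.",
        }
    ]

main.py:

    # Invokes the inline agent with the externally defined tools.
    import uuid
    import boto3

    from inline_tools import ACTION_GROUPS, KNOWLEDGE_BASES

    agent_runtime = boto3.client("bedrock-agent-runtime")

    response = agent_runtime.invoke_inline_agent(
        foundationModel="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder
        instruction="You are a claims assistant. Use the claim tools and documentation to answer.",
        sessionId=str(uuid.uuid4()),
        inputText="What is the status of claim 98765?",
        actionGroups=ACTION_GROUPS,
        knowledgeBases=KNOWLEDGE_BASES,
    )

    completion = ""
    for event in response["completion"]:
        if "chunk" in event:
            completion += event["chunk"]["bytes"].decode("utf-8")
    print(completion)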

Multi-Agent Orchestration

Agent Definition file
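A sketch of the definition file, assuming the multi-agent collaboration API shapes released in preview (the agentCollaboration setting and AssociateAgentCollaborator call); all IDs, names, and ARNs are placeholders:

    import boto3

    bedrock_agent = boto3.client("bedrock-agent")

    # Create the supervisor agent that coordinates the workflow.
    supervisor = bedrock_agent.create_agent(
        agentName="claims-supervisor",
        foundationModel="anthropic.claude-3-sonnet-20240229-v1:0",
        agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",  # placeholder
        instruction="Route each request to the right specialist and consolidate their answers.",
        agentCollaboration="SUPERVISOR",
    )
    supervisor_id = supervisor["agent"]["agentId"]

    # Each collaborator is an alias of an already-prepared specialist agent.
    bedrock_agent.associate_agent_collaborator(
        agentId=supervisor_id,
        agentVersion="DRAFT",
        agentDescriptor={"aliasArn": "arn:aws:bedrock:us-east-1:123456789012:agent-alias/SPEC1/ALIAS1"},
        collaboratorName="claims-specialist",
        collaborationInstruction="Handles claim lookups, filings, and status questions.",
        relayConversationHistory="TO_COLLABORATOR",
    )

    bedrock_agent.prepare_agent(agentId=supervisor_id)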

Main Program
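Invoking the supervisor is identical to invoking any other agent; delegation to collaborators happens behind the scenes. The supervisor agent and alias IDs below are placeholders:

    import uuid
    import boto3

    agent_runtime = boto3.client("bedrock-agent-runtime")

    response = agent_runtime.invoke_agent(
        agentId="SUPERVISOR1",     # placeholder supervisor agent ID
        agentAliasId="SUPALIAS1",  # placeholder alias ID
        sessionId=str(uuid.uuid4()),
        inputText="File a new claim for water damage and tell me what documents I need.",
    )
    print("".join(
        e["chunk"]["bytes"].decode("utf-8") for e in response["completion"] if "chunk" in e
    ))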

Orchestration Done Completely Manually

This code shows that one can perform agent orchestration completely manually, using simple Python, without any specialized agents or frameworks. While this code appears quite simple, it is also very fragile. Any change in the orchestration requires changes to the Python code; any additional steps or conditional logic require a rewrite. This is an example of easy demo-ware that likely does not scale.
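A deliberately naive sketch: each "agent" is just a direct model call, and the workflow order is hard-coded (the model ID is a placeholder):

    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")
    MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"  # placeholder

    def ask(system_text, user_text):
        """One 'agent' = one direct model call with a role-specific system prompt."""
        response = bedrock_runtime.converse(
            modelId=MODEL_ID,
            system=[{"text": system_text}],
            messages=[{"role": "user", "content": [{"text": user_text}]}],
        )
        return response["output"]["message"]["content"][0]["text"]

    # Step 1: "research agent" gathers facts.
    facts = ask("You are a researcher. List key facts only.",
                "Key facts about Amazon S3 storage classes")

    # Step 2: "writer agent" drafts from those facts.
    draft = ask("You are a technical writer.",
                f"Write one paragraph from these facts:\n{facts}")

    # Step 3: "reviewer agent" critiques the draft.
    review = ask("You are an editor. Point out errors.", draft)

    print(draft, "\n--- review ---\n", review)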

Implementing a Feedback Mechanism

Here is the main program:
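This is a sketch of one possible feedback loop; the agent IDs are placeholders, and the helper functions come from the hypothetical utilities file shown below:

    import uuid
    import boto3

    from feedback_utils import read_completion, store_feedback  # hypothetical helpers

    agent_runtime = boto3.client("bedrock-agent-runtime")
    session_id = str(uuid.uuid4())

    while True:
        question = input("You: ")
        if question.lower() in ("quit", "exit"):
            break
        response = agent_runtime.invoke_agent(
            agentId="AGENT123",        # placeholder
            agentAliasId="ALIAS123",   # placeholder
            sessionId=session_id,
            inputText=question,
        )
        answer = read_completion(response)
        print("Agent:", answer)
        rating = input("Helpful? (y/n): ")
        store_feedback(session_id, question, answer, rating == "y")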

Utility functions
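A sketch of the helpers used above: one flattens the agent's streaming response, one persists feedback for later evaluation and prompt tuning (the DynamoDB table name is a placeholder):

    # feedback_utils.py (hypothetical)
    import time
    import uuid
    import boto3

    dynamodb = boto3.resource("dynamodb")
    feedback_table = dynamodb.Table("agent-feedback")  # placeholder table name

    def read_completion(response):
        """Concatenate the chunks of an invoke_agent event stream."""
        return "".join(
            event["chunk"]["bytes"].decode("utf-8")
            for event in response["completion"]
            if "chunk" in event
        )

    def store_feedback(session_id, question, answer, helpful):
        """Record one rated interaction for later analysis."""
        feedback_table.put_item(
            Item={
                "id": str(uuid.uuid4()),
                "sessionId": session_id,
                "question": question,
                "answer": answer,
                "helpful": helpful,
                "timestamp": int(time.time()),
            }
        )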

Creating a Robust Guardrail
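A sketch of a guardrail combining several of the policy types described above (names, topics, and messages are illustrative):

    import boto3

    bedrock = boto3.client("bedrock")

    guardrail = bedrock.create_guardrail(
        name="banking-assistant-guardrail",
        description="Blocks investment advice and filters harmful content and PII.",
        topicPolicyConfig={
            "topicsConfig": [
                {
                    "name": "InvestmentAdvice",
                    "definition": "Guidance on buying, selling, or allocating financial assets.",
                    "examples": ["Which stocks should I buy?"],
                    "type": "DENY",
                }
            ]
        },
        contentPolicyConfig={
            "filtersConfig": [
                {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
                # Prompt-attack filtering applies to inputs only.
                {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
            ]
        },
        sensitiveInformationPolicyConfig={
            "piiEntitiesConfig": [
                {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
                {"type": "EMAIL", "action": "ANONYMIZE"},
            ]
        },
        blockedInputMessaging="Sorry, I can't help with that request.",
        blockedOutputsMessaging="Sorry, I can't provide that information.",
    )
    print(guardrail["guardrailId"], guardrail["version"])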

Knowledge Base Definition
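A sketch of creating a vector knowledge base backed by an existing OpenSearch Serverless collection, then attaching an S3 data source and ingesting it (all ARNs, names, and index fields are placeholders):

    import boto3

    bedrock_agent = boto3.client("bedrock-agent")

    kb = bedrock_agent.create_knowledge_base(
        name="town-documents",
        roleArn="arn:aws:iam::123456789012:role/BedrockKBRole",  # placeholder
        knowledgeBaseConfiguration={
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
            },
        },
        storageConfiguration={
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": "arn:aws:aoss:us-east-1:123456789012:collection/abc123",  # placeholder
                "vectorIndexName": "town-docs-index",
                "fieldMapping": {
                    "vectorField": "embedding",
                    "textField": "text",
                    "metadataField": "metadata",
                },
            },
        },
    )
    kb_id = kb["knowledgeBase"]["knowledgeBaseId"]

    # Attach an S3 bucket of documents as the data source, then start ingestion.
    ds = bedrock_agent.create_data_source(
        knowledgeBaseId=kb_id,
        name="town-docs-s3",
        dataSourceConfiguration={
            "type": "S3",
            "s3Configuration": {"bucketArn": "arn:aws:s3:::town-documents-bucket"},  # placeholder
        },
    )
    bedrock_agent.start_ingestion_job(
        knowledgeBaseId=kb_id,
        dataSourceId=ds["dataSource"]["dataSourceId"],
    )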

Action Group Definition
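A sketch of defining an Action Group from an OpenAPI schema, following the hotel-booking example discussed earlier (the agent ID, Lambda ARN, paths, and parameters are illustrative):

    import json
    import boto3

    bedrock_agent = boto3.client("bedrock-agent")

    # An OpenAPI schema describing one of the booking actions.
    openapi_schema = {
        "openapi": "3.0.0",
        "info": {"title": "Hotel Bookings", "version": "1.0.0"},
        "paths": {
            "/bookings": {
                "post": {
                    "operationId": "CreateBooking",
                    "description": "Create a hotel booking.",
                    "parameters": [
                        {"name": "check_in_date", "in": "query", "required": True,
                         "schema": {"type": "string"}},
                        {"name": "num_nights", "in": "query", "required": True,
                         "schema": {"type": "integer"}},
                    ],
                    "responses": {"200": {"description": "Booking created"}},
                }
            }
        },
    }

    bedrock_agent.create_agent_action_group(
        agentId="AGENT123",  # placeholder
        agentVersion="DRAFT",
        actionGroupName="BookingActions",
        actionGroupExecutor={
            "lambda": "arn:aws:lambda:us-east-1:123456789012:function:booking-handler"  # placeholder
        },
        apiSchema={"payload": json.dumps(openapi_schema)},
    )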

Brian Tarbox

Brian is an AWS Community Hero, Alexa Champion, runs the Boston AWS User Group, has ten US patents and a bunch of certifications. He's also part of the New Voices mentorship program where Heros teach traditionally underrepresented engineers how to give presentations. He is a private pilot, a rescue scuba diver and got his Masters in Cognitive Psychology working with bottlenosed dolphins.
