Building a Secure RAG Application with Amazon Bedrock AgentCore + Terraform

Generative AI & LLMOps

Learn how to build and deploy a secure, scalable RAG chatbot using Amazon Bedrock AgentCore Runtime, Terraform, and managed AWS services.

As AI rapidly moves from experimentation to production, teams face an increasing number of architectural decisions that directly shape long-term outcomes. While most decisions are reversible "two-way doors" open to iteration, your infrastructure choice is a one-way door that defines how far your application can scale, how securely it operates, and how reliably it serves users. Enter Amazon Bedrock AgentCore, AWS's new foundation for building and deploying intelligent agents at production scale.

Before diving in, it’s worth clarifying an important distinction: Amazon Bedrock AgentCore is not responsible for your agent’s orchestration or business logic. Instead, it provides the managed runtime infrastructure that executes that logic – handling scaling, isolation, networking, and security so you can focus on how your agent thinks and behaves.

In this hands-on guide, we'll deploy an AI agent to AgentCore Runtime to orchestrate a Retrieval-Augmented Generation (RAG) chatbot workflow with user authentication and streamed responses. The walkthrough illustrates core architectural concepts for building and operating AI agents on AWS, highlighting several important considerations:

  • Security: JWT-based authentication ensures only authorized users have access to your agent
  • Scalability: Amazon Bedrock AgentCore's serverless nature automatically handles traffic spikes
  • Cost Efficiency: Pay only for active compute time, not idle resources
  • Observability: Built-in logging and session management for debugging
  • Maintainability: Terraform enables consistent deployments across environments

Along the way, we'll also explore the trade-offs involved in architectural decisions for production-level chatbots.

Why Amazon Bedrock AgentCore?

Amazon Bedrock AgentCore is an agentic platform for building, deploying, and operating AI agents securely at scale across any framework, model, or protocol, with no infrastructure management required. Its modular services include Runtime, Memory, Gateway, Identity, Browser, and Observability, which you can use together or independently for your agent workloads.

At the heart of this platform is AgentCore Runtime, a secure, serverless execution environment purpose-built to host and scale AI agents and tools without requiring the provisioning or tuning of compute resources.

AgentCore Runtime’s serverless model lets you focus on your agent’s logic instead of infrastructure. There is no need to configure auto-scaling groups, monitor CPU or memory metrics, or reserve capacity in advance. The service automatically scales based on load, provides session isolation and extended runtimes, and abstracts away the undifferentiated heavy lifting of agent hosting.

You also pay only for the active resources you consume. Idle time spent waiting for large language model responses or external context retrieval is not counted toward the final cost. Compared with services that charge for pre-allocated resources, such as Amazon EC2 or Amazon ECS, this model can significantly reduce overall compute costs for agent-based applications.

Beyond AgentCore Runtime, Amazon Bedrock AgentCore offers several other services that can facilitate AI agent development and integrate with AgentCore Runtime, such as: 

  • AgentCore Memory: Offers both long-term and short-term memory for conversation and contextual history
  • AgentCore Gateway: A secure way to build, deploy, discover, and connect to tools at scale
  • AgentCore Identity: An identity and credential management service designed specifically for AI agents and automated workloads

If you are using Amazon Cognito User Pools to authenticate users, you can integrate JWT token authentication to secure your application. Learn more in the Amazon Bedrock AgentCore documentation.

Architecture Decisions for Production RAG

Knowledge Base & Vector Store

Production RAG systems must support reliable semantic search, frequent document updates, and secure access to knowledge sources, while keeping operational complexity low. The vector store should scale with growing data volumes without requiring ongoing infrastructure management.

For this implementation, we're using Amazon Bedrock Knowledge Base with Amazon S3 Vectors as the vector store. This combination offers several advantages:

  • Managed Service Benefits: Using a managed knowledge base reduces operational burden by handling document chunking, embedding generation, and retrieval APIs out of the box, while still allowing flexibility by supporting multiple vector store backends, such as Amazon OpenSearch, Amazon Aurora PostgreSQL, Pinecone, and others, as architectural requirements evolve.
  • Amazon S3 Vectors: A newer option that provides a simple, cost-effective, and performant vector store for small-to-medium RAG applications. For an in-depth analysis, check out our blog on Amazon S3 Vectors.
  • Amazon Bedrock Data Automation (BDA): Automatically handles multimodal document ingestion without manual parsing/chunking configurations. (Note: Currently not available in all regions - this tutorial uses us-east-1.)

Embedding Model

To perform the type of semantic search required for RAG, we first use a specialized model to compute vector embeddings for the documents we want to retrieve, and then store those embeddings in a vector database. These embeddings allow user queries to be compared based on meaning rather than exact keyword matches.

Amazon Titan Text Embeddings is AWS’s native embedding model family designed for high-quality semantic retrieval across a wide range of text workloads. For this implementation, we’re using Amazon Titan Text Embeddings V2 due to its improved retrieval performance, larger token input size of up to 8,192 tokens, and lower cost compared to alternatives such as Cohere’s embedding models.
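To make this concrete, here is a minimal sketch of computing a Titan Text Embeddings V2 vector directly, outside the managed Knowledge Base pipeline. The helper names are illustrative (not part of any SDK); the request fields follow the Bedrock `invoke_model` API, and the 1,024-dimension setting matches the vector index defined later in the Terraform configuration.

```python
import json

MODEL_ID = "amazon.titan-embed-text-v2:0"  # Titan Text Embeddings V2

def build_request(text: str, dimensions: int = 1024, normalize: bool = True) -> str:
    # Request body shape for invoking Titan Text Embeddings V2 via Bedrock
    return json.dumps({"inputText": text, "dimensions": dimensions, "normalize": normalize})

def embed_text(text: str, region: str = "us-east-1") -> list[float]:
    """Return the embedding vector for `text` (requires AWS credentials)."""
    import boto3  # pip install boto3
    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(modelId=MODEL_ID, body=build_request(text))
    return json.loads(response["body"].read())["embedding"]
```

The Knowledge Base performs this step automatically during ingestion and at query time; the sketch is only to show what the embedding call looks like under the hood.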

Chunking Strategy

Amazon Bedrock Knowledge Base offers multiple chunking strategies, including standard, hierarchical, semantic, and multimodal.

For RAG applications, semantic chunking is typically the best fit. Instead of splitting documents based on layout or fixed token counts, it groups content by meaning, which improves retrieval accuracy and helps ensure the model receives context that is actually relevant to the user’s question. This deeper semantic alignment is especially important for conversational workloads, and it’s why AWS recommends semantic chunking as the default approach for RAG-based chatbots.

Foundation Model

Anthropic Claude Haiku 4.5 is a lightweight, high-performance foundation model designed for low-latency, cost-efficient conversational and agentic workloads. It strikes a strong balance between speed, reasoning capability, and operational cost, making it well-suited for production chatbot deployments.

We're using Anthropic Claude Haiku 4.5 as the foundation model because it delivers near-frontier performance at a fraction of the cost of larger models like Sonnet. It performs particularly well for chatbots that require fast responses, reliable reasoning, and consistent use of agentic tools. To learn more, read our deep dive into Claude Haiku 4.5.

Future Considerations: As LLMs evolve, you may need to update your model choice to leverage improved reasoning, higher-quality training data, and better tool-use capabilities.

Hands-On Tutorial

Prerequisites

To deploy your AI agent, you'll need:

  • An AWS account with permissions for Amazon Bedrock, Amazon Bedrock AgentCore, and Amazon S3
  • AWS credentials loaded via aws configure sso or environment variables
  • Python v3.12+
  • Terraform v1.14.3+
  • Docker
  • Code editor (VS Code, Cursor, etc.)

Clone the repository: https://github.com/caylent/agentcore-blog

This tutorial creates resources in us-east-1 to leverage Bedrock Data Automation.

Agent Code Overview

The main entry point for the agent invocation is agent/app.py:

@app.entrypoint
def invoke_agent(payload):
    ...
    try:
        for chunk, metadata in agent.stream(
            initial_state,
            stream_mode="messages",
        ):
            if metadata.get("langgraph_node") == "generate_answer":
                yield from __process_stream_chunk(chunk)
    except Exception as exc:
        app.logger.error("Streaming agent response failed")
        yield {
            "type": "error",
            "text": "Something went wrong while streaming the response.",
            "error_details": str(exc),
        }
        return

The agent accepts a payload with user input and conversation history:

{
    "prompt": "Hello!",  
    "conversation_history": []
}

Agent Orchestration Logic

Agent orchestration logic defines how the agent reasons about user input, selects tools, and determines when to retrieve external knowledge. This logic is implemented entirely within the application code and is not managed by Amazon Bedrock AgentCore. AgentCore provides the execution environment for the agent, but the orchestration flow remains your responsibility.

In this implementation, the agent decides whether to call the knowledge base retriever tool or respond directly. This decision is guided by system prompts defined in agent/prompts.py:

  1. Direct Response: For greetings or simple confirmations, the agent generates an immediate answer.
  2. Tool Invocation: For knowledge-based queries, the retriever tool (agent/RetrieverTool.py) is invoked with an LLM-generated query. Retrieved documents are added as context for the final response.

The orchestration graph is defined in agent/RetrieverAgent.py:

class RetrieverAgent:
    ...
    def get_agent_graph(self):
        workflow = StateGraph(AgentState)

        workflow.add_node("generate_query", self.__generate_query)
        workflow.add_node("retrieve", ToolNode([knowledge_base_retriever]))
        workflow.add_node("generate_answer", self.__generate_answer)

        workflow.add_edge(START, "generate_query")
        workflow.add_conditional_edges(
            "generate_query",
            tools_condition,
            {
                "tools": "retrieve",
                END: "generate_answer",
            },
        )
        workflow.add_edge("retrieve", "generate_answer")
        workflow.add_edge("generate_answer", END)

        return workflow.compile()

Terraform Setup

Step 1: Initialize Configuration

cp infra/example.tfvars infra/terraform.tfvars

Fill out the values in terraform.tfvars:

# infra/terraform.tfvars 
 
region  = "us-east-1" 
profile = "" # AWS profile name from 'aws configure sso', leave empty for env variables 
tags = { 
  project = "agentcore-test" 
}
ecr_repository_name = "" # unique name for the ecr repository
...

Step 2: Initialize Terraform

cd infra
terraform init

This initializes:

  • AWS provider (v6.28.0+): Supports Amazon S3 Vectors and Amazon Bedrock AgentCore
  • AWSCC provider (v1.68.0+): Creates data sources with Amazon Bedrock Data Automation
# infra/providers.tf

terraform {
  required_version = ">= 1.14.3"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 6.28.0"
    }
    awscc = {
      source  = "hashicorp/awscc"
      version = ">= 1.68.0"
    }
  }
}

Knowledge Base Setup

The repository includes two sample documents at kb/. Upload these to an Amazon S3 bucket, then update terraform.tfvars:

# infra/terraform.tfvars
...
data_source_bucket_arn   = "arn:aws:s3:::<name-of-your-s3-bucket>"

Deployment

Now you are ready to deploy the infrastructure to AWS:

terraform plan # Verify configuration
terraform apply # Approve deployment

Key Infrastructure Components

Knowledge Base (bedrock_kb.tf)

Defines the knowledge base, data source, and Amazon S3 vector index:

...

resource "aws_s3vectors_vector_bucket" "vector_bucket" {
  vector_bucket_name = "agentcore-test-vector-bucket"
}

resource "aws_s3vectors_index" "vector_index" {
  index_name         = "agentcore-test-vector-index"
  vector_bucket_name = aws_s3vectors_vector_bucket.vector_bucket.vector_bucket_name

  data_type       = "float32"
  dimension       = 1024
  distance_metric = "euclidean"

  metadata_configuration {
    non_filterable_metadata_keys = [
      "AMAZON_BEDROCK_TEXT",
      "AMAZON_BEDROCK_METADATA"
    ]
  }
}

resource "aws_s3_bucket" "multimodal_output_bucket" {
  bucket = "agentcore-test-multimodal-output-bucket"
  force_destroy = true
}

resource "aws_bedrockagent_knowledge_base" "knowledge_base" {
  name        = "agentcore-test-knowledge-base"
  description = "Test knowledge base for AgentCore"
  role_arn    = aws_iam_role.bedrock_kb_role.arn
  knowledge_base_configuration {
    type = "VECTOR"
    vector_knowledge_base_configuration {
      embedding_model_arn = "arn:aws:bedrock:${var.region}::foundation-model/amazon.titan-embed-text-v2:0"
      embedding_model_configuration {
        bedrock_embedding_model_configuration {
          dimensions          = 1024
          embedding_data_type = "FLOAT32"
        }
      }
      supplemental_data_storage_configuration {
        storage_location {
          type = "S3"

          s3_location {
            uri = "s3://${aws_s3_bucket.multimodal_output_bucket.bucket}"
          }
        }
      }
    }
  }
  storage_configuration {
    type = "S3_VECTORS"
    s3_vectors_configuration {
      index_arn = aws_s3vectors_index.vector_index.index_arn

    }
  }
}

resource "awscc_bedrock_data_source" "s3_data_source" {
  knowledge_base_id = aws_bedrockagent_knowledge_base.knowledge_base.id
  name              = "agentcore-test-s3-data-source"
  description       = "Data source for the Amazon Bedrock Knowledge Base: agentcore-test-knowledge-base from S3 with semantic chunking"
  data_source_configuration = {
    s3_configuration = {
      bucket_arn = var.data_source_bucket_arn
    }
    type = "S3"
  }
  vector_ingestion_configuration = {
    chunking_configuration = {
      chunking_strategy = "SEMANTIC"
      semantic_chunking_configuration = {
        breakpoint_percentile_threshold = 95
        buffer_size                     = 0 # either 0 or 1
        max_tokens                      = 300
      }
    }
    parsing_configuration = {
      parsing_strategy = "BEDROCK_DATA_AUTOMATION"
      bedrock_data_automation_configuration = {
        parsing_modality = "MULTIMODAL"
      }
    }
  }
}

Note: The data source uses the awscc provider to support Amazon Bedrock Data Automation for multimodal parsing.

Cognito Authentication (cognito.tf)

Creates an Amazon Cognito user pool and client for Amazon Bedrock AgentCore Runtime authorization.

AgentCore Runtime (agentcore_runtime.tf)

First, create an Amazon ECR repository and push an initial image:

resource "aws_ecr_repository" "agentcore_runtime_agent_code_ecr_repository" {
  name         = "agentcore-test-runtime-agent-code-ecr-repository"
  force_delete = true
}

resource "null_resource" "push_initial_image" {
  depends_on = [aws_ecr_repository.agentcore_runtime_agent_code_ecr_repository]

  triggers = {
    repository_url = aws_ecr_repository.agentcore_runtime_agent_code_ecr_repository.repository_url
    region         = var.region
  }

  provisioner "local-exec" {
    command = <<-EOT
      # Check if image exists, if not push alpine:latest as placeholder 
      ...
    EOT
  }
}
...

Then define the AgentCore Runtime:

...

resource "aws_bedrockagentcore_agent_runtime" "agentcore_runtime" {
  agent_runtime_name = "agentcore_test_runtime"
  description        = "Agentcore runtime for the agentcore-test application"
  role_arn           = aws_iam_role.agentcore_runtime_role.arn
  protocol_configuration {
    server_protocol = "HTTP"
  }

  environment_variables = {
    BEDROCK_KNOWLEDGE_BASE_ID = aws_bedrockagent_knowledge_base.knowledge_base.id
  }

  authorizer_configuration {
    custom_jwt_authorizer {
      discovery_url   = "https://cognito-idp.${var.region}.amazonaws.com/${aws_cognito_user_pool.userpool.id}/.well-known/openid-configuration"
      allowed_clients = [aws_cognito_user_pool_client.userpool_client.id]
    }
  }

  agent_runtime_artifact {
    container_configuration {
      container_uri = "${aws_ecr_repository.agentcore_runtime_agent_code_ecr_repository.repository_url}:latest"
    }
  }

  network_configuration {
    network_mode = "PUBLIC"
  }

  depends_on = [null_resource.push_initial_image]
}

Key configuration points:

  • Knowledge base ID passed as an environment variable
  • Custom JWT authorizer with Amazon Cognito integration
  • Always uses latest Amazon ECR image tag

Post-Deployment Steps

Create Cognito User

1. Navigate to the created user pool in AWS Console

2. Go to Users under User Management

3. Create a user, choosing to send an email invitation, and enter the email address of the user to invite

4. User receives email with subject "Your temporary password" from “no-reply” containing username and temporary password

Sync Data Source

1. Navigate to the created knowledge base

2. Select the data source and Sync to ingest documents

Upload Agent Image

Run the provided script to upload agent code to ECR (scripts/upload-agent-to-ecr.sh):

cd ../ # go back to root of project if necessary
cp scripts/env.agent.template scripts/.env.agent # make sure ECR repository matches Terraform output
scripts/upload-agent-to-ecr.sh 

Testing the Agent

Update Temporary Password

aws cognito-idp initiate-auth \
  --auth-flow USER_PASSWORD_AUTH \
  --client-id <user-pool-client-id> \
  --auth-parameters USERNAME=<email>,PASSWORD=<tempPassword>

Response:

{
    "ChallengeName": "NEW_PASSWORD_REQUIRED",
    "Session": "AYABeMpHy...",
    "ChallengeParameters": {
        "USER_ID_FOR_SRP": "...",
        "requiredAttributes": "[]",
        "userAttributes": "{\"email_verified\":\"true\",\"email\":\"...\"}"
    }
}

Copy the Session value and update the password:

aws cognito-idp respond-to-auth-challenge \
  --region us-east-1 \
  --client-id <user-pool-client-id> \
  --challenge-name NEW_PASSWORD_REQUIRED \
  --session "<SESSION_FROM_PREVIOUS_CALL>" \
  --challenge-responses \
      USERNAME=<email>,NEW_PASSWORD=<newPassword>

Response includes AccessToken:

{
    "ChallengeParameters": {},
    "AuthenticationResult": {
        "AccessToken": "eyJra...",
        "ExpiresIn": 86400,
        "TokenType": "Bearer",
        "RefreshToken": "eyJjd...",
        "IdToken": "eyJra..."
    }
}

To reauthenticate, just run the initiate-auth shell command again with the new password. 

Invoke the Agent

Copy the AccessToken from the response; we'll use it to authenticate our requests to the AgentCore Runtime:

curl -X POST \
"https://bedrock-agentcore.us-east-1.amazonaws.com/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-east-1%3A<ACCOUNT_ID>%3Aruntime%2F<AGENTCORE_RUNTIME_ID>/invocations?qualifier=DEFAULT" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: session-5123123141231555555555555555555555214124215552" \
  -d '{"prompt": "Hello", "conversation_history": []}'

The runtime returns a streamed response:

data: {"type": "text", "text": "Hey"}
data: {"type": "text", "text": " there! "}
data: {"type": "text", "text": "👋 Welcome"}
data: {"type": "text", "text": "!"}
data: {"type": "text", "text": " How"}
data: {"type": "text", "text": " can I help you today?"}
data: {"type": "text", "text": " Feel"}
data: {"type": "text", "text": " free to ask me anything –"}
data: {"type": "text", "text": " I"}
data: {"type": "text", "text": "'m here to assist!"}
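The same invocation can be scripted with just the standard library. In this sketch, `parse_sse_text` is a small hypothetical helper that joins the `data:` chunks shown above into the full reply; the URL, token, and session ID are the same placeholders used in the curl call.

```python
import json

def parse_sse_text(raw: str) -> str:
    # Join the "text" chunks from a stream of `data: {...}` lines
    parts = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            event = json.loads(line[len("data:"):].strip())
            if event.get("type") == "text":
                parts.append(event["text"])
    return "".join(parts)

def invoke_runtime(url: str, token: str, session_id: str, prompt: str) -> str:
    """POST a prompt to the AgentCore Runtime endpoint and return the joined reply."""
    import urllib.request  # stdlib; swap in requests/httpx if preferred
    req = urllib.request.Request(
        url,
        data=json.dumps({"prompt": prompt, "conversation_history": []}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": session_id,
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return parse_sse_text(resp.read().decode())
```

Note that this buffers the whole response before parsing; a production client would read the stream incrementally to preserve the streaming user experience.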

Now let’s ask a question about content in the knowledge base: Drylab News and its next AGM.

curl -X POST \
"https://bedrock-agentcore.us-east-1.amazonaws.com/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-east-1%3A<ACCOUNT_ID>%3Aruntime%2F<AGENTCORE_RUNTIME_ID>/invocations?qualifier=DEFAULT" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: session-5123123141231555555555555555555555214124215552" \
  -d '{"prompt": "When is the next AGM for Drylab news?",  "conversation_history": []}'

Response:

data: {"type": "text", "text": "According"}
data: {"type": "text", "text": " to the"}
data: {"type": "text", "text": " Drylab News"}
data: {"type": "text", "text": " newsletter"}
data: {"type": "text", "text": " from"}
data: {"type": "text", "text": " May"}
data: {"type": "text", "text": " 2017"}
data: {"type": "text", "text": ", the next Annual"}
data: {"type": "text", "text": " General Meeting (AGM) was"}
data: {"type": "text", "text": " scheduled for **June"}
data: {"type": "text", "text": " 16"}
data: {"type": "text", "text": "th"}
data: {"type": "text", "text": " at"}
data: {"type": "text", "text": " 15"}
data: {"type": "text", "text": ":00**"}
data: {"type": "text", "text": " (3:00 PM)."}
data: {"type": "text", "text": " An"}
data: {"type": "text", "text": " invitation"}
data: {"type": "text", "text": " was to"}
data: {"type": "text", "text": " be distribute"}
data: {"type": "text", "text": "d to all owners"}
data: {"type": "text", "text": " in"}
data: {"type": "text", "text": " advance"}
data: {"type": "text", "text": "."}

The agent successfully retrieves and synthesizes information from the knowledge base!

Let’s also test it on the image file added to the data source:

curl -X POST \
"https://bedrock-agentcore.us-east-1.amazonaws.com/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-east-1%3A<ACCOUNT_ID>%3Aruntime%2F<AGENTCORE_RUNTIME_ID>/invocations?qualifier=DEFAULT" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: session-5123123141231555555555555555555555214124215552" \
  -d '{"prompt": "In one sentence tell me about the method of the placebo effect experiment", "conversation_history": []}'

Response:

data: {"type": "text", "text": "The experiment"}
data: {"type": "text", "text": " teste"}
data: {"type": "text", "text": "d the placebo effect by having"}
data: {"type": "text", "text": " Par"}
data: {"type": "text", "text": "kinson's Disease"}
data: {"type": "text", "text": " patients receive treatments"}
data: {"type": "text", "text": " describe"}
data: {"type": "text", "text": "d as co"}
data: {"type": "text", "text": "sting $100"}
data: {"type": "text", "text": " an"}
data: {"type": "text", "text": "d then"}
data: {"type": "text", "text": " $"}
data: {"type": "text", "text": "1"}
data: {"type": "text", "text": "500"}
data: {"type": "text", "text": ","}
data: {"type": "text", "text": " measuring"}
data: {"type": "text", "text": " changes in their motor"}
data: {"type": "text", "text": " function"}
data: {"type": "text", "text": " after"}
data: {"type": "text", "text": " each"}
data: {"type": "text", "text": " administration"}

The agent successfully extracts and summarizes information from image-based documents, demonstrating Amazon Bedrock Data Automation's multimodal capabilities.

And that’s it! You now have a fully functional RAG application with streaming capabilities and secured with Amazon Cognito. 

Cleanup

To destroy all resources created during this hands-on tutorial:

cd infra
terraform destroy # confirm with 'yes'

Troubleshooting

When deploying the application for the first time, you may encounter a few common issues related to AWS configuration, permissions, or service access. The following troubleshooting tips address the most common issues encountered during setup.

  • Resource Naming Conflicts: Some AWS resource names must be globally or regionally unique. If Terraform reports a naming conflict, update the resource name in your configuration.
  • Insufficient Permissions: Ensure your AWS credentials have permissions for Amazon Bedrock, Amazon Bedrock AgentCore, Amazon S3, Amazon Cognito, Amazon ECR, and AWS IAM.
  • Unable to Invoke LLM: First-time users of Claude models must request access through the Amazon Bedrock console's model catalog.

Addressing these issues typically resolves the majority of deployment errors and allows the agent to start successfully.

Conclusion

You've successfully built and deployed a RAG application using Amazon Bedrock AgentCore Runtime. This implementation demonstrates several key capabilities commonly used in real-world agent architectures:

  • Serverless scalability of agent execution through AgentCore Runtime with pay-per-use pricing
  • Secure authentication via Amazon Cognito JWT tokens
  • Streaming responses for improved user experience
  • Multimodal document understanding with Bedrock Data Automation
  • Infrastructure as Code for repeatable, version-controlled deployments

To take this further, consider extending the implementation with:

  1. Conversation Memory: Integrate AgentCore Memory for persistent conversation history
  2. Advanced Tools: Add custom tools via Amazon Bedrock AgentCore Gateway for database queries, API calls, or business logic
  3. Monitoring: Implement Amazon CloudWatch dashboards for usage metrics and performance tracking
  4. Multi-Environment Deployment: Use Terraform workspaces for dev/staging/prod environments

Amazon Bedrock AgentCore provides a flexible foundation for building and scaling AI agents. By combining serverless execution, managed AI services, and infrastructure-as-code practices, this approach provides a practical reference for teams looking to move beyond simple prototypes as requirements evolve.

How Caylent Can Help

Building a production-ready agentic application requires more than standing up infrastructure; it demands the right architectural decisions, security posture, and cost controls from day one. Caylent brings deep AWS expertise and hands-on experience with Amazon Bedrock AgentCore to help organizations design, build, and scale secure AI applications with confidence. From RAG architectures and multimodal knowledge bases to authentication, observability, and infrastructure as code, our teams partner with you to move beyond POCs and deliver AI systems that are ready for real users, real traffic, and real business impact. Get in touch today to get started. 

Kevin Nha

Kevin is a Cloud Software Architect in the Cloud Native Applications practice at Caylent. He has built many solutions using TypeScript, Python, and Java, and has in-depth experience with building serverless applications on AWS. Having previously worked at Amazon, Kevin has an in-depth understanding of AWS technologies and closely works within the Leadership Principles. He enjoys building and rebuilding applications in the AWS ecosystem and helping clients build cloud-native applications.

