Creating AI Agents in Minutes with AgentCore harness

Generative AI & LLMOps

Explore how Amazon Bedrock AgentCore Harness enables teams to build, deploy, and scale production-ready AI agents in minutes by turning complex infrastructure and orchestration into simple configuration.

Since its very beginnings with Amazon SQS and Amazon S3, Amazon Web Services has worked to take on the undifferentiated heavy lifting of building, deploying, and running software. With the advent of AI agents, Amazon Bedrock and, later, AgentCore have carried that trend into agentic AI. On June 18th, AWS launched the next step to make AI agents easier to build, deploy, and run: Amazon Bedrock AgentCore harness.

The promise is to go from idea to a production-grade agent in minutes. Define an agent with one API call, invoke it with another, and skip the orchestration code, container build, memory wiring, tool adapters, and observability setup that normally separate a laptop prototype from something that can serve multiple users.

Caylent worked with AgentCore harness during its private beta in March 2026, before the April public preview, and our assessment has been consistently positive. Our initial questions were: 

  1. Does "production-grade in minutes" mean an agent ready for enterprise production, or production-grade infrastructure with the agent still to be built? 
  2. Is the agent you stand up that quickly a throwaway demo, or something a team can carry forward?

From our tests, we've found that AgentCore harness is very good at building the infrastructure for agents, which tends to be the tedious part. Moreover, for the right class of agents, the result is not disposable; it can reliably run in production. What AgentCore harness does not remove is the engineering judgment that turns a working agent into one an enterprise can run, or the business knowledge required to create an agent that's useful and valuable. This article explores the line between the scaffolding that AgentCore harness handles for you and the judgment it leaves in your hands.

What is Amazon Bedrock AgentCore harness?

An AI agent runs tools in a loop to reach a goal. Most of the work of building one surrounds that loop: choosing a framework, provisioning sandboxed compute, wiring tools, deciding where memory lives, configuring secrets and networking, and adding observability. Teams then repeat most of this work whenever the use case changes. A prototype on a laptop is the easy part; serving real traffic adds concurrency, isolation, identity, state, and scaling all at once.
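
To make the loop concrete, here is a minimal sketch in plain Python. It is not how AgentCore harness or Strands implement it: the scripted_model function is a hypothetical stand-in for an LLM call, and the tool, names, and step limit are assumptions chosen for illustration.

```python
from typing import Callable

# A "model" here is any function that reads the history and returns either
# ("tool", tool_name, argument) or ("final", answer). It stands in for an LLM.
ModelStep = Callable[[list], tuple]


def run_agent(model_step: ModelStep, tools: dict, goal: str, max_steps: int = 5) -> str:
    """Run tools in a loop until the model gives a final answer or steps run out."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        decision = model_step(history)
        if decision[0] == "final":
            return decision[1]
        _, tool_name, argument = decision
        result = tools[tool_name](argument)  # execute the requested tool
        history.append(f"{tool_name}({argument}) -> {result}")
    return "stopped: step limit reached"


def scripted_model(history: list) -> tuple:
    """Hypothetical stand-in for an LLM: call one tool, then report its result."""
    if len(history) == 1:
        return ("tool", "word_count", "the quick brown fox")
    return ("final", history[-1].split(" -> ")[1])


tools = {"word_count": lambda text: str(len(text.split()))}
answer = run_agent(scripted_model, tools, "count the words")
```

Everything outside run_agent (where it runs, who may call it, what it remembers, how it is traced) is the surrounding work described above.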

AgentCore harness is a feature of AgentCore that turns that work into configuration. AWS had already shipped the underlying pieces (Runtime, Memory, Gateway, Browser, Code Interpreter, Identity, and Observability) as separate primitives. AgentCore harness wraps them behind two API calls: CreateHarness to define the agent and InvokeHarness to run it. You declare the model, system prompt, tools, skills, memory, and limits, and AgentCore runs the loop, powered by the Strands Agents framework. Adding a tool or changing the model is an edit to that configuration, not a redeploy.
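
The idea of an agent as configuration can be sketched with a plain data structure. The HarnessConfig class below is hypothetical: its field names loosely mirror the list above and are not the real CreateHarness request schema, and the model and tool identifiers are placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HarnessConfig:
    """Hypothetical mirror of the declared fields; not the real API schema."""
    model: str
    system_prompt: str
    tools: tuple = ()
    skills: tuple = ()
    memory_enabled: bool = True
    max_steps: int = 20


base = HarnessConfig(
    model="example-model-a",  # placeholder model identifier
    system_prompt="You are a research assistant.",
    tools=("web_search",),
)

# Adding a tool or changing the model yields a new configuration value;
# no agent code is rewritten and the original value is left untouched.
with_code_tool = replace(base, tools=base.tools + ("code_interpreter",))
with_new_model = replace(with_code_tool, model="example-model-b")
```

In this sketch the difference between two agents is a difference between two values, which is what makes the change cheap to review and reverse.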

Underneath, the agent runs inside a contained sandbox on AgentCore runtime, which is what lets it read files and execute code without anyone having to build one. Each session runs in its own Lambda MicroVM with no shared state or filesystem. The agent gets a shell and file operations by default, and Python and Node execution are available in that sandbox. Every step streams back in real time and is automatically traced to Amazon CloudWatch.

What's New in the AgentCore harness General Availability Release

The preview of AgentCore harness let you stand up an agent from configuration. General availability (GA) completes the picture by adding the capabilities required for long-term operation: keeping context across sessions, staying observable, reversing a bad change, and improving on real traffic.

Memory is on by default at GA: AgentCore harness provisions a managed, customer-owned memory resource with semantic and summarization strategies and per-tenant isolation keyed on the actor ID, so an agent recognizes a returning user and resumes a prior conversation without anyone having to replay message history. The memory stays an addressable resource you can query, attach elsewhere, or delete.
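
Isolation keyed on the actor ID can be pictured as one partition per actor. The class below is a toy in-process sketch under that assumption; the managed memory resource, its semantic and summarization strategies, and its API are not modeled here, and every name is hypothetical.

```python
from collections import defaultdict


class ActorScopedMemory:
    """Toy store with one partition per actor ID."""

    def __init__(self) -> None:
        self._events = defaultdict(list)

    def remember(self, actor_id: str, text: str) -> None:
        self._events[actor_id].append(text)

    def recall(self, actor_id: str) -> list:
        # Only the caller's own partition is read; a copy is returned.
        return list(self._events.get(actor_id, []))

    def forget(self, actor_id: str) -> None:
        # Deleting an actor's memory removes only that partition.
        self._events.pop(actor_id, None)


memory = ActorScopedMemory()
memory.remember("user-alice", "prefers weekly summaries")
memory.remember("user-bob", "works in the finance team")
alice_context = memory.recall("user-alice")
```

A returning user is recognized because recall is keyed on the same actor ID across sessions, and one tenant's entries never appear in another tenant's recall.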

Skills are now enabled through a single toggle, giving agents access to an AWS-curated catalog of files and instructions that are loaded only when needed. These skills cover areas such as SDK usage, infrastructure as code, IAM, CloudWatch, and common service workflows.
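
Loading a skill only when it is needed is a form of lazy loading. The sketch below illustrates the pattern with a hypothetical SkillCatalog class; the skill names and contents are invented, and the actual catalog format is not described in this article.

```python
from typing import Callable


class SkillCatalog:
    """Keeps skill names available up front and loads bodies on first use."""

    def __init__(self, loaders: dict) -> None:
        self._loaders = loaders   # name -> zero-argument function returning text
        self._loaded = {}         # name -> text, filled in lazily

    def names(self) -> list:
        return sorted(self._loaders)

    def use(self, name: str) -> str:
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()  # load once, on demand
        return self._loaded[name]

    def loaded_names(self) -> list:
        return sorted(self._loaded)


catalog = SkillCatalog({
    "iam-basics": lambda: "Instructions for writing IAM policies...",
    "cloudwatch-queries": lambda: "Instructions for querying logs...",
})
before = catalog.loaded_names()
text = catalog.use("iam-basics")
after = catalog.loaded_names()
```

The benefit of the pattern is that unused skills cost nothing in the agent's context.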

Observability is now consolidated: a single view per harness and a Harnesses tab in CloudWatch GenAI Observability let you drill from a harness into a session, then into a single trace, and see what ran, in what order, and where it failed.

Versioning gives changes a safety net: every update creates an immutable version of the full configuration, rollback is repointing an endpoint at an earlier version, and named endpoints such as PROD or STAGING stay pinned until you promote a new one, so a change can be rolled out and reversed without redeploying code.
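
The versioning model can be sketched as an append-only list of configurations plus a map of named endpoints. This is an illustration of the concept only; the HarnessVersions class and its methods are hypothetical and do not correspond to the service's API.

```python
from types import MappingProxyType


class HarnessVersions:
    """Append-only versions with named endpoints pinned to version numbers."""

    def __init__(self) -> None:
        self._versions = []   # index 0 holds version 1
        self._endpoints = {}  # endpoint name -> version number

    def update(self, config: dict) -> int:
        # Store a read-only copy so a published version cannot be edited.
        self._versions.append(MappingProxyType(dict(config)))
        return len(self._versions)

    def point(self, endpoint: str, version: int) -> None:
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        self._endpoints[endpoint] = version

    def resolve(self, endpoint: str):
        return self._versions[self._endpoints[endpoint] - 1]


registry = HarnessVersions()
v1 = registry.update({"model": "example-model-a"})
registry.point("PROD", v1)
v2 = registry.update({"model": "example-model-b"})
pinned = registry.resolve("PROD")["model"]       # still version 1
registry.point("PROD", v2)                       # promote
promoted = registry.resolve("PROD")["model"]
registry.point("PROD", v1)                       # roll back by repointing
rolled_back = registry.resolve("PROD")["model"]
```

Rollback here is a single pointer change, which is why it needs no redeploy.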

In addition, AgentCore evaluations scores traces with built-in and custom evaluators. AgentCore optimization reads those scores to propose prompt and tool-description changes, then validates them by routing live traffic between variants. Measuring an agent on real traffic and acting on the result is therefore part of the service rather than something each team assembles.
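
Routing live traffic between variants is commonly done by hashing a stable key into a bucket. The sketch below shows that general technique; how AgentCore optimization actually splits traffic is not described in this article, so the function, the variant names, and the use of a session ID as the key are all assumptions.

```python
import hashlib


def choose_variant(session_id: str, candidate_share: float) -> str:
    """Deterministically send a fixed share of sessions to the candidate."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "candidate" if bucket < candidate_share else "baseline"


# The same session always lands on the same variant.
first = choose_variant("session-123", 0.1)
second = choose_variant("session-123", 0.1)

# Over many sessions, roughly the requested share reaches the candidate.
sample = [choose_variant(f"session-{i}", 0.1) for i in range(10_000)]
observed_share = sample.count("candidate") / len(sample)
```

Keeping assignment deterministic per session avoids a conversation switching variants halfway through, which would make its scores hard to attribute.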

GA also made an AgentCore harness invocation a first-class step in AWS Step Functions, added web search through AgentCore gateway, and added the ability to export an agent created with AgentCore harness to code.

Where AgentCore harness Helps

AgentCore harness pays off most wherever the bottleneck is infrastructure rather than agent design, which describes much of early-stage work: proofs of concept, proofs of value, and the demos teams use to show what an agent can do and to win funding for a project. When we tested it in private beta, the most consistent benefit was time. A small AWS-literate team, or an infrastructure-oriented builder without deep agent-framework experience, can quickly get a working agent up and running when orchestration, tool wiring, memory, runtime isolation, and observability are already handled. In our tests, a project that would otherwise take two to three weeks could be completed in a couple of days with AgentCore harness.

Model flexibility extends that speed into experimentation. You can switch model providers mid-session without losing context by setting a default model at creation, then overriding it on a single invocation, so an agent can plan with one model, write code with another, and summarize with a third while the conversation continues across the switch. For a team comparing price and performance, or moving off a model that just shipped a regression, the fact that this is an edit rather than a rebuild greatly simplifies A/B testing and exploration.
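
The default-plus-override behavior can be sketched as a session that keeps one history while the model is chosen per call. The HarnessSession class and the model names are hypothetical, and no real model is invoked.

```python
from typing import Optional


class HarnessSession:
    """Toy session: one shared history, model chosen per invocation."""

    def __init__(self, default_model: str) -> None:
        self.default_model = default_model
        self.history = []  # list of (model used, prompt)

    def invoke(self, prompt: str, model: Optional[str] = None) -> str:
        # An override applies to this call only; the default is unchanged.
        used = model if model is not None else self.default_model
        self.history.append((used, prompt))
        return used


session = HarnessSession(default_model="planner-model")
session.invoke("Plan the migration")
session.invoke("Write the migration script", model="coding-model")
session.invoke("Summarize what we did")
models_used = [model for model, _ in session.history]
```

All three turns share the same history, so the model that handles the third prompt can see what the other two produced.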

The same configuration-over-code property lowers the cost of iteration. Swapping a tool, replacing a skill, or rewording instructions is an edit in a config file, not a redeployment. The same harness can support a wide range of agents, from research and writing assistants to data analytics and coding agents, through configuration changes alone. At the validation stage, the cost of trying another idea is low enough to try several, which is a significant advantage.

If your use case doesn't need to support more than one framework and doesn't need anything outside the agent-loop pattern, you are ready to go to production using AgentCore harness. AgentCore harness uses sensible, well-tested defaults and is flexible enough to let you build the agent you want from the beginning, and powerful enough to support that build in production.

If you eventually need to move beyond AgentCore harness's agent-loop abstractions, a single command can export your agent as Strands-based code that runs on AgentCore runtime or other environments, with support for exporting to the Claude Agent SDK coming soon. The exported project preserves the model, system prompt, tools, memory wiring, skills, and container environment, and continues to run on the same compute, observability, and identity primitives. It is a translation of the configuration into code. This path is useful when you need custom multi-agent orchestration, graph- or workflow-style control, execution hooks, or bidirectional streaming.

What AgentCore harness Does and Doesn't Do

AgentCore harness handles the scaffolding, but you still need the engineering. IAM permissions are the first example. The AgentCore CLI creates a role when it scaffolds a project, and AWS's sample policy grants broad, wildcard access, with explicit guidance on how to scope it down to the specific resources a production workload needs. AgentCore harness does a good job of providing a convenient default, but your team will need to tighten it for a production deployment. Remember, an agent with a shell, code execution, and tool access is only as contained as the role it assumes, so an over-permissive role on an agent that acts on its own is a serious risk.
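
One way to start tightening a role is to list the wildcards in its policy document. The function below is a coarse sketch of such a check: it only inspects Allow statements for a literal asterisk, and the two policies are illustrative examples written for this article, not AWS's sample policy. The action name and ARN in the scoped policy are placeholders.

```python
def wildcard_findings(policy: dict) -> list:
    """Return a note for every Allow statement entry that contains '*'."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for key in ("Action", "Resource"):
            values = statement.get(key, [])
            if isinstance(values, str):
                values = [values]
            for value in values:
                if "*" in value:
                    findings.append(f"Statement {index}: {key} '{value}'")
    return findings


broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "bedrock-agentcore:*", "Resource": "*"}],
}
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["bedrock-agentcore:InvokeAgentRuntime"],  # placeholder action
        "Resource": "arn:aws:bedrock-agentcore:us-east-1:111122223333:runtime/example-agent",
    }],
}
broad_findings = wildcard_findings(broad_policy)
scoped_findings = wildcard_findings(scoped_policy)
```

A check like this flags candidates for review; deciding which resources the agent really needs is still a judgment call.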

Tool governance is another point where AgentCore harness provides a sensible default that you need to tighten. Connecting a tool is trivial: you point at an AgentCore gateway, and every target it exposes is available with authentication and per-tool authorization handled. Gateway can gate every call with Cedar-based policies that define who can call which tool, under what conditions, and with which arguments, which gives you a stronger security posture. Alternatively, you can connect to a remote MCP server via a URL, without the security benefit of Gateway. AgentCore harness supports both options. Teams should use a direct MCP connection when there are few tools and the use cases don't require different sets of permissions, and AgentCore gateway when it's critical to define which user, agent, or use case can access which tools, and with what arguments and conditions.
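
The kind of decision such a policy encodes (who, which tool, which arguments) can be sketched in plain Python. This is not Cedar and not Gateway's evaluation engine; the Rule class, principals, tool names, and condition are hypothetical and only illustrate a default-deny check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """Permit one principal to call one tool when the condition holds."""
    principal: str
    tool: str
    condition: Callable = lambda args: True


def is_allowed(rules: list, principal: str, tool: str, args: dict) -> bool:
    # Default deny: a call is allowed only if some rule explicitly matches.
    return any(
        rule.principal == principal and rule.tool == tool and rule.condition(args)
        for rule in rules
    )


rules = [
    Rule("analyst", "run_query", lambda args: args.get("database") == "reporting"),
    Rule("analyst", "web_search"),
]

reporting_ok = is_allowed(rules, "analyst", "run_query", {"database": "reporting"})
payroll_ok = is_allowed(rules, "analyst", "run_query", {"database": "payroll"})
intern_ok = is_allowed(rules, "intern", "web_search", {"query": "aws pricing"})
```

With a direct MCP connection there is no equivalent checkpoint, so every connected tool is reachable by every caller of the agent.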

Designing for more than one user runs into a specific identity limit in AgentCore harness. When callers authenticate with AWS IAM through SigV4, AgentCore harness does not carry per-user identity into downstream tool calls. Per-user credential scoping, such as user-scoped tokens and on-behalf-of exchange through the Identity token vault, works only when callers authenticate with a bearer JWT through the inbound OAuth path. SigV4 support for per-user identity is planned for a future release, but for now, we recommend that multi-tenant agents either use AgentCore harness with bearer JWT authentication or skip AgentCore harness and use SigV4.

Evaluation and monitoring are available through AgentCore optimization when using AgentCore harness, but they're not automatic. The service can score traces and propose improvements (which is a big win for such an easy setup), but it's still up to you to define what "good" means for a given agent: which quality dimensions to score, the thresholds, the sampling strategy, and the cases that must never regress. AgentCore harness gives you the improvement loop; you need to bring the judgment that makes the loop produce actual improvements.
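
Defining "good" usually ends up as explicit thresholds plus a list of cases that must keep passing. The sketch below shows one possible shape for such a gate; the dimension names, threshold values, and case names are hypothetical, and the scores would come from whatever evaluators you configure.

```python
def passes_gate(scores: dict, thresholds: dict, must_pass_cases: dict) -> tuple:
    """Return (ok, failures) for averaged scores and never-regress cases."""
    failures = []
    for dimension, minimum in thresholds.items():
        score = scores.get(dimension, 0.0)  # a missing score counts as 0
        if score < minimum:
            failures.append(f"{dimension}: {score:.2f} < {minimum:.2f}")
    for case_name, passed in must_pass_cases.items():
        if not passed:
            failures.append(f"regressed: {case_name}")
    return (not failures, failures)


thresholds = {"correctness": 0.90, "helpfulness": 0.80}
candidate_scores = {"correctness": 0.93, "helpfulness": 0.76}
critical_cases = {"refuses_to_delete_prod_data": True, "cites_sources": True}

ok, failures = passes_gate(candidate_scores, thresholds, critical_cases)
```

The service can supply the scores; choosing the dimensions, the minimums, and the critical cases is the part that stays with the team.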

Conclusion

AgentCore harness is good at the genuinely tedious part of building agents. It turns the infrastructure, orchestration, tool wiring, memory, runtime isolation, and observability into configuration, so that an AWS-literate team or an infrastructure-minded builder can quickly get a working agent running. For proofs of concept, proofs of value, or the first iteration of almost any agent, AgentCore harness provides a significant advantage in speed and cost, which matches what we observed building with AgentCore harness when we tested it in private beta.

An important detail about AgentCore harness is that it does not add a separate layer of pricing on top of the underlying services. You pay for the underlying capabilities on a consumption basis, with the primary cost driver typically being AgentCore runtime usage. AgentCore runtime is billed per second of actual CPU and memory consumption at $0.0895 per vCPU-hour and $0.00945 per GB-hour. Idle time and IO wait are not billed, which helps ensure you are only paying for active compute.

The main reason to avoid AgentCore harness upfront is when you already know you will need capabilities it does not support, such as propagating per-user identity downstream with SigV4, as discussed earlier. If that is not clear at the outset, it is generally safe to start with AgentCore harness and later export the agent to Strands when you need more control or flexibility.

What your team still needs to bring to the table is engineering judgment. AWS permissions will function out of the box, but still need to be tightened to the principle of least privilege. Tool setup is simple, but you are responsible for defining the right security boundaries. Evaluation and monitoring are available, but you still need to decide what to optimize for and how to interpret improvement over time.

How Caylent Can Help

While Amazon Bedrock AgentCore harness accelerates agent development by abstracting infrastructure and operational complexity, production success still depends on strong engineering discipline around security, governance, evaluation, and alignment to business outcomes. Caylent helps organizations bridge this gap by designing and building production-ready agentic systems on AWS, including AgentCore-based architectures. We guide teams in making the right architectural decisions, whether to continue with AgentCore harness, extend with Strands, or adopt custom orchestration, so agents are secure, scalable, and built for long-term success. Reach out to us today to get started. 

Guille Ojeda

Guille Ojeda is a Principal Innovation Architect at Caylent, a speaker, author, and content creator. He has published 2 books, over 200 blog articles, and writes a free newsletter called Simple AWS with more than 45,000 subscribers. He's spoken at multiple AWS Summits and other events, and was recognized as AWS Builder of the Year in 2025.
