Amazon ElastiCache for Valkey Now Supports Durability

Serverless & Containers

Explore how Amazon ElastiCache for Valkey's new durability feature enables organizations to use it as a persistent data store for workloads such as AI agent memory, RAG knowledge bases, and real-time applications, while examining the tradeoffs.

As of June 2026, Amazon ElastiCache for Valkey can store data durably. A cluster persists writes across Availability Zones through a Multi-AZ transactional log, so it can recover committed data after a node or shard failure. That lets it serve as the store of record for workloads that cannot tolerate loss, rather than only as a cache in front of a separate, durable store.

Synchronous writes mode guarantees zero loss of acknowledged writes. Asynchronous writes mode keeps cache-level speed but leaves a bounded window of recent writes at risk of being lost. You set the mode at cluster creation, depending on what a workload can afford to lose, how fast its writes must be, and how it should behave when the system is under pressure.

In this post, we examine how this new ElastiCache for Valkey feature changes your cloud architecture, what it costs, and which architecture decisions it enables.

What Durability Changes About ElastiCache's Role

Amazon ElastiCache was designed as a cache: fast and disposable. Data lives in memory, a replica and automatic failover handle availability, and anything lost can be refetched from the system of record that sits behind it. That works fine so long as there's a separate, durable data store. Any data in ElastiCache can be lost at any time, and your system should be designed under that assumption.

The main drawback of using ElastiCache, or any cache really, is that you're paying thrice: once for the durable data store, once again for ElastiCache, and once yet again for the added complexity of managing cache writes, cache misses, and the possibility of a complete cluster failure overloading your durable data store. At a glance, durability gives you the best of both worlds: a data store that's as durable as a regular database, with ElastiCache for Valkey's lookup times, and without the added complexity. Writes pass through a Multi-AZ transactional log that survives node and shard failures, so even after a complete cluster failure, the cluster can recover committed data rather than lose it.

This widens what ElastiCache can do, though it doesn't turn it into a general-purpose database. It is still an in-memory engine with the same data structures and microsecond reads; durability is a per-cluster property you turn on for data you need to store durably. Examples are knowledge bases for RAG, long-term and working memory for AI agents, payment tokenization, real-time inventory, session stores, and leaderboards. This is data that is unacceptable or expensive to lose, a use case a disposable cache like ElastiCache was never meant to handle, until now.

Synchronous and Asynchronous Writes

What separates the two modes is a single timing decision: whether the client hears "success" before or after the write is persisted. That ordering sets the durability guarantee, the write latency, and how the cluster behaves under load.

Synchronous writes persist first and are acknowledged after. The primary writes the change to the transactional log across at least two Availability Zones (AZs) and only then acknowledges the client. Every acknowledged write is durable: if the primary node fails an instant later, the write survives in the transactional log, and reads from the primary node still reflect it, including after a failover to a new primary node. The latency cost is a cross-AZ round trip for every write, which raises write latency from microseconds to single-digit milliseconds.

Asynchronous writes acknowledge first and persist after. The primary applies the change in memory, returns success at ElastiCache's usual microsecond latency, and streams the write to the transactional log afterward. In the gap between the acknowledgment and that stream, the write exists only in the primary node's memory, and a failure of the primary node within that gap will cause the write to be lost. Up to roughly 10 seconds of acknowledged writes can be lost this way in a failure.
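The difference in ordering can be illustrated with a toy model. This is a minimal sketch, not how ElastiCache is implemented: the `ToyPrimary` class, its `log` dictionary standing in for the Multi-AZ transactional log, and the `flush` and `fail_and_recover` methods are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Toy model of a primary node; not the ElastiCache implementation."""
    mode: str                                    # "sync" or "async"
    memory: dict = field(default_factory=dict)   # in-memory state, lost on failure
    log: dict = field(default_factory=dict)      # stands in for the Multi-AZ log
    pending: list = field(default_factory=list)  # acknowledged but not yet logged

    def write(self, key, value):
        """Apply a write and return once the client would be acknowledged."""
        self.memory[key] = value
        if self.mode == "sync":
            self.log[key] = value              # persist first, acknowledge after
        else:
            self.pending.append((key, value))  # acknowledge first, persist after
        return "OK"

    def flush(self):
        """Background streaming of pending writes to the log (async mode)."""
        for key, value in self.pending:
            self.log[key] = value
        self.pending.clear()

    def fail_and_recover(self):
        """Primary fails: unlogged writes vanish and state is rebuilt from the log."""
        self.pending.clear()
        self.memory = dict(self.log)


sync_node = ToyPrimary(mode="sync")
sync_node.write("order:1", "paid")
sync_node.fail_and_recover()

async_node = ToyPrimary(mode="async")
async_node.write("order:1", "paid")
async_node.flush()                       # order:1 reaches the log before the failure
async_node.write("order:2", "paid")      # order:2 is still only in memory
async_node.fail_and_recover()

print(sync_node.memory)    # the acknowledged write survived
print(async_node.memory)   # order:2 was acknowledged but lost
```

In the synchronous case nothing acknowledged is ever missing after recovery. In the asynchronous case, whatever sat in `pending` at the moment of failure is gone, which is the bounded loss window described above.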

Reads remain in the microsecond range in both modes, and durability does not affect read consistency; they are separate properties. The primary is strongly consistent with synchronous writes, but replicas are eventually consistent under either mode and can trail the latest writes. A workload that writes state and immediately reads it back from a replica, common in RAG and agent memory, can still read stale data even with durability switched on.
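The stale-read scenario can be sketched with a toy primary/replica pair. This assumes a deliberately simplified replication model: the `ToyShard` class and its explicit `replicate` step are hypothetical and exist only to make the lag visible.

```python
class ToyShard:
    """Toy primary/replica pair with explicit replication; illustration only."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.replication_queue = []   # writes the replica has not applied yet

    def write(self, key, value):
        """Writes always land on the primary and are queued for the replica."""
        self.primary[key] = value
        self.replication_queue.append((key, value))

    def replicate(self):
        """The replica catches up with the primary."""
        for key, value in self.replication_queue:
            self.replica[key] = value
        self.replication_queue.clear()

    def read(self, key, from_replica=False):
        source = self.replica if from_replica else self.primary
        return source.get(key)


shard = ToyShard()
shard.write("agent:42:state", "step-1")
shard.replicate()
shard.write("agent:42:state", "step-2")

stale = shard.read("agent:42:state", from_replica=True)   # replica still trails
fresh = shard.read("agent:42:state")                      # primary has the latest write
shard.replicate()
caught_up = shard.read("agent:42:state", from_replica=True)
```

A workload that needs read-your-writes behavior would send those reads to the primary, and keep replica reads for data where a short delay is acceptable.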

In AWS's latency benchmarks, asynchronous writes ran within microseconds of a non-durable cluster, and synchronous writes kept microsecond reads at moderate load, with read latency rising only as write concurrency grew.

Drawbacks of Asynchronous Mode

Asynchronous durability keeps the microsecond write latency of a non-durable cluster at the same cost. That makes it the right default for anything that can accept that a primary node failure might lose up to the last 10 seconds of writes.

The primary node tracks the age of the oldest write that has not yet been written to the transactional log and reports it to Amazon CloudWatch. While that lag stays under 10 seconds, writes flow normally. When it crosses the threshold, for example during a spike in writes that causes transient congestion in the transactional log, the primary stops accepting writes until it catches up, then resumes on its own. Reads keep running at microsecond latency throughout. The rejection state is expected to be rare in normal operations, but if you choose asynchronous mode you need to accept a brief, bounded pause in writes under stress, and up to 10 seconds of loss in the event of a failure, in exchange for the speed and cost of a normal cache.
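The back-pressure rule can be sketched as a small simulation. This is a toy model under stated assumptions: `ToyAsyncPrimary`, `durability_lag`, and `drain` are hypothetical names, and timestamps are passed in explicitly so the behavior is deterministic. Only the 10-second threshold comes from the description above.

```python
from collections import deque

MAX_LAG_SECONDS = 10.0   # threshold described above


class ToyAsyncPrimary:
    """Toy model of asynchronous-mode back-pressure; illustration only."""

    def __init__(self):
        self.unlogged = deque()   # (timestamp, key) of writes not yet in the log

    def durability_lag(self, now):
        """Age of the oldest write that has not reached the log."""
        return now - self.unlogged[0][0] if self.unlogged else 0.0

    def write(self, key, now):
        """Accept the write unless the lag has crossed the threshold."""
        if self.durability_lag(now) > MAX_LAG_SECONDS:
            return False            # rejected; the client should retry later
        self.unlogged.append((now, key))
        return True

    def drain(self, up_to):
        """The log catches up on every write made at or before `up_to`."""
        while self.unlogged and self.unlogged[0][0] <= up_to:
            self.unlogged.popleft()


node = ToyAsyncPrimary()
accepted_early = node.write("a", now=0.0)    # lag is 0 s
accepted_mid = node.write("b", now=9.0)      # lag is 9 s, under the threshold
rejected = node.write("c", now=11.0)         # lag is 11 s, over the threshold
node.drain(up_to=9.0)                        # the log catches up
accepted_after = node.write("d", now=12.0)   # writes resume on their own
```

Reads never touch `unlogged` in this model, which mirrors the point that reads keep running while writes are paused.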

Your backend needs to be prepared to handle the occasional rejected write with automatic retries and exponential backoff. The Valkey GLIDE client does this automatically, and adds Availability Zone-aware routing. However, a write path that cannot tolerate even a brief pause is a reason to choose synchronous writes instead.
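For code paths where the client library does not retry for you, the pattern looks roughly like the sketch below. It is generic Python, not the Valkey GLIDE API: `WriteRejected` is a hypothetical exception standing in for whatever error your client raises on a rejected write, and the delay values are illustrative assumptions.

```python
import random
import time


class WriteRejected(Exception):
    """Hypothetical error standing in for a client's rejected-write exception."""


def write_with_backoff(do_write, max_attempts=5, base_delay=0.05, max_delay=2.0,
                       sleep=time.sleep):
    """Call do_write(), retrying rejected writes with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return do_write()
        except WriteRejected:
            if attempt == max_attempts - 1:
                raise                                   # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))             # full jitter spreads out retries


# Usage with a fake write that is rejected twice before succeeding.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise WriteRejected("durability lag too high")
    return "OK"


result = write_with_backoff(flaky_write, sleep=lambda seconds: None)
```

The `sleep` parameter is injectable so the helper can be tested without waiting. Capping the number of attempts matters here: a pause that outlasts the retry budget should surface as an error rather than block the caller indefinitely.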

With the three options in view, no durability, asynchronous, and synchronous, the choice is reduced to three questions: 

  1. How much recent data can a workload lose?
  2. How fast must its writes be?
  3. Can it pause writes briefly under stress?

| | No durability | Asynchronous writes | Synchronous writes |
|---|---|---|---|
| Data-loss guarantee | Data is lost on failure and rebuilt from the source | Up to ~10 seconds of acknowledged writes at risk | Zero loss for every acknowledged write |
| Typical write latency | Microseconds | Microseconds | Single-digit milliseconds |
| Behavior under write stress | Writes always accepted | Writes rejected if durability lag exceeds 10 seconds, until it recovers | Writes always accepted; bounded by cross-AZ persistence latency |
| Added cost | None | None | 18% increase in the node-hour cost, plus the increased latency |
| Representative workloads | Read-through caches, rate limiters, recomputable data | Session stores, leaderboards, real-time analytics, pre-loaded data | RAG knowledge bases, agent memory and state, payment tokenization, real-time inventory |

No durability is still the right choice when data is trivially reconstructable and uninterrupted writes are more important than persistence. Examples include read-through caches, rate limiters, anything recomputable on demand. Asynchronous mode covers the broad middle, where rebuilding from the source is slow or costly but a few seconds of loss can be reconciled. Synchronous mode is for cases where a single lost write produces incorrect behavior and single-digit-millisecond writes are acceptable.
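As a rough aid, the three questions can be folded into a small decision helper. This is a rule-of-thumb sketch, not an official selection procedure: the function name, its boolean parameters, and the order in which the rules are checked are assumptions made for illustration.

```python
from typing import Optional


def choose_write_mode(can_lose_recent_writes: bool,
                      rebuild_is_cheap: bool,
                      tolerates_write_pauses: bool,
                      tolerates_millisecond_writes: bool) -> Optional[str]:
    """Rule-of-thumb mapping of the three questions to an option; None if nothing fits."""
    if can_lose_recent_writes and rebuild_is_cheap:
        # Trivially reconstructable data: persistence buys nothing.
        return "no durability"
    if can_lose_recent_writes and tolerates_write_pauses:
        # A few seconds of loss can be reconciled and brief pauses are fine.
        return "asynchronous"
    if tolerates_millisecond_writes:
        # Either no loss is acceptable or writes must never pause.
        return "synchronous"
    return None   # requirements conflict; revisit them


rate_limiter = choose_write_mode(True, True, False, False)
session_store = choose_write_mode(True, False, True, False)
payment_tokens = choose_write_mode(False, False, False, True)
conflicting = choose_write_mode(False, False, True, False)
```

The `None` case is worth noticing: a workload that can lose nothing and also cannot accept millisecond writes has requirements that none of the three options satisfies.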

Requirements to Enable Durability on ElastiCache for Valkey

Durability is selected at cluster creation and cannot be disabled afterward. You can switch a durable cluster between synchronous and asynchronous writes later, but you cannot add durability to an existing non-durable cluster, and you cannot turn it off once it is on. Here are the requirements for enabling durability in a new cluster:

| Requirement | What it means for adoption |
|---|---|
| Valkey 9.0 or later on R7g, R6g, M7g, M6g, or C7gn nodes | Clusters on other engines, versions, or instance families must be rebuilt to adopt durability |
| Cluster Mode Enabled, with Multi-AZ and at least one replica per shard | Single-node and cluster-mode-disabled layouts do not qualify; durability assumes a replicated, sharded topology |
| Encryption at rest (enabled automatically) and TLS in transit (required at creation) | Durable clusters are encrypted by default, which suits regulated workloads, and every client must use TLS |
| Up to 100 MiBps write throughput per primary node | Write-heavy workloads scale out across shards rather than rely on one large primary |
| Not available on ElastiCache Serverless | Durability is limited to node-based clusters |

Durability is available in all AWS commercial Regions, the China Regions, and GovCloud (US), so Region coverage is unlikely to be the blocker.
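Before planning a migration, it can help to check a proposed cluster against these requirements. The sketch below does that in plain Python. The dictionary shape, its key names, and both function names are hypothetical and do not correspond to any AWS API; only the requirement values are taken from the list above.

```python
import math

SUPPORTED_FAMILIES = ("r7g", "r6g", "m7g", "m6g", "c7gn")
MAX_WRITE_MIBPS_PER_PRIMARY = 100


def durability_blockers(cluster):
    """Return the requirements that a planned cluster (a plain dict) fails."""
    blockers = []
    if cluster["engine"] != "valkey" or cluster["engine_version"] < (9, 0):
        blockers.append("engine must be Valkey 9.0 or later")
    family = cluster["node_type"].split(".")[1]   # "cache.r7g.large" -> "r7g"
    if family not in SUPPORTED_FAMILIES:
        blockers.append(f"node family {family} is not supported")
    if not cluster["cluster_mode_enabled"]:
        blockers.append("Cluster Mode must be enabled")
    if not cluster["multi_az"]:
        blockers.append("Multi-AZ is required")
    if cluster["replicas_per_shard"] < 1:
        blockers.append("at least one replica per shard is required")
    if not cluster["tls_in_transit"]:
        blockers.append("TLS in transit must be set at creation")
    if cluster["serverless"]:
        blockers.append("ElastiCache Serverless is not supported")
    return blockers


def min_shards_for_writes(peak_write_mibps):
    """Smallest shard count that keeps each primary within the write limit."""
    return max(1, math.ceil(peak_write_mibps / MAX_WRITE_MIBPS_PER_PRIMARY))


planned = {"engine": "valkey", "engine_version": (9, 0),
           "node_type": "cache.r7g.large", "cluster_mode_enabled": True,
           "multi_az": True, "replicas_per_shard": 1,
           "tls_in_transit": True, "serverless": False}
legacy = dict(planned, engine_version=(8, 1), node_type="cache.r5.large",
              cluster_mode_enabled=False)

print(durability_blockers(planned))
print(durability_blockers(legacy))
print(min_shards_for_writes(250))
```

The shard estimate assumes writes are spread evenly across shards. A workload with hot keys would need more headroom than this simple division suggests.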

How Durable ElastiCache Compares to Amazon MemoryDB

Treating ElastiCache as a durable store raises an immediate question: How is this different from Amazon MemoryDB, which AWS already sells as a durable in-memory database? The services overlap heavily, though choosing between them depends on more than durability.

MemoryDB is a durable in-memory primary database built on the same kind of Multi-AZ transactional log, with microsecond reads, single-digit-millisecond writes, and strongly consistent primaries. Its durability model is effectively the synchronous mode of ElastiCache: every write is persisted before it is acknowledged.

What ElastiCache adds is the asynchronous option: durable writes with microsecond latency, with a bounded loss window as the trade-off. For a workload that wants persistence without sacrificing write speed, that option is new to this family of services. Past that, the choice comes down to engine and feature fit, multi-Region requirements, and the migration path.

Conclusion

The kind of durability you want from ElastiCache depends on what your workload needs and tolerates. Don't use durability when data is easily and cheaply reconstructable, and uninterrupted writes are key. Use durability with asynchronous mode when a few seconds of loss is recoverable, but a full rebuild would be costly, and you can tolerate retrying writes. Choose synchronous mode when every acknowledged write needs a guarantee that it's durable, and single-digit-millisecond writes are acceptable. What separates them is how much recent data a workload can lose, how fast its writes must be, and how sensitive it is to retrying writes.

How Caylent Can Help

As organizations look to build faster, more resilient applications, choosing the right data architecture is more important than ever. Caylent helps customers evaluate where durable in-memory data stores like Amazon ElastiCache for Valkey fit within their broader cloud and AI strategy, from RAG knowledge bases and agent memory to real-time applications and high-performance transactional workloads. As an AWS Premier Tier Services Partner, we help organizations design, implement, and optimize scalable architectures that balance performance, durability, cost, and operational simplicity so teams can move from experimentation to production with confidence. Get in touch with us today to get started.

Guille Ojeda

Guille Ojeda is a Principal Innovation Architect at Caylent, a speaker, author, and content creator. He has published 2 books, over 200 blog articles, and writes a free newsletter called Simple AWS with more than 45,000 subscribers. He's spoken at multiple AWS Summits and other events, and was recognized as AWS Builder of the Year in 2025.
