Amazon Kinesis Data Streams On-demand vs. Provisioned Billing Mode Cost Comparison

September 21, 2023

Data Modernization & Analytics

Compare Amazon Kinesis Data Streams on-demand vs. provisioned pricing. See at what utilization levels provisioned mode saves money based on payload size and throughput.

This blog was originally written and published by Trek10, which is now part of Caylent.

When AWS released on-demand billing for Amazon Kinesis Data Streams in November 2021, pitching it as “serverless,” my first thought was: isn’t Kinesis Data Streams more or less already serverless? Yes, provisioned mode has an hourly cost per shard, but once you use more than about 30% of a shard’s capacity, the per-request cost exceeds the shard-hour cost. A “true” serverless offering would mean that cost and usage are perfectly correlated: zero usage equates to zero cost, and what you pay is determined purely by your actual usage of the service. Because the cost of a provisioned stream is dominated by the per-request cost once shard utilization exceeds 30% (and we can expect many real-world scenarios to involve more than 30% shard utilization), we can say that the service is already almost serverless with respect to its billing model. With this in mind, let’s explore the cost differences between the new serverless billing mode for Kinesis Data Streams and the existing “almost serverless” provisioned mode.
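The 30% figure can be sanity-checked with a little arithmetic. The sketch below assumes the us-east-1 prices quoted in the pricing table in this post ($0.015 per shard-hour, $0.014 per million 25 KB PUT payload units), a per-shard limit of 1,000 records per second, and records small enough that each one is billed as a single payload unit and the record limit (not the byte limit) is what caps the shard. The function names are illustrative only.

```python
# Prices (us-east-1) as quoted in this post's pricing table.
SHARD_HOUR_USD = 0.015
USD_PER_MILLION_PPU = 0.014
MAX_RECORDS_PER_SEC_PER_SHARD = 1000  # per-shard write limit in records/s


def ppu_cost_per_shard_hour(utilization: float) -> float:
    """Hourly PUT payload unit cost for one shard.

    Assumes small records: one payload unit per record, and the shard is
    limited by records per second rather than bytes per second.
    """
    records_per_hour = utilization * MAX_RECORDS_PER_SEC_PER_SHARD * 3600
    return records_per_hour / 1_000_000 * USD_PER_MILLION_PPU


def break_even_utilization() -> float:
    """Utilization at which the per-request cost equals the shard-hour cost."""
    return SHARD_HOUR_USD / ppu_cost_per_shard_hour(1.0)


if __name__ == "__main__":
    print(f"break-even utilization: {break_even_utilization():.1%}")
```

With larger records the shard becomes byte-limited and carries fewer records per second, so the per-request share of the bill is smaller than this sketch suggests.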

The pricing structure (prices from us-east-1) differs between provisioned and on-demand streams as follows:

| Cost type | Provisioned | On-demand |
| --- | --- | --- |
| Metered size (how each record is rounded) | 25 KB | 1 KB for writes (none for reads) |
| Hourly rate | $0.015 per shard-hour | $0.04 per stream-hour |
| Price per GB inbound (or per metered chunk, since all inbound traffic is rounded) | $0.014 per million PPUs (25 KB chunk). For 500-byte payloads this is $0.028/GB; for 25 KB payloads this is $0.00056/GB | $0.08 per GB (payloads rounded up to the nearest KB) |
| Price per GB outbound (or per metered chunk), excluding enhanced fan-out | Free | $0.04/GB (no rounding) |
| Enhanced fan-out (used if you need to read more than 2 MB/s per shard; basically, you can use this if you have more than two readers on a stream) | $0.013/GB retrieved + $0.015/consumer-shard-hour | $0.05/GB retrieved |
| Extended retention (7 days); by default, records are retained for only 24 hours in a stream | $0.02/shard-hour | $0.10/GB-month |
| Scaling behavior | Manually adjust the number of shards by 50%-200%, or split/merge adjacent shards | Automatic. Adjusts capacity to double the peak usage in the past 30 days. 15-minute scaling delay |
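The per-GB inbound figures in the table follow from how each mode rounds a record. Here is a minimal sketch of that arithmetic, assuming decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes), which is what reproduces the $0.028 and $0.00056 figures above. The helper names are made up for illustration.

```python
USD_PER_MILLION_PPU = 0.014      # provisioned: per 25 KB PUT payload unit
ON_DEMAND_USD_PER_GB_IN = 0.08   # on-demand: per GB written, after rounding
PPU_BYTES = 25_000               # assumption: decimal kilobytes
ON_DEMAND_ROUND_BYTES = 1_000    # on-demand rounds each record up to 1 KB
GB = 1_000_000_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def provisioned_usd_per_gb_in(payload_bytes: int) -> float:
    """Effective price per GB of actual payload written in provisioned mode."""
    records_per_gb = GB / payload_bytes
    ppus_per_record = ceil_div(payload_bytes, PPU_BYTES)
    return records_per_gb * ppus_per_record * USD_PER_MILLION_PPU / 1_000_000


def on_demand_usd_per_gb_in(payload_bytes: int) -> float:
    """Effective price per GB of actual payload written in on-demand mode."""
    billed_bytes = ceil_div(payload_bytes, ON_DEMAND_ROUND_BYTES) * ON_DEMAND_ROUND_BYTES
    return billed_bytes / payload_bytes * ON_DEMAND_USD_PER_GB_IN


if __name__ == "__main__":
    for size in (500, 1_000, 25_000):
        print(size, provisioned_usd_per_gb_in(size), on_demand_usd_per_gb_in(size))
```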

On-demand scales to double your peak write throughput in the previous 30 days. It doesn’t scale based on your reads. Also, if your traffic grows to more than double your previous peak within 15 minutes, you can still get throttled. Hot shards can still be a problem, since on-demand scaling doesn’t isolate specific hash keys. With provisioned mode, you have to specify the number of shards you want. You can scale a provisioned mode stream up or down by up to a factor of two. Unfortunately, there is no built-in AWS way to automatically scale provisioned mode streams (e.g. Application Auto Scaling doesn’t support Amazon Kinesis Data Streams).
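Because each resize of a provisioned stream is limited to between 50% and 200% of the current shard count, a large change has to be done in several steps. The following sketch only plans the intermediate shard counts; it does not call any AWS API. The function name `plan_shard_counts` is hypothetical, and a real rollout would also need to wait for the stream to finish updating between steps.

```python
import math


def plan_shard_counts(current: int, target: int) -> list[int]:
    """Return successive shard counts from current to target.

    Each step stays within 50%-200% of the previous count, mirroring the
    provisioned-mode resize limit described above.
    """
    if current < 1 or target < 1:
        raise ValueError("shard counts must be at least 1")
    steps = []
    while current != target:
        if target > current:
            current = min(target, current * 2)             # at most double
        else:
            current = max(target, math.ceil(current / 2))  # at most halve
        steps.append(current)
    return steps


if __name__ == "__main__":
    print(plan_shard_counts(10, 100))
    print(plan_shard_counts(100, 10))
```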

Reading data twice would favor on-demand; comparing the cost of enhanced fan-out is beyond the scope of this analysis.

Here are some graphs showing the cost vs. utilization rate for provisioned streams. Each graph also shows the cost of processing the same data in an on-demand stream. All the graphs show the cost for reading the written data once.* Since the payload size affects the pricing, we have graphs for 100-byte, 1 KB, 25 KB, and 100 KB payloads. (Because provisioned streams have 25 KB metering but on-demand streams have only 1 KB metering, the payload size affects the price differently in the two modes. Generally, larger payloads favor provisioned streams since they have a larger metering size.)

The cost shown in these graphs for provisioned mode is based on 100 shards and the per-record cost at the utilization rate on the x-axis. The cost for on-demand mode is based on sending the equivalent data through an on-demand stream. Based on these graphs, we can see that the cost starts favoring provisioned mode once utilization is over roughly 5%.
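The comparison described above can be approximated with a small cost model. This is a minimal sketch, not the script used to produce the graphs. It assumes the us-east-1 prices from the pricing table in this post, decimal units (1 KB = 1,000 bytes), a per-shard write limit of 1 MB/s or 1,000 records/s (whichever is hit first), and that every record is read exactly once without enhanced fan-out. All names are illustrative.

```python
SHARD_HOUR_USD = 0.015
USD_PER_MILLION_PPU = 0.014
STREAM_HOUR_USD = 0.04
ON_DEMAND_USD_PER_GB_IN = 0.08
ON_DEMAND_USD_PER_GB_OUT = 0.04
GB = 1_000_000_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def shard_records_per_sec(payload_bytes: int) -> float:
    """Records/s one shard can ingest: capped by 1,000 records/s or 1 MB/s."""
    return min(1000, 1_000_000 / payload_bytes)


def hourly_costs(payload_bytes: int, utilization: float, shards: int = 100):
    """Return (provisioned_usd, on_demand_usd) per hour for the same traffic."""
    records = shards * utilization * shard_records_per_sec(payload_bytes) * 3600

    # Provisioned: shard-hours plus PUT payload units (25 KB chunks).
    ppus = records * ceil_div(payload_bytes, 25_000)
    provisioned = shards * SHARD_HOUR_USD + ppus / 1_000_000 * USD_PER_MILLION_PPU

    # On-demand: stream-hour, writes rounded up to 1 KB, reads not rounded.
    billed_in_bytes = records * ceil_div(payload_bytes, 1_000) * 1_000
    read_out_bytes = records * payload_bytes
    on_demand = (STREAM_HOUR_USD
                 + billed_in_bytes / GB * ON_DEMAND_USD_PER_GB_IN
                 + read_out_bytes / GB * ON_DEMAND_USD_PER_GB_OUT)
    return provisioned, on_demand


if __name__ == "__main__":
    for size in (100, 1_000, 25_000, 100_000):
        prov, od = hourly_costs(size, utilization=0.05)
        print(f"{size:>7} B  provisioned ${prov:.3f}/h  on-demand ${od:.3f}/h")
```

Under these assumptions, at 5% utilization the on-demand cost comes out lower only for the 100-byte payload, and provisioned is cheaper for the three larger payload sizes.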

Based on the 5% utilization break-even point seen above, we can compare the two billing modes at various request sizes and request rates. The following graphs show the cost for provisioned and on-demand mode at various requests per second. The number of shards for provisioned mode cost is set such that only 5% of the stream’s capacity is used.

The four graphs above show that at 5% utilization, provisioned mode is cheaper for all but the 100-byte payload size. Generally, larger payload sizes favor provisioned mode because of the difference in how payload sizes are rounded: 1 KB for on-demand vs. 25 KB for provisioned mode.

Let’s consider the following utilization scenarios to see which billing mode would be cheaper:

50% utilization for 8 hours a day and 0% utilization at all other times gives roughly 17% utilization on average, which, from the graphs above, we can see would be cheaper with provisioned mode.
50% utilization for 7 hours a week (e.g. 1 hour a day) and 3% utilization at all other times gives 5% utilization on average. Depending on the payload size, either provisioned or on-demand could be cheaper.
75% utilization for 1 hour a week and 1% utilization at all other times gives about 1.4% utilization on average. For all payload sizes, on-demand would be cheaper in this case.

To measure the utilization rate of on-demand streams, take the maximum value of the following two formulas:

(Sum(IncomingBytes) / Period) / (Number of shards * 1000000)
(Sum(IncomingRecords) / Period) / (Number of shards * 1000)

Here, Period is the number of seconds specified in the CloudWatch GetMetricStatistics API call or in the CloudWatch console. (Each shard can support 1 MB/sec or 1,000 records/sec, so we need to calculate both the utilization rate for bytes per second and the utilization rate for records per second, and take the maximum to find the rate that will limit the stream first.)
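The two formulas translate directly into code. The sketch below takes the metric sums as plain numbers, so fetching `IncomingBytes` and `IncomingRecords` from CloudWatch is left out. The function name and the sample inputs are made up for illustration.

```python
def stream_utilization(sum_incoming_bytes: float, sum_incoming_records: float,
                       period_seconds: int, shard_count: int) -> float:
    """Return the limiting utilization rate (as a fraction) of a stream.

    sum_incoming_bytes / sum_incoming_records are the Sum statistics of the
    IncomingBytes and IncomingRecords metrics over one period.
    """
    byte_util = (sum_incoming_bytes / period_seconds) / (shard_count * 1_000_000)
    record_util = (sum_incoming_records / period_seconds) / (shard_count * 1_000)
    # Whichever limit is closer to being hit determines the utilization.
    return max(byte_util, record_util)


if __name__ == "__main__":
    # Hypothetical hour: 3 GB and 6 million records written to 10 shards.
    util = stream_utilization(3_000_000_000, 6_000_000, 3600, 10)
    print(f"{util:.1%}")
```

In this made-up example the stream is limited by records per second rather than bytes per second, which is typical for small payloads.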

Takeaways

If you need 3 or fewer shards, use provisioned. It’s always cheaper.
On-demand is a good fit if your utilization rate is < 5% and you can’t scale your provisioned shards down to increase your utilization rate.
Provisioned scaling can be hard. It can be slow when you are not scaling up or down by a factor of 2. Scaling is also hard if you use IaC to configure your Kinesis Data Streams. Advanced users with hot/cold shards can enable per-shard metrics and split/merge shards individually, but this increases operational complexity.
Generally, on-demand is better for smaller payload sizes because of the 25 KB metering in provisioned mode vs. 1 KB metering for on-demand writes and no rounding for on-demand reads.
A best practice for any cost optimization is to right-size instances before purchasing Reserved Instances (RIs). Similarly, with Kinesis Data Streams, make sure you don’t have too many shards (i.e. your utilization rate is too low) before switching to on-demand.

Trek10 Team

Founded in 2013, Trek10 helped organizations migrate to and maximize the value of AWS by designing, building, and supporting cloud-native workloads with deep technical expertise. In 2025, Trek10 joined Caylent, forming one of the most comprehensive AWS-only partners in the ecosystem, delivering end-to-end services across strategy, migration and modernization, product innovation, and managed services.
