Learn how to measure cross-AZ data transfer costs using default VPC Flow Logs and CloudWatch Logs Insights—even without AZ IDs in your logs.
This blog was originally written and published by Trek10, which is now part of Caylent.
As part of our CloudOps offering here at Trek10, we monitor our clients' AWS accounts for cost anomalies (sudden or gradual increases or decreases in AWS cost). While investigating a cost-related alert for a client, I found a significant increase ($100+ per day) in the poorly named cost code USW2-DataTransfer-Regional-Bytes. This is, of course, the charge for data transfer between Availability Zones in the same region.
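A quick way to confirm which usage type is driving a spike is the Cost Explorer API. For example, a minimal sketch with the AWS CLI (the dates are placeholders; the usage type value matches the cost code above):

aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-01-08 \
  --granularity DAILY \
  --metrics UnblendedCost \
  --filter '{"Dimensions":{"Key":"USAGE_TYPE","Values":["USW2-DataTransfer-Regional-Bytes"]}}'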
The client that received this alert had recently made some changes to their application, but with half a dozen application clusters and another half-dozen database/cache clusters, it wasn't obvious what would be generating the massive increase in traffic. Fortunately, the client had VPC flow logs enabled for all of their VPCs. Unfortunately, the flow logs were using the default format, which does not include the Availability Zone ID. Not a problem: I would just need to map each private IP address in the flow logs to its subnet. “Great,” I thought, “I'll just use a switch statement with the isIpv4InSubnet function. CloudWatch Logs Insights has a switch or choice function, right?” Wrong.
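For reference, with only the default flow log fields, the closest you can get is summing traffic by address pair. A minimal sketch (srcAddr, dstAddr, and bytes are standard discovered fields for the default format):

stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 20

This surfaces the heaviest talkers, but it can't tell you which pairs cross an AZ boundary, hence the need for a switch-like mapping from IP address to AZ.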
No problem, we can construct a switch function using other existing functions. If we concatenate all possible outputs we need into a single string, we can extract what we need using the substring function substr. E.g. if we need to select between option1, option2, and option3, we can use a function call similar to the following to get the correct option:
substr("option1option2option3", <indexToStartOfStringHere>, 7)Since we can choose from multiple options using substr, we just need a way to calculate the starting position of the correct result (e.g. 0 for option1, 7 for option2, and 14 for option3 in the example above). CloudWatch Logs Insights allows implicit conversion of boolean true/false values into 1 and 0, respectively. Because of this implicit conversion, we can use the conditions that we would normally use in a switch/choice/ifs statement to calculate the index of the correct result. We can multiply each condition by the index of the output that corresponds to that condition and then sum all of these products. The formula becomes:
substr("output1output2output3", condition1 * 0 + condition2 * 7 + condition3 * 14, 7)The only caveat is that all of the conditions must be mutually exclusive or you'll get incorrect results, since if more than one condition is true the resulting index will either be larger than the length of the string or match the index of a subsequent condition. E.g. if in the previous example condition1 and condition2 were both true then the output would be output3.
You can also add a default at the start of the string. For example, suppose you have a metric N in the logs that you need to categorize into three buckets: micro, small, and large. With a default of other prepended, the options string becomes othermicrosmalllarge (each option is exactly 5 characters, so no padding is needed). If the threshold between micro and small is 10, and the threshold between small and large is 1000, then the full formula would be:
substr("othermicrosmalllarge", (N < 10) * 1 * 5 + (N >= 10 AND N<1000) * 2 * 5 + (N > 1000) * 3 * 5, 5)Now, if you have more than a couple of conditions this becomes cumbersome to write manually so I just threw together a script to programmatically generate the query itself. This script evaluates all the subnets and their respective availability zones to produce the functions I needed inside my query. This client had no overlapping subnet CIDR ranges so I could make a few assumptions to simplify the logic. Once written, this script allowed me to then copy-paste its output into the AWS Console to create the appropriate query in CloudWatch Logs Insights.
#!/usr/bin/env node
// Generates the srcAz/dstAz mapping expressions for a CloudWatch Logs Insights
// query by pairing each subnet CIDR with its Availability Zone ID.
const { EC2Client, paginateDescribeSubnets } = require("@aws-sdk/client-ec2");

const ec2Client = new EC2Client();

/**
 * Fetch every subnet in the account/region.
 * @returns {Promise<import("@aws-sdk/client-ec2").Subnet[]>}
 */
async function getAllSubnets() {
  let subnets = [];
  for await (const page of paginateDescribeSubnets({ client: ec2Client }, {})) {
    subnets = subnets.concat(page.Subnets ?? []);
  }
  return subnets;
}

async function main() {
  const subnets = await getAllSubnets();
  // Options string: the 'other' default followed by each subnet's AZ ID,
  // all padded to the same width so substr can index into it.
  const azIds = ["other"].concat(subnets.map((subnet) => subnet.AvailabilityZoneId));
  const longestAzId = Math.max(...azIds.map((a) => a.length));
  const paddedAzIds = azIds.map((a) => a.padEnd(longestAzId, " ")).join("");

  // Case statement for ip -> cidr -> azId. Index 0 of the options string is
  // the 'other' default, so each subnet's multiplier starts at 1.
  // This assumes the subnet CIDR ranges do not overlap.
  let field = "srcAddr";
  let subnetChecks = subnets
    .map((subnet, index) => `isIpv4InSubnet(${field}, "${subnet.CidrBlock}") * ${index + 1} * ${longestAzId}`)
    .join(" + ");
  let statement = `trim(substr("${paddedAzIds}", ${subnetChecks}, ${longestAzId})) as srcAz`;
  console.log(statement);

  // Same expression again for the destination address.
  field = "dstAddr";
  subnetChecks = subnets
    .map((subnet, index) => `isIpv4InSubnet(${field}, "${subnet.CidrBlock}") * ${index + 1} * ${longestAzId}`)
    .join(" + ");
  statement = `trim(substr("${paddedAzIds}", ${subnetChecks}, ${longestAzId})) as dstAz`;
  console.log(statement);
}

main();

Using this script to generate the formula mapping each IP address to its AZ ID, I was able to finalize my CloudWatch Logs Insights query to see which pairs of IP addresses in different AZs had the most traffic.
stats sum(bytes) as bytesTransferred by srcAddr, dstAddr,
trim(substr("other use1-az4use1-az6use1-az4use1-az4use1-az6use1-az4use1-az4use1-az4use1-az6use1-az4use1-az2use1-az1use1-az4use1-az3use1-az4use1-az2use1-az6use1-az6use1-az2use1-az1use1-az6", isIpv4InSubnet(srcAddr, "10.188.0.0/24") * 1 * 8 + isIpv4InSubnet(srcAddr, "10.188.1.0/24") * 2 * 8 + isIpv4InSubnet(srcAddr, "10.188.4.0/24") * 3 * 8 +
isIpv4InSubnet(srcAddr, "10.0.2.0/24") * 4 * 8 +
isIpv4InSubnet(srcAddr, "10.188.9.0/24") * 5 * 8 +
isIpv4InSubnet(srcAddr, "10.0.4.0/24") * 6 * 8 +
isIpv4InSubnet(srcAddr, "10.188.3.0/24") * 7 * 8 +
isIpv4InSubnet(srcAddr, "10.188.10.0/24") * 8 * 8 +
isIpv4InSubnet(srcAddr, "10.0.5.0/24") * 9 * 8 +
isIpv4InSubnet(srcAddr, "172.16.1.0/24") * 10 * 8 +
isIpv4InSubnet(srcAddr, "10.188.11.0/24") * 11 * 8 +
isIpv4InSubnet(srcAddr, "10.188.8.0/24") * 12 * 8 +
isIpv4InSubnet(srcAddr, "172.31.16.0/20") * 13 * 8 +
isIpv4InSubnet(srcAddr, "172.31.48.0/20") * 14 * 8 +
isIpv4InSubnet(srcAddr, "10.0.0.0/24") * 15 * 8 +
isIpv4InSubnet(srcAddr, "172.16.0.0/24") * 16 * 8 +
isIpv4InSubnet(srcAddr, "172.31.0.0/20") * 17 * 8 +
isIpv4InSubnet(srcAddr, "10.0.3.0/24") * 18 * 8 +
isIpv4InSubnet(srcAddr, "172.31.32.0/20") * 19 * 8 +
isIpv4InSubnet(srcAddr, "10.188.2.0/24") * 20 * 8 +
isIpv4InSubnet(srcAddr, "10.0.1.0/24") * 21 * 8, 8)) as srcAz,
trim(substr("other use1-az4use1-az6use1-az4use1-az4use1-az6use1-az4use1-az4use1-az4use1-az6use1-az4use1-az2use1-az1use1-az4use1-az3use1-az4use1-az2use1-az6use1-az6use1-az2use1-az1use1-az6", isIpv4InSubnet(dstAddr, "10.188.0.0/24") * 1 * 8 + isIpv4InSubnet(dstAddr, "10.188.1.0/24") * 2 * 8 + isIpv4InSubnet(dstAddr, "10.188.4.0/24") * 3 * 8 +
isIpv4InSubnet(dstAddr, "10.0.2.0/24") * 4 * 8 +
isIpv4InSubnet(dstAddr, "10.188.9.0/24") * 5 * 8 +
isIpv4InSubnet(dstAddr, "10.0.4.0/24") * 6 * 8 +
isIpv4InSubnet(dstAddr, "10.188.3.0/24") * 7 * 8 +
isIpv4InSubnet(dstAddr, "10.188.10.0/24") * 8 * 8 +
isIpv4InSubnet(dstAddr, "10.0.5.0/24") * 9 * 8 +
isIpv4InSubnet(dstAddr, "172.16.1.0/24") * 10 * 8 +
isIpv4InSubnet(dstAddr, "10.188.11.0/24") * 11 * 8 +
isIpv4InSubnet(dstAddr, "10.188.8.0/24") * 12 * 8 +
isIpv4InSubnet(dstAddr, "172.31.16.0/20") * 13 * 8 +
isIpv4InSubnet(dstAddr, "172.31.48.0/20") * 14 * 8 +
isIpv4InSubnet(dstAddr, "10.0.0.0/24") * 15 * 8 +
isIpv4InSubnet(dstAddr, "172.16.0.0/24") * 16 * 8 +
isIpv4InSubnet(dstAddr, "172.31.0.0/20") * 17 * 8 +
isIpv4InSubnet(dstAddr, "10.0.3.0/24") * 18 * 8 +
isIpv4InSubnet(dstAddr, "172.31.32.0/20") * 19 * 8 +
isIpv4InSubnet(dstAddr, "10.188.2.0/24") * 20 * 8 +
isIpv4InSubnet(dstAddr, "10.0.1.0/24") * 21 * 8, 8)) as dstAz
| filter srcAz != dstAz
| sort bytesTransferred desc
| limit 50

This query quickly enabled me to identify which addresses (and thus which ENIs) were responsible for the extra traffic. It turned out the application was scanning a Redis cluster on every operation instead of just getting individual items.
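For illustration, the difference looks roughly like the following hypothetical sketch (assuming the ioredis client; the key names and lookup logic are illustrative, not the client's actual code):

const Redis = require("ioredis");
const redis = new Redis({ host: "redis.example.internal" });

// Anti-pattern: scanning the whole keyspace on every operation. Each SCAN
// iteration pulls pages of keys across the wire (and potentially across AZs).
async function findSessionSlow(id) {
  let cursor = "0";
  do {
    const [next, keys] = await redis.scan(cursor, "MATCH", "session:*", "COUNT", 1000);
    cursor = next;
    const match = keys.find((k) => k === `session:${id}`);
    if (match) return redis.get(match);
  } while (cursor !== "0");
  return null;
}

// Fix: address the key directly; one small round trip instead of a full scan.
async function findSessionFast(id) {
  return redis.get(`session:${id}`);
}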
As the flow-log query above shows, you can use this technique to group or categorize your logs for easier analysis in CloudWatch Logs Insights.
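The padded-string trick generalizes beyond AZ mapping. For instance, a minimal sketch that buckets flow-log traffic by destination port (the port-to-label mapping is illustrative; each label is padded to 8 characters):

stats sum(bytes) as bytesTransferred by
trim(substr("other   dns     https   ", (dstPort = 53) * 1 * 8 + (dstPort = 443) * 2 * 8, 8)) as service
| sort bytesTransferred desc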
Hopefully, you've found this article helpful. If you're looking for assistance with Amazon CloudWatch or other monitoring components for your AWS environment, please feel free to contact us. We'd love to chat!
Founded in 2013, Trek10 helped organizations migrate to and maximize the value of AWS by designing, building, and supporting cloud-native workloads with deep technical expertise. In 2025, Trek10 joined Caylent, forming one of the most comprehensive AWS-only partners in the ecosystem, delivering end-to-end services across strategy, migration and modernization, product innovation, and managed services.