09.19.19

AWS CloudWatch Container Insights

By Juan Ignacio Giro

AWS CloudWatch is already incredibly useful for monitoring AWS environments. CloudWatch is used in a wide range of setups to collect key metrics, monitor logs, and automate some parts of the monitoring and maintenance tasks. 

The recent addition of pod-level monitoring to CloudWatch makes the tool even more appealing. Dubbed AWS CloudWatch Container Insights, this new pod-level monitoring works seamlessly with ECS and EKS clusters. Container Insights (now generally available) gives you a more granular view of your environment, particularly when you have containers running microservices.

CloudWatch provides monitoring information for almost all the AWS services in use. An important feature to note is its integration with the SNS service, which enables notifications to be sent (via email, SMS, HTTP/S endpoints, etc.) when a given metric reaches a pre-defined threshold. It also integrates with services like SQS and Lambda to automate responses.
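As a minimal sketch of the threshold notification described above, the Python snippet below builds the arguments for a CloudWatch alarm that notifies an SNS topic when a pod's average CPU utilization stays above a threshold. The cluster, namespace, pod, and topic ARN are hypothetical placeholders. The `ContainerInsights` namespace and the `pod_cpu_utilization` metric are the names Container Insights is expected to publish for EKS, so check them against what appears in your own account. The dictionary is meant to be passed to boto3's `put_metric_alarm`; that call is left commented out so the sketch runs without AWS credentials.

```python
def build_pod_cpu_alarm(cluster, namespace, pod, topic_arn, threshold=80.0):
    """Return keyword arguments for a CloudWatch PutMetricAlarm request."""
    return {
        "AlarmName": f"{cluster}-{namespace}-{pod}-high-cpu",
        # Assumed namespace and metric name for EKS Container Insights.
        "Namespace": "ContainerInsights",
        "MetricName": "pod_cpu_utilization",
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "Namespace", "Value": namespace},
            {"Name": "PodName", "Value": pod},
        ],
        "Statistic": "Average",
        "Period": 60,             # evaluate one-minute averages
        "EvaluationPeriods": 3,   # three consecutive breaches trigger the alarm
        "Threshold": threshold,   # percent CPU
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that fans out email, SMS, etc.
    }


# Hypothetical names, for illustration only.
alarm = build_pod_cpu_alarm(
    cluster="demo-cluster",
    namespace="default",
    pod="web",
    topic_arn="arn:aws:sns:us-east-1:123456789012:ops-alerts",
)

# With boto3 installed and credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Keeping the request as plain data makes it easy to review or version-control the alarm definition before anything is created in AWS.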

Pod-Level Monitoring at Its Best

It is not easy to identify bottlenecks and issues at pod level without proper, fine-grained monitoring. While it is possible to see that a cluster or server instance is using more resources than normal, pinpointing the actual pod causing the anomaly can be a tedious process. AWS CloudWatch Container Insights is designed to eliminate that specific problem.

In essence, Container Insights previews, monitors, and diagnoses pods running in an ECS or EKS cluster. It works seamlessly with EKS out of the box for easy integration, so those relying on the environment to run Kubernetes pods will find Container Insights useful.

AWS CloudWatch Container Insights doesn’t just offer an overview of your pods either. It goes deep into the key metrics related to performance, detects anomalies in real time, and maintains an extensive set of logs and metadata from each pod. From this description alone, it is easy to see how AWS CloudWatch Container Insights is useful in troubleshooting pod-related issues.

Container Insights for Amazon ECS is supported in the following Regions:

  • US East (N. Virginia)
  • US East (Ohio)
  • US West (N. California)
  • US West (Oregon)
  • Canada (Central)
  • EU (Frankfurt)
  • EU (Ireland)
  • EU (London)
  • EU (Paris)
  • Asia Pacific (Tokyo)
  • Asia Pacific (Seoul)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Mumbai)
  • South America (São Paulo)

AWS Fargate is not supported in EU (Paris) or South America (São Paulo).

Container Insights for Amazon EKS and Kubernetes is supported in the following Regions:

  • US East (N. Virginia)
  • US East (Ohio)
  • US West (N. California)
  • US West (Oregon)
  • Canada (Central)
  • EU (Frankfurt)
  • EU (Ireland)
  • EU (London)
  • EU (Paris)
  • Asia Pacific (Mumbai)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Tokyo)
  • Asia Pacific (Seoul)
  • South America (São Paulo)

For more information, see the AWS CloudWatch Container Insights documentation.

Using AWS CloudWatch Container Insights

Setting up AWS CloudWatch Container Insights is easy. As mentioned before, the tool works really well with ECS and EKS. You only need to add the CloudWatch agent as a DaemonSet for every EKS cluster, and you are almost done.

That step allows an EKS or Kubernetes cluster to begin sending metrics and performance-related data to CloudWatch. The next thing you want to do is add FluentD as a DaemonSet for sending logs to CloudWatch. The combination gives you a more holistic view of your cluster.

Next, you want to enable Kubernetes control plane logging so that the K8s control plane logs are shipped to Amazon CloudWatch Logs. You can also create a StatsD endpoint on your cluster if you want to capture StatsD data. StatsD enables the collection and aggregation of custom application metrics (like the time spent serving a given request). It requires instrumenting those metrics in the application source code. Together, these steps bring practically all of your basic monitoring metrics into one platform.
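To illustrate what that instrumentation can look like, here is a minimal Python sketch that times a request handler and sends the result as a StatsD timer over UDP. The metric name `checkout.request_time`, the `handle_request` function, and the `127.0.0.1:8125` endpoint are assumptions made for this example (8125 is the conventional StatsD port); point the sender at wherever your StatsD endpoint actually listens.

```python
import socket
import time


def format_timing(metric_name, elapsed_ms):
    """Format a StatsD timer datagram: <name>:<value>|ms."""
    return f"{metric_name}:{elapsed_ms}|ms"


def send_timing(metric_name, elapsed_ms, host="127.0.0.1", port=8125):
    """Send one timer sample over UDP (fire-and-forget, no reply expected)."""
    payload = format_timing(metric_name, elapsed_ms).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def handle_request():
    """Hypothetical request handler standing in for real application work."""
    time.sleep(0.01)
    return "ok"


# Measure how long the handler takes and report it in whole milliseconds.
start = time.monotonic()
result = handle_request()
elapsed_ms = round((time.monotonic() - start) * 1000)
send_timing("checkout.request_time", elapsed_ms)
```

Because StatsD uses UDP, the application does not block or fail if the endpoint is unavailable, which is the usual reason for choosing it for in-process metrics.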

Once the setup process is completed, you can access metrics from your containers on CloudWatch automatic dashboards. Everything from resource usage to potential errors is displayed. You can also use CloudWatch Logs Insights to dig deep into the logs of your containers in the event of a server issue or anomaly.

Performance log events can be processed further. CloudWatch automatically analyzes the performance log events for every cluster, node (cluster workers), and pod, so you always have a clear view of your clusters at any point. Some decisions, such as the decision to scale your cluster up or allocate more resources to the instance, can now be data-driven. Well, log-driven.
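As a rough sketch of how these performance log events can be explored, the Python snippet below builds a CloudWatch Logs Insights query that ranks pods by average CPU utilization. The cluster name is hypothetical, and the log group pattern, the `Type` field, and the `pod_cpu_utilization` and `PodName` field names are assumptions about how Container Insights labels its events; confirm them against the events in your own log group. The query would be submitted with boto3's `start_query`, which is left commented out so the example runs offline.

```python
def performance_log_group(cluster_name):
    """Assumed log group where Container Insights writes performance events."""
    return f"/aws/containerinsights/{cluster_name}/performance"


def top_pods_by_cpu_query(limit=10):
    """Build a Logs Insights query ranking pods by average CPU utilization."""
    return (
        'filter Type = "Pod"\n'
        "| stats avg(pod_cpu_utilization) as avg_cpu by PodName\n"
        "| sort avg_cpu desc\n"
        f"| limit {limit}"
    )


log_group = performance_log_group("demo-cluster")  # hypothetical cluster name
query = top_pods_by_cpu_query(limit=5)
print(log_group)
print(query)

# With boto3 installed and credentials configured, the query could be run with:
#   import boto3, time
#   logs = boto3.client("logs")
#   now = int(time.time())
#   logs.start_query(logGroupName=log_group, startTime=now - 3600,
#                    endTime=now, queryString=query)
```

A query like this is one way to back a scaling decision with data: the pods at the top of the list are the first candidates for more resources.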

Redefining Performance Monitoring

Upon closer inspection, you will find that AWS CloudWatch Container Insights offers a detailed look at the containers you run. This is handy when you need to troubleshoot specific microservices as well as the entire web application, particularly when trying to find bottlenecks that slow the entire system down.

Among the performance metrics tracked by AWS CloudWatch Container Insights are:

  • Resource utilization: Including CPU usage, node CPU capacity, and node memory capacity. Container Insights relies on cAdvisor metrics as well as data from the nodes themselves.
  • Running containers: For monitoring the number of running containers per node in a cluster.
  • Running pods: For counting the number of running pods and checking whether the desired number of pods is met (and in an ideal state).
  • Network stats: This is handy when there are network errors affecting the cluster.

Let’s not forget that alongside all these metrics we get the more important notification feature thanks to the integration with the SNS service.

Prometheus and Grafana

On the surface, AWS CloudWatch Container Insights is an alternative to the popular Prometheus/Grafana combination. Users wanting all their monitoring and visualization dashboards in the same place (for AWS services and containerized applications) might benefit from this new AWS-specific platform.

That said, Prometheus can be even more detailed. It’s a super-powerful suite that has many more capabilities than the much newer Container Insights. Prometheus provides extensive configuration capabilities plus a powerful query language that enables users (among other things) to pre-compute expressions and save their results as a new set of time series data. It can also handle data coming from the same sources as AWS CloudWatch Container Insights, which means it is possible to combine the two into an even more extensive monitoring system for your containers.

However, even though Prometheus is currently more powerful, it can take more time to set up properly. We also expect additional features to be added to CloudWatch Container Insights in the future that will make it a great alternative to consider for monitoring containerized apps in AWS.

Things can only get better as more tools, and more extensive features, become available!

Don’t miss our other post on monitoring tools, Container Monitoring: Prometheus and Grafana Vs. Sysdig and Sysdig Monitor

