Whether you're running a small business or an enterprise, it's critical to ensure all of the services under your responsibility are available. Reacting to incidents as they come up is essential, but proactively monitoring services is key to preventing issues from cropping up in the first place. In AWS, administrators can keep their finger on the pulse of their infrastructure by using Amazon CloudWatch.
What is Amazon CloudWatch?
Amazon CloudWatch is a service from Amazon Web Services that gives users the data and analytics they need to understand and respond to incidents in their AWS environment. It collects metrics, logs, and event data to provide a view of the state of your cloud infrastructure. CloudWatch can be that single pane of glass all of the marketers talk about, giving you dashboards, insights, and tooling to control your entire cloud infrastructure.
CloudWatch allows administrators to monitor, alert on, and troubleshoot their AWS infrastructure across many different resources like EC2, S3, RDS, Elastic Load Balancers, and more. But if you've got some infrastructure in AWS and aren't too familiar with CloudWatch, what do you do? I'm here to help!
In this article, we're going to dive into Amazon CloudWatch and show you what it takes to build an informational dashboard bringing together some of the most common AWS services.
To get to CloudWatch, head over to the CloudWatch portal and log in. At first glance, you'll see the summary screen showing you information like alarms, events, logs, and all of the metrics available to you. Depending on which of your AWS services are supported by CloudWatch, your available metrics may differ, but notice from the screenshot that I have 34 available to me.
I'll click on Browse Metrics and see what trouble I can get into here. I immediately notice that CloudWatch categorizes metrics by service. In my case, I have EBS, EC2, and S3, each listed with the number of metrics available.
Clicking on EC2, I then choose Per-Instance (AWS categorizes metrics based on service type) and am immediately presented with all of the metrics I have at my disposal.
Because we're just getting started with CloudWatch, I've clicked on all of the metrics available to me and notice that they are added to the graph above. If you're familiar with Windows Resource Monitor or other performance monitoring graphs, you'll see the similarities. By default, this graph has to be refreshed manually, but I've chosen to auto-refresh it every 10 seconds. In the screenshot below, I've also chosen to see one hour's worth of data.
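If you'd rather pull the same one-hour window from a script than from the console, here's a minimal sketch. It assumes the boto3 SDK, an EC2 CPUUtilization metric, and a 5-minute period. The function names and the instance ID are made up for illustration, so swap in your own.

```python
from datetime import datetime, timedelta, timezone


def one_hour_query(instance_id, metric_name="CPUUtilization", now=None):
    """Build keyword arguments for CloudWatch's GetMetricStatistics call.

    The namespace, metric, and period are assumptions for this sketch.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=1),  # one hour's worth of data
        "EndTime": end,
        "Period": 300,  # seconds per data point
        "Statistics": ["Average"],
    }


def fetch_datapoints(params):
    """Send the query with boto3 (requires the boto3 package and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    client = boto3.client("cloudwatch")
    return client.get_metric_statistics(**params)["Datapoints"]


if __name__ == "__main__":
    # "i-0123456789abcdef0" is a placeholder instance ID.
    print(one_hour_query("i-0123456789abcdef0"))
```

Keeping the parameter builder separate from the network call means you can inspect exactly what will be sent before you point it at a real account.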
That's a pretty graph to stare at, but I'm not about to spend my day gazing into an auto-refreshing graph hoping I don't see any abnormal activity. Instead, I'd like to be alerted when a certain threshold is met. To do that, I can create an alarm. In CloudWatch, alarms allow users to be notified, for example via email, when a metric has met or crossed a specified threshold.
On the Create Alarm screen, once you've chosen the metric you'd like to track, you can set the threshold a few different ways. You can have the state change to ALARM when the metric goes above or below a certain value, and you can choose how many data points (polling intervals) the metric has to stay there before the alarm is triggered. If you're not familiar with what is considered "normal", you also have the graph on the right in the screenshot below to give you an indication of where the metric usually sits.
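To make the threshold-plus-data-points idea concrete, here's a small sketch of how I think about it. The alarm_state function is my own simplified model (the real service has more options, such as how missing data is treated), and alarm_params builds arguments for boto3's put_metric_alarm. The alarm name, the 80 percent threshold, and the SNS topic ARN are placeholders.

```python
import operator

# Comparison operator names as accepted by the PutMetricAlarm API.
COMPARATORS = {
    "GreaterThanThreshold": operator.gt,
    "GreaterThanOrEqualToThreshold": operator.ge,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def alarm_state(datapoints, threshold, comparison, evaluation_periods):
    """Simplified model: ALARM only if the last N data points all breach."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaches = COMPARATORS[comparison]
    return "ALARM" if all(breaches(v, threshold) for v in recent) else "OK"


def alarm_params(instance_id, sns_topic_arn, threshold=80.0, evaluation_periods=3):
    """Keyword arguments for boto3's cloudwatch.put_metric_alarm (values are examples)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # an SNS topic can deliver the email
    }


if __name__ == "__main__":
    print(alarm_state([10, 85, 90, 95], 80, "GreaterThanOrEqualToThreshold", 3))
```

Requiring several consecutive breaching data points is what keeps a single brief spike from paging you.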
CloudWatch not only lets the user monitor various performance metrics and alert on them, but can also perform actions when specific events happen. This allows the user to automate tasks in response to many different activities occurring in their AWS environment. By clicking on the Events section, the user can create rules to subscribe to these events and take action.
In the screenshot below, I've chosen to create a rule to monitor my S3 buckets for all events, although I could have gotten more granular. Notice that by selecting the dropdowns in the event source section, the UI automatically creates the event pattern. This will be the JSON CloudWatch consumes when setting up this event listener.
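For reference, here's roughly what that event pattern looks like, along with a toy matcher showing how a pattern narrows things down. The matcher is my own simplified illustration and not how CloudWatch is implemented, and the sample events are made up.

```python
import json

# Pattern for "all S3 events": match anything whose source is aws.s3.
S3_ALL_EVENTS = {"source": ["aws.s3"]}

# A more granular variant that only matches one API call.
S3_PUT_ONLY = {"source": ["aws.s3"], "detail": {"eventName": ["PutObject"]}}


def matches(pattern, event):
    """Simplified matching: a list holds allowed values, a dict is a nested pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    sample = {  # made-up event for illustration
        "source": "aws.s3",
        "detail-type": "AWS API Call via CloudTrail",
        "detail": {"eventName": "DeleteObject"},
    }
    print(json.dumps(S3_ALL_EVENTS, indent=2))
    print(matches(S3_ALL_EVENTS, sample), matches(S3_PUT_ONLY, sample))
```

The fewer keys you put in the pattern, the more events it catches, which is why the "all events" version is so short.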
I've already set up a Lambda function as well, so I've chosen that. This will ensure when any activity happens in any of my S3 buckets, my Lambda function will automatically be triggered.
If, for example, I don't necessarily want to invoke my Lambda function on a specific event, I could also set up a schedule as seen below. A schedule allows me to be sure my Lambda function gets invoked without having to depend on a certain event firing.
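If you'd like to script a schedule instead of clicking through the UI, here's a hedged sketch. The rate_expression helper is my own and builds the rate(...) string that scheduled rules accept. The create_schedule function assumes boto3, and the rule name and Lambda ARN you pass in are placeholders.

```python
def rate_expression(value, unit):
    """Build a rate expression such as 'rate(5 minutes)'.

    The unit must be singular when the value is 1 and plural otherwise.
    """
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour', or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


def create_schedule(rule_name, lambda_arn, expression):
    """Create the scheduled rule and point it at a Lambda function.

    Requires boto3 and AWS credentials. The function also needs a
    permission that lets events.amazonaws.com invoke it (not shown here).
    """
    import boto3  # imported lazily so rate_expression works without boto3

    events = boto3.client("events")
    events.put_rule(Name=rule_name, ScheduleExpression=expression)
    events.put_targets(Rule=rule_name, Targets=[{"Id": "1", "Arn": lambda_arn}])


if __name__ == "__main__":
    print(rate_expression(5, "minute"))
```

Rate expressions cover the simple "every N minutes" case. For something like "noon every weekday" you'd reach for a cron(...) expression instead.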
CloudWatch isn't tied only to the AWS event bus. It can also display, summarize, and alert on text file logging activity. This feature allows users to install a logging agent on EC2 instances to send text file log information like Apache logs, get notified of operating system-specific events, or keep tabs on event logs. To do that, Amazon has an informative tutorial page on how to set up the logging agent.
Since I don't have any EC2 instances set up in my demo environment, I'm not able to set up the logging agent and show screenshots of the data gathered. If you'd like more information on using the Logs feature of CloudWatch, I encourage you to check out the Amazon CloudWatch Logs page for more details.
One of the coolest features of CloudWatch is its dashboards feature. Using Dashboards, users can create informational dashboards displaying lots of different data all in a single place. This is especially useful when a user needs to get a bird's eye view of the state of their infrastructure.
Creating a new dashboard consists of adding one or more widgets.
For this article, I'll add a line metric widget to give me an overview of various metrics over time. Once I add the widget, I'm presented with a screen exactly like the metric screen we were looking at earlier. This time, however, we're able to combine this graph with another source of information in the dashboard.
By clicking on Add Widget, I'm able to add multiple different types of widgets including the aforementioned line graph, stacked area graphs, numbers to see the last value of a metric, and even a free text widget that allows users to display text in their dashboard.
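Dashboards can also be described as JSON, which is handy if you want to keep them in source control. Here's a minimal sketch that assembles a dashboard body with a line graph, a number widget, and a text widget. The helper names, the region, the instance ID, and the widget sizes are my own choices for illustration.

```python
import json


def metric_widget(title, metrics, x, y, view="timeSeries", stacked=False,
                  region="us-east-1"):
    """A metric widget: 'timeSeries' draws a graph, 'singleValue' shows a number."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 12, "height": 6,  # the grid is 24 columns wide
        "properties": {
            "title": title,
            "metrics": metrics,
            "view": view,
            "stacked": stacked,  # True gives a stacked area graph
            "stat": "Average",
            "period": 300,
            "region": region,
        },
    }


def text_widget(markdown, x, y):
    """A free text widget; the content is written in Markdown."""
    return {"type": "text", "x": x, "y": y, "width": 24, "height": 2,
            "properties": {"markdown": markdown}}


def dashboard_body(widgets):
    """Serialize widgets into the JSON string a dashboard is defined by."""
    return json.dumps({"widgets": widgets})


if __name__ == "__main__":
    cpu = [["AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0"]]
    body = dashboard_body([
        text_widget("# Demo dashboard", 0, 0),
        metric_widget("CPU over time", cpu, 0, 2),
        metric_widget("CPU right now", cpu, 12, 2, view="singleValue"),
    ])
    print(body)
    # With boto3 and credentials, this string could be passed along like so:
    # boto3.client("cloudwatch").put_dashboard(DashboardName="demo", DashboardBody=body)
```

Each widget carries its own position and size, so rearranging the dashboard is just a matter of changing x, y, width, and height.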
Just like we can with metrics, we're able to adjust the refresh interval for all of the widget data on the right-hand side. We're also able to manage dashboards from the Actions drop-down and perform tasks like saving, renaming, and deleting them.
Amazon CloudWatch is a great tool for more than just up/down alerts and performance monitoring. With its alarms and events features, it is an all-in-one solution for monitoring AWS infrastructure. How AWS has pushed many services onto the standard AWS event bus is pretty smart. This allows CloudWatch to subscribe to a single stream and tap into all of the information flowing across it.
The inclusion of the log data source is helpful, too. By installing the logging agent on various instances in your environment, you're able to bring together the data on the event bus and the contents of otherwise obscure log files. Getting started with Amazon CloudWatch is a straightforward process and one that I encourage you to test out for yourself.
If you're using custom scripts or third-party monitoring tools, CloudWatch is a viable alternative that's worth checking out.