
Logging and Monitoring


Logging options in AWS


Amazon CloudWatch

CloudWatch provides centralized logging and metrics for resources and applications.

Main components

Getting started with CloudWatch

CloudWatch Dashboards

We can build and customize a CloudWatch dashboard using different visual widgets that display metrics and alarms relating to your resources, forming a unified view. Dashboards can be created through any of the methods described below.

These dashboards can then be viewed from within the AWS Management Console. The resources within your customized dashboard can come from multiple different regions, making this a very useful feature.

Once you have built your dashboards, you can easily share them with other users, even those who do not have access to your AWS account. This lets you share the findings gathered by CloudWatch with people who benefit from them in their day-to-day operational role but don't necessarily need access to your AWS account.

Dashboards and Widgets

There are two ways that you can create a dashboard:

Both methods allow you to pick from a number of different media types called widgets. There are currently eight flavors of these widgets, and they are as follows:

Line charts - A line chart is a type of chart which displays information as a series of data points connected by straight line segments. It is a basic type of chart common in many fields.

Stacked area chart - This type of chart compares the totals of many different subjects within the same graph.

Number widget - Allows you to instantly see the value of a particular metric that you're interested in - this could be as simple as displaying the current number of online instances.

Bar Charts - Compares values of multiple types of data within the same graph.

Pie charts - Proportional data in direct relationship to other information fitted within a circle.

Text widget - Free text with markdown formatting allowing you to add useful information to your dashboards as you see fit

Log tables - Explore results from Logs Insights, which enables you to interactively search and analyze your log data in Amazon CloudWatch.

Alarm statuses - Display the status of an alarm directly on the dashboard, so you know immediately if something is going wrong.

Dashboard Features

One extremely cool feature of CloudWatch dashboards is that they allow you to perform math on the metrics you want to display. So if you want to see how a graphed metric looks after applying normalization techniques or filters to your data, you have the power to do so.

Additionally, dashboards allow you to aggregate data across multiple sources, like an Auto Scaling group. For example, if you were interested in seeing how CPU load trends over time across your entire fleet, you could create a dashboard that displays exactly that.
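
As a hedged illustration of the same metric math capability from the CLI (the namespace and metric names below are hypothetical placeholders, not taken from this article), get-metric-data accepts an Expression entry that combines other query results:

aws cloudwatch get-metric-data \
--start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T01:00:00Z \
--metric-data-queries '[
  {"Id": "errors",   "ReturnData": false, "MetricStat": {"Metric": {"Namespace": "MyApp", "MetricName": "Errors"},   "Period": 300, "Stat": "Sum"}},
  {"Id": "requests", "ReturnData": false, "MetricStat": {"Metric": {"Namespace": "MyApp", "MetricName": "Requests"}, "Period": 300, "Stat": "Sum"}},
  {"Id": "errorRate", "Expression": "100 * errors / requests", "Label": "Error rate (%)"}
]'

The same kind of expression can be placed in a dashboard metric widget, so the calculated series is graphed alongside the raw metrics.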

Create visually

It is fairly painless to create dashboards with the visual dashboard creation tools provided by AWS within the CloudWatch console. Creating dashboards in the editor is as simple as dragging and dropping new widgets onto a blank canvas. The editor allows you to pick any of the previously mentioned widget types and place them where you please. Pieces can be rearranged and positioned with as much fine control as you desire, and all widgets have a stretchable window view that you can resize to specific dimensions.

Create programmatically

Dashboards can also be written as code, giving you programmatic access to all the same information and tools. This means you can also put these code snippets inside CloudFormation templates for easy dashboard creation on new accounts or projects. Creating these codified dashboards, however, is not as easy as it may sound at first; there is a fair amount of work involved in testing and making sure your creation functions well.

Your dashboard code is written as a JSON string and can include anywhere between 0 and 100 separate widget objects. You have to specify the x,y location of your widgets as well as the width and height of each element. That can be a little tedious to set up for the first time, but if you already have a functional blueprint, you can modify it fairly easily.
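
A minimal sketch of what that JSON body can look like; the dashboard name, instance ID, and region here are placeholders, and the x, y, width, and height fields are the coordinates mentioned above:

{
  "widgets": [
    {
      "type": "metric",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "EC2 CPU",
        "region": "us-east-1",
        "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890abcdef0"]],
        "stat": "Average",
        "period": 300
      }
    }
  ]
}

The body can then be pushed from the CLI:

aws cloudwatch put-dashboard --dashboard-name ops-overview --dashboard-body file://dashboard.json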

Annotations

While you're building your charts, or after you have completed them, you have the ability to add annotations to your graphs. This is helpful for marking when a certain event took place in the past, which can give other members of your team insight into particular peaks and valleys in your data. Just as writing good code requires comments, it's important to make sure your graphs and charts have that same advantage.

You can have both horizontal and vertical annotations in your graphs - each having their own purpose. For example, horizontal annotations can denote reasonable top and bottom bounds for a service’s CPU load while vertical annotations are great for noting when a specific event happened in the past.
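
As a hedged sketch, annotations are declared inside a metric widget's properties in the dashboard JSON; the labels, threshold values, and timestamp below are placeholders:

"annotations": {
  "horizontal": [
    { "label": "CPU upper bound", "value": 80 },
    { "label": "CPU lower bound", "value": 10 }
  ],
  "vertical": [
    { "label": "v2.3 deployment", "value": "2024-03-01T14:00:00.000Z" }
  ]
}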

Linking Dashboards

You also have the ability to link to other dashboards within your own systems or even across accounts. These dashboards don’t have to be in the same region either. This is a very powerful tool that helps to centralize operations teams, DevOps, and other service owners who all need to have visibility into the status of your applications.

In order to allow cross-account and cross-region access, you need to enable it within the CloudWatch settings of your account as well as each of the accounts you wish to connect to. You can then link your accounts together to share CloudWatch data between them. These settings can also be activated through the AWS SDK and CLI.

A few options for sharing:

Limits

CloudWatch Dashboards allow you to have up to three dashboards - each containing up to 50 metrics at no charge. This is more than enough for anyone just practicing or having a few applications they want to monitor. For any more than that however, you will be charged $3 per month per new dashboard you wish to create.

For an enterprise company, that is not too much to spend. However, if you are a solo developer or a small shop just starting off, those little 3 dollar charges can add up. So make sure you use your resources appropriately when building dashboards for your services.

Best Practices

CloudWatch Metrics

Metrics are a key component and fundamental to the success of Amazon CloudWatch. They enable you to monitor a specific element of an application or resource over a period of time while tracking these data points. Examples of metrics include:

By default when working with Amazon CloudWatch, everyone has access to a free set of metrics, and for EC2 these are collected over a time period of 5 minutes. However, for a small fee, you can enable detailed monitoring, which gives you deeper insight by collecting data for the metrics every minute.

In addition to detailed monitoring, you can also create your own custom metrics for your applications, using any time-series data points that you need. Be aware that metrics are regional, meaning that any metrics created in one region will not be available in another.
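
As a hedged example of publishing such a custom metric from the CLI (the namespace, metric name, and dimension are hypothetical):

aws cloudwatch put-metric-data --namespace MyApp --metric-name PageLoadTime --unit Milliseconds --value 87 --dimensions Page=/checkout

The metric appears only in the region the call was made against, matching the regional behavior described above.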

Anomaly Detection

CloudWatch metrics also allow you to enable a feature known as anomaly detection. This allows CloudWatch to implement machine learning algorithms against your metric data to help detect any activity that sits outside of the normal baseline parameters that are generally expected. Advance warning of this can help you detect an issue long before it becomes a production problem.

CloudWatch Alarms

Amazon CloudWatch Alarms tightly integrate with Metrics and they allow you to implement automatic actions based on specific thresholds that you can configure relating to each metric. Examples include:

Alarm States

There are 3 different states for any alarm associated with a metric: OK, ALARM, and INSUFFICIENT_DATA.
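
A minimal hedged sketch of creating such an alarm from the CLI, assuming a placeholder instance ID and SNS topic for the alarm action:

aws cloudwatch put-metric-alarm \
--alarm-name cpu-high \
--namespace AWS/EC2 --metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--statistic Average --period 300 --evaluation-periods 2 \
--threshold 80 --comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:111111111111:ops-alerts

The alarm starts in INSUFFICIENT_DATA until enough data points have been evaluated, then moves between OK and ALARM as the threshold is breached or cleared.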

Integration

CloudWatch alarms are also easily integrated with your dashboards as well, allowing you to quickly and easily visualize the status of each alarm. When an alarm is triggered into a state of ALARM, it will turn red on your dashboard, giving a very obvious indication.

CloudWatch EventBridge

CloudWatch EventBridge is a feature that has evolved from an existing feature called Amazon CloudWatch Events. CloudWatch EventBridge provides a means of connecting your own applications to a variety of different targets, typically AWS services, allowing you to implement a level of real-time monitoring and respond to events that occur in your application as they happen.

But what is an event? Basically, an event is anything that causes a change to your environment or application.

Benefits of using CloudWatch EventBridge:

Rules

A rule acts as a filter for incoming streams of event traffic and then routes these events to the appropriate target defined within the rule. The rule itself can route traffic to multiple targets, however the target must be in the same region.

Targets

Targets are where the events are sent by the rules. All events received by the target are done in a JSON format. Here are a few targets that can be used as a destination for events:

For the latest list of targets, please see the relevant documentation here: https://docs.aws.amazon.com/eventbridge/latest/userguide/eventbridge-targets.html
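
A hedged sketch of defining a rule and attaching a target from the CLI; the rule name, event pattern, and SNS topic ARN are placeholders:

aws events put-rule \
--name ec2-state-change \
--event-pattern '{"source": ["aws.ec2"], "detail-type": ["EC2 Instance State-change Notification"]}'

aws events put-targets \
--rule ec2-state-change \
--targets "Id"="1","Arn"="arn:aws:sns:us-east-1:111111111111:ops-alerts"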

Event Buses

An Event Bus is the component that actually receives the Event from your applications and your rules are associated with a specific event bus. CloudWatch EventBridge uses a default Event bus that is used to receive events from AWS services, however, you are able to create your own Event Bus to capture events from your own applications.
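
A short hedged sketch of creating a custom event bus and associating a rule with it (the bus and rule names, and the source value, are hypothetical):

aws events create-event-bus --name my-app-bus

aws events put-rule --name my-app-orders --event-bus-name my-app-bus --event-pattern '{"source": ["my.app"]}'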

CloudWatch Logs

CloudWatch Logs gives you a centralized location to house all of your logs from different AWS services that provide logs as an output, such as CloudTrail, EC2, VPC Flow logs, etc, in addition to your own applications.

When log data is fed into CloudWatch Logs, you can utilize CloudWatch Logs Insights to search the log streams, and you can configure filters to match specific entries and actions that you need to be alerted on or respond to. This allows CloudWatch Logs to act as a central repository for real-time monitoring of log data.
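
As a hedged illustration of that filtering capability, a log group can be created and a metric filter attached so that matching entries feed a metric you can alarm on; the names and pattern below are hypothetical:

aws logs create-log-group --log-group-name /my-app/production

aws logs put-metric-filter \
--log-group-name /my-app/production \
--filter-name error-count \
--filter-pattern "ERROR" \
--metric-transformations metricName=ErrorCount,metricNamespace=MyApp,metricValue=1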

Unified CloudWatch Agent

This agent can be installed to collect logs and additional metric data from EC2 instances, as well as from on-premises servers running either a Linux or Windows operating system. This metric data is in addition to the default EC2 metrics that CloudWatch automatically configures for you.
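
A minimal sketch of the agent's JSON configuration, assuming a Linux host and a hypothetical application log file; the real configuration file supports many more collection options:

{
  "agent": { "metrics_collection_interval": 60 },
  "metrics": {
    "metrics_collected": {
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          { "file_path": "/var/log/my-app.log", "log_group_name": "/my-app/production", "log_stream_name": "{instance_id}" }
        ]
      }
    }
  }
}

On EC2, a configuration like this is typically loaded and started with the amazon-cloudwatch-agent-ctl helper that ships with the agent.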

CloudWatch Insights

CloudWatch Insights provide the ability to get more information from the data that CloudWatch is collecting. There are currently three different types of insights within CloudWatch:

Log Insights

This is a feature that can analyze your logs that are captured by CloudWatch Logs at scale in seconds using interactive queries delivering visualizations that can be represented as:

The versatility of this feature allows you to work with any log file formats that AWS services or your applications might be using.

Using this flexible approach, you can use Logs Insights to filter your log data and retrieve specific entries, allowing you to gather the insights you are interested in and display them visually.
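
A hedged sketch of running a Logs Insights query from the CLI, reusing the hypothetical log group from earlier; the start and end times are epoch seconds:

aws logs start-query \
--log-group-name /my-app/production \
--start-time 1704067200 --end-time 1704070800 \
--query-string 'fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 20'

aws logs get-query-results --query-id <query-id>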

Container Insights

Much like Logs Insights, Container Insights allows you to collate and group metric data from different container services and applications within AWS, for example the Amazon Elastic Kubernetes Service (EKS) and the Elastic Container Service (ECS).

Lambda Insights

This feature provides you the opportunity to gain a deeper understanding of your applications using AWS Lambda. It gathers and aggregates system and diagnostic metrics related to AWS Lambda to help you monitor and troubleshoot your serverless applications.

To enable Lambda Insights, you need to enable the feature per Lambda function, within the Monitoring tools section of your function configuration.

This ensures that a CloudWatch extension is enabled for your function allowing it to collate system-level metrics which are recorded every time the function is invoked.
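
The same thing can be done outside the console by attaching the Lambda Insights extension layer to the function; a hedged sketch, where the function name is a placeholder and the layer version varies by region (check the Lambda Insights documentation for the current ARN):

aws lambda update-function-configuration \
--function-name my-function \
--layers arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:21

The function's execution role also needs permission to ship the metrics; AWS provides the CloudWatchLambdaInsightsExecutionRolePolicy managed policy for this.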


AWS CloudTrail

CloudTrail automatically records user activity and delivers those logs to you.

Who did what and when

Log File Integrity

We can verify that log files have remained unchanged since CloudTrail delivered them to the S3 bucket.

Note that verification of the log file integrity can only be achieved via programmatic access and not through the console. This can be done through AWS CLI:

aws cloudtrail validate-logs --trail-arn <trailARN> --start-time <start-time>

We can also add additional parameters:

aws cloudtrail validate-logs --trail-arn <trailARN> --start-time <start-time> \
--end-time <end-time> \
--s3-bucket <bucket-name> \
--s3-prefix <prefix> \
--verbose 

Digest file folder structure in the S3 bucket:

S3-bucket-name/AWSLogs/accountID/CloudTrail-Digest/Region/digest-end-year/digest-end-month/digest-end-date/

CloudTrail Process Flow

  1. Create a Trail.
  2. Specify an S3 bucket for log storage.
  3. Optional - Encrypt log files with KMS.
  4. Optional - Notifications of new log files via SNS.
  5. Optional - Enable log file validation.
  6. Once the trail is created, we can add configuration changes.
  7. Optional - Deliver CloudTrail logs to CloudWatch for monitoring.
  8. Optional - Configure Event Selector for Management/Data
  9. Optional - Add any required tags.
  10. Configuration is complete.

Once data is captured, we can find particular events quickly through the use of API Activity Filters.
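
A hedged CLI sketch of the core of that process flow, creating a multi-region trail with log file validation enabled and then turning logging on; the trail and bucket names are placeholders:

aws cloudtrail create-trail \
--name management-events \
--s3-bucket-name my-cloudtrail-bucket \
--is-multi-region-trail \
--enable-log-file-validation

aws cloudtrail start-logging --name management-events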

Lifecycle of an API call in CloudTrail

  1. IAM user or service calls an API.
  2. CloudTrail checks if the API call matches any configured trail.
  3. If a match is found, API call is recorded as an event on the log file.
  4. Event on log file can be delivered to an S3 bucket or CloudWatch Logs.
  5. In the S3 bucket, log files are stored and encrypted by default with SSE-S3 unless KMS is configured.
  6. If lifecycle rules are configured, log files may be stored on a different storage class or AWS Glacier.

CloudTrail Permissions

Currently there are two AWS Managed policies for CloudTrail:

Custom permissions can be created by creating a new IAM policy and applying only some of the permissions instead of providing full access to CloudTrail.

KMS adds another layer of encryption to log files, in addition to the default SSE-S3 encryption. If the logs in the S3 bucket have been encrypted using KMS, specific permissions are needed to decrypt the logs:

Note that the KMS key and the bucket need to be in the same region.

CloudTrail Logs

CloudTrail Trails

Without a Trail, AWS CloudTrail is unable to capture API calls.

CloudTrail Log Files

Log files are written in JSON format, and new log files are created every 5 minutes.

Log file naming convention:

AccountID_CloudTrail_RegionName_YYYYMMDDTHHmmZ_UniqueString.FilenameFormat 

As for the S3 bucket where the log files are stored, the log file objects also follow a standard structure:

BucketName/prefix/AWSLogs/AccountID/CloudTrail/RegionName/YYYY/MM/DD 

Log Aggregation to a Single Account

Logs from multiple accounts can be aggregated to a single S3 bucket in one of the accounts.

  1. Configure a new Trail in your primary AWS account.
  2. Apply permissions to the S3 bucket allowing cross-account access (a fuller example policy is sketched after these steps).
  3. Edit the resource attribute of bucket policy and add the accounts that need access to the bucket.

     "Resource": {
         "arn:aws:s3:::bucket-name/[optional]logFilePrefix/AWSLogs/111111111111",
         "arn:aws:s3:::bucket-name/[optional]logFilePrefix/AWSLogs/222222222222",
         "arn:aws:s3:::bucket-name/[optional]logFilePrefix/AWSLogs/333333333333"
     }
    
  4. Create a new trail in the secondary AWS account and configure it to use the S3 bucket from the primary account.

  5. Once the trail is created, logs will be delivered to the same S3 bucket in your primary account.
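
Expanding on steps 2 and 3, a hedged sketch of the full bucket policy for cross-account delivery; the bucket name and account IDs are placeholders and the optional log file prefix has been omitted:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AWSCloudTrailAclCheck",
      "Effect": "Allow",
      "Principal": { "Service": "cloudtrail.amazonaws.com" },
      "Action": "s3:GetBucketAcl",
      "Resource": "arn:aws:s3:::bucket-name"
    },
    {
      "Sid": "AWSCloudTrailWrite",
      "Effect": "Allow",
      "Principal": { "Service": "cloudtrail.amazonaws.com" },
      "Action": "s3:PutObject",
      "Resource": [
        "arn:aws:s3:::bucket-name/AWSLogs/111111111111/*",
        "arn:aws:s3:::bucket-name/AWSLogs/222222222222/*",
        "arn:aws:s3:::bucket-name/AWSLogs/333333333333/*"
      ],
      "Condition": { "StringEquals": { "s3:x-amz-acl": "bucket-owner-full-control" } }
    }
  ]
}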

Accessing Cross-Account Log Files

For users/administrators in the secondary accounts to access the log files that are aggregated to the S3 bucket in the primary account, we need to configure a few elements in IAM:

  1. In the primary account, create IAM roles for each of the AWS accounts.
  2. Assign an access policy to each role that allows access only from a specific account (an example policy is sketched after these steps).
  3. Users in the requesting account will then assume the role corresponding to their AWS account in order to access their log files.
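
A hedged sketch of the policy attached to the role for account 222222222222, granting read access only to that account's prefix; the bucket name is a placeholder:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::bucket-name/AWSLogs/222222222222/*"
    }
  ]
}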

Monitoring

Common monitoring use-cases:

CloudTrail + CloudWatch Process:

  1. Log file sent to S3 and CloudWatch log group (if configured)
  2. CloudTrail assumes Role with permission to run two CloudWatch APIs:

    • CreateLogStream
    • PutLogEvents

Default IAM role created by CloudTrail:

CloudTrail_CloudWatchLogs_Role
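
A hedged sketch of the policy that role typically carries, granting just the two APIs listed above; the region, account ID, and log group name are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:us-east-1:111111111111:log-group:CloudTrail/DefaultLogGroup:*"
    }
  ]
}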

CloudWatch Configuration:

Similarities with other AWS services


AWS Config

Common resource management questions:

AWS Config is designed to record and capture resource changes within your environment, allowing you to perform a number of actions against the data that helps to find answers to the questions that we highlighted previously. Main features include:

AWS Config is region-specific, which means that if you have resources in multiple regions, you will have to configure AWS Config for each region you want to record resource changes for. When doing so, you are able to specify different options for each region.

For services that are not region-specific such as IAM, there is also an option to record global-scoped resources.

Use Cases

Security Compliance. AWS Config can be a great tool when enforcing strict compliance against specific security controls.

Discovery of Resources. When you first activate AWS Config, or run the configuration recorder, AWS Config will discover all supported resource types, allowing you to view them from within the AWS Config dashboard.

Audit Compliance. As well as using AWS Config to stay compliant with internal security standards, there are also many external audit and governance controls where the service can enforce specific controls on resources to maintain compliance.

These programs require strict controls in many different areas. Being able to put custom and managed Config rules in place helps you adhere to these external governance controls. In addition to this, you could show the auditors all of your configuration history files, which allows them to go back to any point in time to check the configuration of any of your supported resources.

Resource Change Management. When planning changes within your infrastructure, it's often required that you have an understanding of what effect the change will have on other resources.

Troubleshooting and Problem Management. AWS Config is a great tool to help you troubleshoot issues that may arise within your environment.

Key Components

The following identifies the main components to the AWS Config service:

AWS resources

These are objects that can be created, updated, and deleted from within the Management console or programmatically through the AWS CLI or SDKs.

Configuration Items

A configuration item, or CI, is a JSON file that holds the configuration information, relationship information, and other metadata as a point-in-time snapshot view of a supported resource.

Sections of a Configuration Item:

Configuration Streams

When new CIs are created, they are sent to a Configuration Stream, which takes the form of an SNS topic. This stream is used for events like:

The SNS topic can have different notification endpoints:

Configuration History

The configuration history uses configuration items to collate and produce a history of changes to a particular resource. This allows you to see the complete set of changes made to a resource over a set period of time.

The information can be accessed via the AWS CLI using the following command:

aws configservice get-resource-config-history --resource-type <resource-type> --resource-id <resource-id>

This can also be accessed via the AWS Management Console. A configuration history file for each resource type is sent to an S3 bucket that is selected during the setup of AWS Config.

Configuration Snapshots

The configuration snapshot takes a point-in-time snapshot of all supported resources configured for that region. It will then generate CIs for each resource in your AWS account for a specific region, and this configuration snapshot can then be sent to an S3 bucket. Alternatively, this information can be viewed via the AWS Management Console.
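
A hedged sketch of requesting such a snapshot from the CLI, assuming the delivery channel keeps its default name:

aws configservice deliver-config-snapshot --delivery-channel-name default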

Configuration Recorder

This can be seen as the engine of the service as it is responsible for recording all of the changes to the supported resources and generating the configuration items.

Config Rules

AWS Config rules enforce specific compliance checks and controls across your resources, and allow you to adopt an ideal deployment specification for each of your resource types.

It’s important to note that marking a resource as non-compliant does not mean the resource will be taken out of service or it will stop working. It will continue to operate exactly as it is with its new configuration.

AWS Config simply alerts you that there is a violation, and it’s up to you to take the appropriate action.

Examples of predefined rules that AWS have created:
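
As a hedged sketch, one of those predefined (managed) rules can be enabled from the CLI and its compliance then queried; ENCRYPTED_VOLUMES is an AWS managed rule identifier that checks whether attached EBS volumes are encrypted, and the rule name used here is a placeholder:

aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "encrypted-volumes",
  "Source": { "Owner": "AWS", "SourceIdentifier": "ENCRYPTED_VOLUMES" }
}'

aws configservice describe-compliance-by-config-rule --config-rule-names encrypted-volumes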

When creating or modifying rules:

Resource Relationships

AWS Config identifies relationships with other resources from a specific resource. As an example, it might be the EC2 instance that the volume is attached to.

SNS Topics

An SNS topic is used as a configuration stream for notifications of various events triggered by AWS Config. You can have various endpoints associated with the SNS stream. Best practice suggests using SQS as the endpoint and then programmatically analyzing the results from the queue.

S3 Bucket

The S3 bucket that was selected at the time of configuration is used to store all the configuration history files that are generated for each resource type, which happens every six hours. Also, any configuration snapshots that are taken are also stored within the same S3 bucket.

The configuration details used for both SNS and S3 are classed as the AWS Config delivery channel by which data can be sent to other services.

AWS Config Permissions

When setting up AWS Config, you’re required to select an IAM role. This role is required to allow AWS Config to obtain the correct permissions to carry out and perform a number of functions.

For example, AWS Config will need read-only access to all the supported resources within your account so it can retrieve data for the configuration items. Also, we now know that AWS Config uses SNS and S3 both for streams and storage of the configuration history files and snapshots. So AWS Config requires the relevant permission to allow it to send data to these services.


AWS Inspector

It is an automated security assessment service that can help you improve the security and compliance of applications deployed in AWS.

Assessments are based on best practices and known security weaknesses covering:

Agent Based

Amazon Inspector requires software agents to be installed on any EC2 instance that you want to assess. This makes it an easy service to configure and add, at any point, to existing resources already running within your AWS infrastructure. This helps Amazon Inspector integrate seamlessly with any of your existing security processes and procedures as another level of security.

Types of assessments

Assessment report sample

How to get started

Key Components

Amazon Inspector has the following components and elements:

Amazon Inspector Role

When you first start using Amazon Inspector, you are required to create or select a role to allow Amazon Inspector to have read-only access to all of your EC2 instances. Without this role, the service would not have the relevant permissions to be able to gather the telemetry data of the instance during any assessment runs.

If you allow Amazon Inspector to create a role, then it will have a policy attached as detailed below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
} 

This allows the role to have read-only access to all EC2 instances within your AWS account. For more information on IAM and IAM roles, please see our existing course here.

Assessment Targets

An Assessment Target is a grouping of AWS EC2 instances that you want to run an assessment against. This grouping of EC2 instances is managed and defined by the tags that are associated with your EC2 instances. Tagging is simply a way of adding metadata to your instances, consisting of a key-value pair, to help with management and organization.

When creating an assessment target, you are asked to select which keys from your tags that you would like to include within your Assessment Target. You can also refine your selection even further by providing the values for each of those keys, too.

The EC2 instances are not required to contain both keys to be included within this Assessment Target. Only a match on one key is necessary.

AWS Agents

AWS Agents are software agents that must be installed on the EC2 instances that you wish to monitor and run assessments on. Without this agent, Amazon Inspector would not be able to perform the analysis that it needs to.

Once installed, the agent will be able to track and monitor data across the network, file system, and any process activity of the instance. This data is then recorded as telemetry data and fed back to the Amazon Inspector service via the public endpoint of the service over a TLS-protected (Transport Layer Security) channel.

A regular heartbeat is sent from the agent to Inspector, which the Inspector service will respond to with instructions, such as to perform an assessment at a particular time.

As the Agent is software-based, it is necessary from time to time to update the agent with the latest version. These new updates are managed and automatically installed by AWS, and so you don’t need to worry about the latest Agent software version.

Assessment Templates

An assessment template defines a specific configuration as to how an assessment is run on your EC2 instances. These configurable items within the template include the following.

Once your assessment template is created, you are not able to modify it again to make changes. You can, however, tag your assessment templates to help with the organization and management of your assessment runs.

Rules Packages

When Amazon Inspector gathers telemetry during an assessment run, it compares this data against specific security rules to ascertain compliance. These rules are grouped together in what is known as a rules package. A rules package contains a number of individual rules that are each checked against the telemetry data sent back from the EC2 instance.

Each rule will also have an associated severity, which will be one of the following: High, Medium, Low, or Informational.

The rule packages themselves are split across four different categories, these being:

Common Vulnerabilities and Exposures. The CVE is a publicly-known reference list of security threats that are well-documented. The rules used within this package will check the Assessment Target for exposure to any known security holes that would compromise the integrity, confidentiality, and availability of your EC2 instance.

Should any findings from an assessment be made against a CVE, it's recommended you visit the CVE site and search for the specific vulnerability ID to gather additional detailed information to help you resolve and mitigate the issue. To check which CVEs the rules within the rules package assess against, you can visit the following link:

https://cve.mitre.org/ 

As new CVEs are found, they are added to this list by AWS, and the corresponding rules added to the rules package, preventing the need for you to stay up-to-date with the latest known security issues.

Center for Internet Security Benchmarks. These benchmarks are continuously refined and used as global standards for best practices for protecting data and IT resources. AWS is a CIS Benchmarks member company, and Amazon Inspector's associated certifications can be found here:

https://www.cisecurity.org/partner/amazon-web-services/ 

The rules within this rule package help to assess security for the various operating systems. If any findings are made against this rules package, then similarly to the CVE list, you can visit the provided link to download the detailed description, explanation, and advice on how to mitigate the security issue found.

Security Best Practices. This rules package looks for weaknesses in common security best practices. However, this only applies to Assessment Targets that are running the Linux operating system; at this stage, it's not possible to run this rules package against a target running the Windows OS. The following security checks are covered within this rules package.

Assessment Run

An assessment run can happen once you have configured your Amazon Inspector role, installed the agents, and configured your Assessment Targets and Assessment Templates. Once these components are in place, you are then able to run the configured assessment on your assessment targets. This process is known as the assessment run.

During this time, telemetry data will be sent back to Amazon Inspector and S3 to assess the data against the specified rules packages defined within the assessment template.
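
A hedged sketch of wiring those components together from the CLI for Amazon Inspector Classic; the tag, names, duration, and ARN placeholders are illustrative only:

aws inspector create-resource-group --resource-group-tags key=Environment,value=prod

aws inspector create-assessment-target \
--assessment-target-name prod-servers \
--resource-group-arn <resource-group-arn>

aws inspector create-assessment-template \
--assessment-target-arn <assessment-target-arn> \
--assessment-template-name weekly-scan \
--duration-in-seconds 3600 \
--rules-package-arns <rules-package-arn>

aws inspector start-assessment-run --assessment-template-arn <assessment-template-arn>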

Telemetry

Telemetry is data collected from an instance, detailing its configuration, behavior, and processes during an assessment run.

Once collected, the data is then sent back to Amazon Inspector in near-real-time over TLS where it is then stored and encrypted on S3 via an ephemeral KMS key. Amazon Inspector then accesses the S3 Bucket, decrypts the data in memory, and analyzes it against any rules packages used for that assessment to generate the findings.

After 30 days, this telemetry data is then deleted using a lifecycle policy attached to the dedicated Amazon Inspector S3 Bucket.

Assessment Reports

On completion of an assessment run, it is possible to generate an assessment report which provides details on what was assessed, and the results of that assessment.

As this feature was only released at the end of April 2017, it's only possible to generate these reports for assessment runs that were completed on or after the 25th of April, 2017. There are two different types of reports that you can generate:

Findings

Findings are generated from the results of an assessment run. A finding is a potential security issue or risk against one of your EC2 instances within the assessment target. For each finding, an explanation of the issue is given, along with guidance on how to remediate the problem.

Service Limitations

These are the service limitations per account. Note that you can raise a request to AWS to increase the limits.


Athena

Athena is a serverless interactive query service which makes it easy to search and analyze data in AWS S3 using SQL.

How to get started
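
A hedged sketch of running a query from the CLI; the table cloudtrail_logs is hypothetical (it would first have to be defined over your CloudTrail bucket) and the output location is a placeholder:

aws athena start-query-execution \
--query-string "SELECT eventname, count(*) AS calls FROM cloudtrail_logs GROUP BY eventname ORDER BY calls DESC LIMIT 10" \
--query-execution-context Database=default \
--result-configuration OutputLocation=s3://my-athena-results/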


AWS GuardDuty

It is an intelligent threat detection service that uses AI/Machine Learning to monitor one or more AWS accounts for malicious behavior.

Use-cases

How to get started
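
A hedged sketch of getting started from the CLI: enable a detector in the region, then list and retrieve its findings (the IDs below are placeholders):

aws guardduty create-detector --enable

aws guardduty list-findings --detector-id <detector-id>

aws guardduty get-findings --detector-id <detector-id> --finding-ids <finding-id>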


AWS Trusted Advisor

This AWS service provides guidance on how to provision resources following AWS best practices.

The type of AWS account support plan in place determines how many checks AWS Trusted Advisor will perform. All AWS accounts benefit from six Trusted Advisor checks, while accounts with Business or Enterprise support plans have access to over 50 Trusted Advisor checks. Business support plans start at $100 per month.

In the Trusted Advisor console, we can see the recommendations in each of the four categories checked by Trusted Advisor.

The six checks included without a support plan fall under the Performance and Security categories. Under each category, the number of checks that fall into each recommendation status category are shown. The recommendation statuses by color are:

We can also see the recommended actions (if there are any):

Trusted Advisor will automatically perform all of the checks without manual intervention. This feature is useful because we can trigger CloudWatch Events to send us emails when the status of a check changes. However, the intervals for each check vary greatly.

We can easily get the latest check results in the AWS Management Console by clicking the refresh all button.

Similarly, if we want to export a report with all check results at once, the download all results button is available.

Categories

Security Core checks:

Features

For every check that Trusted Advisor provides, you will see:

Security Groups - Specific Ports Unrestricted

The check looks for unrestricted access to ports on inbound traffic. Any unrestricted port is given a status according to the following rules: