Anomaly Detection

Overview

CloudWatch Logs Anomaly Detection uses machine learning and pattern recognition to detect anomalies in log events. It builds a baseline from past log patterns and flags log activity that deviates from that baseline.

Key Features:

Anomaly Detection Process:
- Trains on the past two weeks of log events in the log group; initial training can take up to 15 minutes.
- After training, analyzes incoming logs continuously and displays any anomalies in the CloudWatch Logs console.
- Automatically extracts patterns by separating static text from dynamic content (e.g., timestamps or request IDs).
Pattern Recognition & Tokens:
- Log patterns are expressed using tokens; dynamic tokens, such as varying request IDs, are represented as <*>.
- Example Pattern:
  <*> <*> [INFO] Calling DynamoDB to store for resource id <*>
- Patterns simplify the analysis of large log sets, compressing many log events into a few patterns.
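
To make the token idea concrete, here is a minimal sketch of how several log events can collapse into one pattern. It is not the algorithm CloudWatch Logs uses: it assumes whitespace tokenization and groups events only by token count, and the sample events and the `extract_patterns` helper are hypothetical.

```python
from collections import defaultdict

DYNAMIC = "<*>"  # placeholder for a token whose value varies between events


def extract_patterns(events):
    """Return {pattern: event_count} for a list of log lines.

    Simplifying assumption: events with the same number of
    whitespace-separated tokens belong to the same pattern.
    """
    groups = defaultdict(list)
    for event in events:
        tokens = event.split()
        groups[len(tokens)].append(tokens)

    patterns = {}
    for token_lists in groups.values():
        # A position stays static only if every event in the group agrees on it.
        merged = [
            column[0] if len(set(column)) == 1 else DYNAMIC
            for column in zip(*token_lists)
        ]
        patterns[" ".join(merged)] = len(token_lists)
    return patterns


# Hypothetical sample events, made up for illustration.
events = [
    "2023-01-01 19:00:01 [INFO] Calling DynamoDB to store for resource id 12342342k124-12345",
    "2023-01-01 19:00:02 [INFO] Calling DynamoDB to store for resource id 324892398123-12345",
    "2023-01-02 08:15:40 [INFO] Calling DynamoDB to store for resource id 3ff231242342-12345",
]
patterns = extract_patterns(events)
for pattern, count in patterns.items():
    print(count, pattern)
```

With these three events the date, time, and resource id all vary, so they collapse into the single pattern shown in the example above.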
Anomaly Visibility & Suppression:
- The default visibility period for an anomaly is 21 days (adjustable). If an anomaly keeps occurring after this period, it is accepted as normal behavior and is no longer flagged.
- Users can suppress anomalies temporarily or permanently to prevent unnecessary alerts. Suppression can apply to individual anomalies or entire patterns.
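
The interaction between the visibility period and suppression can be sketched as a small model. This is only an illustration of the rules described above, not the service's implementation: the `Anomaly` class, its field names, and the choice to count the visibility period from the first occurrence are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_VISIBILITY_DAYS = 21  # default stated above; adjustable per detector


@dataclass
class Anomaly:
    """Hypothetical record of one anomaly (field names are illustrative)."""
    first_seen: datetime
    suppressed_until: Optional[datetime] = None  # temporary suppression
    suppressed_forever: bool = False             # permanent suppression


def is_shown(anomaly, now, visibility_days=DEFAULT_VISIBILITY_DAYS):
    """Return True if the anomaly should still be surfaced at time `now`."""
    if anomaly.suppressed_forever:
        return False
    if anomaly.suppressed_until is not None and now < anomaly.suppressed_until:
        return False
    # Assumption: the visibility period is counted from the first occurrence.
    return now - anomaly.first_seen <= timedelta(days=visibility_days)


start = datetime(2024, 1, 1)
plain = Anomaly(first_seen=start)
paused = Anomaly(first_seen=start, suppressed_until=datetime(2024, 1, 5))

print(is_shown(plain, datetime(2024, 1, 10)))   # inside the 21-day window
print(is_shown(plain, datetime(2024, 2, 1)))    # 31 days later, window elapsed
print(is_shown(paused, datetime(2024, 1, 3)))   # temporarily suppressed
```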
Severity & Priority:
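- Severity is assigned to a pattern and is based only on keywords such as FATAL, ERROR, and WARN; patterns without such keywords have a severity of NONE.
- Priority is assigned to an anomaly and considers both the pattern's severity and the amount of deviation from expected values. For example, a 500% spike in a token value could trigger a HIGH priority anomaly even when the severity is NONE.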
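
The split between severity and priority can be illustrated with a short sketch. The keyword list comes from the text above; everything else is an assumption made for illustration, including the function names, the `MEDIUM` and `LOW` labels, and the numeric thresholds. The actual priority computation is not documented here.

```python
SEVERITY_KEYWORDS = ("FATAL", "ERROR", "WARN")  # keywords named in the text above


def pattern_severity(pattern):
    """Return the first severity keyword found in the pattern, else NONE.

    Uses a plain substring match, so "WARNING" also counts as WARN.
    """
    for keyword in SEVERITY_KEYWORDS:
        if keyword in pattern:
            return keyword
    return "NONE"


def anomaly_priority(severity, percent_change):
    """Hypothetical scoring that combines severity with deviation.

    The thresholds (500 and 100 percent) are illustrative assumptions.
    """
    deviation = abs(percent_change)
    if severity == "FATAL" or deviation >= 500:
        return "HIGH"
    if severity in ("ERROR", "WARN") or deviation >= 100:
        return "MEDIUM"
    return "LOW"


info_pattern = "<*> <*> [INFO] Calling DynamoDB to store for resource id <*>"
severity = pattern_severity(info_pattern)
print(severity, anomaly_priority(severity, percent_change=500))
```

In this sketch a pattern with no severity keyword still produces a high-priority anomaly when the deviation is large enough, which mirrors the 500% example above.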
Suitable Log Types:
- Best suited for application logs in which most entries follow typical patterns and include a log level (e.g., INFO, ERROR).
- Less effective for network or access logs (e.g., VPC flow logs) and long JSON logs (e.g., CloudTrail logs). Pattern analysis covers only the first 1500 characters of each log event, so anything beyond that limit is ignored.
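
The 1500-character limit explains why long JSON events are a poor fit: the interesting field may sit past the cutoff. The sketch below demonstrates this with a made-up JSON record (not a real CloudTrail event); `analyzed_portion` is a hypothetical helper that only mimics the truncation.

```python
import json

ANALYSIS_LIMIT = 1500  # characters per log event, as stated above


def analyzed_portion(event):
    """Return the part of a log event that pattern analysis would see."""
    return event[:ANALYSIS_LIMIT]


# A short application log line fits entirely within the limit.
app_line = "2023-01-01 19:00:01 [ERROR] Timeout calling DynamoDB for resource id 42"

# Made-up long JSON record: the error code appears after a large payload.
long_json = json.dumps({
    "eventName": "PutObject",
    "requestParameters": {"padding": "x" * 4000},
    "errorCode": "AccessDenied",
})

print(len(app_line), "ERROR" in analyzed_portion(app_line))
print(len(long_json), "AccessDenied" in analyzed_portion(long_json))
```

The application line is analyzed in full, while the error code in the long JSON record falls outside the analyzed portion and cannot contribute to any pattern.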
Privacy & Data Usage:
- Anomaly detection models are trained per log group and are specific to your account; AWS does not use customer logs to train models for other customers.

Examples of Flagged Anomalies:

- Unseen Patterns: A log event with a new structure.
- Pattern Deviations: Unexpected changes in a known pattern.
- New Token Values: A dynamic token value outside its usual set.
- Sudden Spikes: A sharp increase in token occurrences (e.g., a surge in HTTP 200 responses).
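
Three of these anomaly types can be sketched as simple comparisons against a baseline. This is a toy model, not the service's detection logic: it assumes each pattern has a single dynamic token whose value counts are tracked, the `spike_factor` threshold and the finding labels are invented for illustration, and pattern deviations are left out because they need a richer model.

```python
from collections import Counter


def flag_anomalies(baseline, current, spike_factor=5.0):
    """Compare current token counts per pattern against a baseline.

    Both arguments map pattern -> Counter of dynamic-token values.
    Returns a list of (kind, pattern, token_value) tuples.
    """
    findings = []
    for pattern, counts in current.items():
        if pattern not in baseline:
            findings.append(("UNSEEN_PATTERN", pattern, None))
            continue
        known = baseline[pattern]
        for value, count in counts.items():
            if value not in known:
                findings.append(("NEW_TOKEN_VALUE", pattern, value))
            elif count >= spike_factor * known[value]:
                findings.append(("SPIKE", pattern, value))
    return findings


# Hypothetical counts for illustration.
status_pattern = "Request finished with status <*>"
baseline = {status_pattern: Counter({"200": 100, "404": 5})}
current = {
    status_pattern: Counter({"200": 600, "404": 4, "503": 2}),
    "Connection reset by peer <*>": Counter(),
}

findings = flag_anomalies(baseline, current)
for finding in findings:
    print(finding)
```

Note that the surge in 200 responses is flagged even though it is not an error, which is why a flagged anomaly should be treated as a prompt for review rather than proof of a problem.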


Last updated 8 months ago
