CloudWatch Alarms
Overview
Amazon CloudWatch alarms allow you to monitor metrics or mathematical expressions and trigger actions based on threshold violations. Alarms help you automate responses, send notifications, and track resource health by evaluating metrics over time.
Types of CloudWatch Alarms:
Metric Alarms:
Watch a single metric or the result of a single metric math expression.
Can trigger actions such as SNS notifications, EC2 actions (stop, terminate, reboot, recover), Auto Scaling actions, and creating OpsItems or incidents in Systems Manager.
Composite Alarms:
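Combine the states of other metric or composite alarms using a rule expression.
Reduce alert noise by entering the ALARM state only when the rule's conditions are met, for example when several underlying alarms are in ALARM at the same time.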
Supports SNS notifications and Systems Manager actions, but not EC2 or Auto Scaling actions.
Alarm States:
OK: The metric or expression is within the defined threshold.
ALARM: The metric or expression is outside the defined threshold (above or below it, depending on the comparison operator).
INSUFFICIENT_DATA: Not enough data is available to determine the state.
Alarm Configuration Parameters:
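Period: Length of time over which the metric is aggregated into a single data point (e.g., 1 minute, 5 minutes).
Evaluation Periods: Number of the most recent periods (data points) considered when determining the alarm state.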
Datapoints to Alarm: Number of data points that must breach the threshold to trigger the alarm. This supports M out of N alarms (e.g., 2 out of 3 data points must breach the threshold).
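The three parameters above map onto the Period, EvaluationPeriods, and DatapointsToAlarm arguments of the PutMetricAlarm API. The following is a minimal sketch, assuming boto3 as the client library; the helper names, the alarm name, the instance ID, and the threshold are hypothetical values chosen for illustration.

```python
def build_metric_alarm_params(alarm_name, instance_id, threshold,
                              period=60, evaluation_periods=3,
                              datapoints_to_alarm=2):
    """Build keyword arguments for PutMetricAlarm (an M out of N alarm)."""
    if not 1 <= datapoints_to_alarm <= evaluation_periods:
        raise ValueError(
            "datapoints_to_alarm must be between 1 and evaluation_periods")
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold,
        "Period": period,                          # seconds per data point
        "EvaluationPeriods": evaluation_periods,   # N
        "DatapointsToAlarm": datapoints_to_alarm,  # M
    }


def evaluation_window_seconds(params):
    """Total time span covered by the N evaluated periods."""
    return params["Period"] * params["EvaluationPeriods"]


def create_alarm(params):
    """Send the request; needs boto3 and AWS credentials (not run here)."""
    import boto3
    boto3.client("cloudwatch").put_metric_alarm(**params)


# Hypothetical alarm: 2 out of 3 five-minute data points above 80% CPU.
params = build_metric_alarm_params("cpu-high", "i-0123456789abcdef0", 80.0,
                                   period=300)
print(evaluation_window_seconds(params))  # 900
```

With a 5-minute period and 3 evaluation periods, the alarm looks at the last 15 minutes of data and goes into ALARM when any 2 of those 3 data points breach the threshold.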
Alarm Actions:
SNS Notifications: Notify applications or users.
Lambda Functions: Automate custom actions.
EC2 Actions: Stop, terminate, reboot, or recover instances.
Auto Scaling: Scale resources up or down.
Systems Manager: Create OpsItems or incidents for management.
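Actions are attached to an alarm as a list of ARNs. As a rough sketch, assuming boto3-style PutMetricAlarm arguments, the helpers below build the ARN for an EC2 action and add action lists to an existing parameter dictionary; the helper names, the region, and the SNS topic ARN are hypothetical.

```python
EC2_ACTIONS = {"stop", "terminate", "reboot", "recover"}


def ec2_action_arn(region, action):
    """Return the ARN that tells CloudWatch to perform an EC2 action."""
    if action not in EC2_ACTIONS:
        raise ValueError(f"unsupported EC2 action: {action}")
    return f"arn:aws:automate:{region}:ec2:{action}"


def with_actions(params, alarm_actions, ok_actions=()):
    """Return a copy of PutMetricAlarm arguments with actions attached."""
    updated = dict(params)
    updated["ActionsEnabled"] = True
    updated["AlarmActions"] = list(alarm_actions)  # run on entering ALARM
    updated["OKActions"] = list(ok_actions)        # run on returning to OK
    return updated


# Hypothetical topic ARN; replace with a real topic in your account.
topic = "arn:aws:sns:us-east-1:123456789012:ops-alerts"
params = with_actions({"AlarmName": "cpu-high"},
                      alarm_actions=[topic, ec2_action_arn("us-east-1", "stop")],
                      ok_actions=[topic])
print(params["AlarmActions"][1])  # arn:aws:automate:us-east-1:ec2:stop
```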
Key Features:
High-Resolution Alarms: Supports periods of 10 or 30 seconds for high-frequency metrics.
Alarms on Math Expressions: Alarm on the result of a metric math expression that combines one or more metrics.
Cross-Account Alarms: Monitor metrics across AWS accounts (except for composite alarms).
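To illustrate an alarm on a math expression, the sketch below builds PutMetricAlarm arguments that use the Metrics list instead of a single MetricName. It assumes a hypothetical custom namespace MyApp with ErrorCount and RequestCount metrics; the helper names and threshold are also illustrative.

```python
def metric_query(query_id, namespace, metric_name, stat, period=60):
    """One input metric for the expression; not returned as alarm data."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {"Namespace": namespace, "MetricName": metric_name},
            "Period": period,
            "Stat": stat,
        },
        "ReturnData": False,
    }


def build_error_rate_alarm_params(alarm_name, threshold_percent):
    """Alarm on a derived error rate rather than on a raw metric."""
    return {
        "AlarmName": alarm_name,
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": 3,
        "DatapointsToAlarm": 2,
        "Metrics": [
            metric_query("errors", "MyApp", "ErrorCount", "Sum"),
            metric_query("requests", "MyApp", "RequestCount", "Sum"),
            {
                # The alarm watches the single entry with ReturnData=True.
                "Id": "error_rate",
                "Expression": "100 * errors / requests",
                "Label": "Error rate (%)",
                "ReturnData": True,
            },
        ],
    }


params = build_error_rate_alarm_params("error-rate-high", 5.0)
```

When the Metrics list is used, the period and statistic live inside each MetricStat entry, so the top-level Namespace, MetricName, Period, and Statistic arguments are left out.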
Handling Missing Data:
Not Breaching: Treat missing data as within the threshold.
Breaching: Treat missing data as violating the threshold.
Ignore: Keep the current alarm state.
Missing: If all data points in the evaluation range are missing, the alarm transitions to the INSUFFICIENT_DATA state. This is the default behavior.
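The effect of the four settings can be illustrated with a toy model. This is a deliberately simplified sketch, not CloudWatch's actual algorithm (which also looks at a longer evaluation range when data is sparse); the function name and sample values are hypothetical. The setting names match the TreatMissingData values breaching, notBreaching, ignore, and missing.

```python
def evaluate_alarm(datapoints, threshold, datapoints_to_alarm,
                   treat_missing="missing", current_state="OK"):
    """Simplified M out of N evaluation; None marks a missing data point."""
    if treat_missing not in ("breaching", "notBreaching", "ignore", "missing"):
        raise ValueError(f"unknown treat_missing value: {treat_missing}")

    # True = breaching, False = not breaching, None = missing.
    flags = [None if v is None else v > threshold for v in datapoints]

    if treat_missing == "breaching":
        flags = [True if f is None else f for f in flags]
    elif treat_missing == "notBreaching":
        flags = [False if f is None else f for f in flags]
    else:
        if all(f is None for f in flags):
            # Nothing to evaluate: keep the state or report no data.
            if treat_missing == "ignore":
                return current_state
            return "INSUFFICIENT_DATA"
        flags = [f for f in flags if f is not None]

    return "ALARM" if sum(flags) >= datapoints_to_alarm else "OK"


sample = [90, None, None]  # one breaching point, two missing
print(evaluate_alarm(sample, 80, 2, "breaching"))     # ALARM
print(evaluate_alarm(sample, 80, 2, "notBreaching"))  # OK
print(evaluate_alarm([None] * 3, 80, 2, "missing"))   # INSUFFICIENT_DATA
```

In this model, treating missing data as breaching is the only setting that lets a single real breach plus gaps in the data reach the 2 out of 3 requirement, which is why that setting suits metrics where silence itself signals a problem.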
Using Composite Alarms to Reduce Noise:
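Combine multiple alarms with a rule expression (using AND, OR, and NOT) so that the composite alarm triggers only when the chosen combination of underlying alarms is in the ALARM state, for example when all of them are.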
Example: A composite alarm could go into the ALARM state only when both "CPUUtilization" and "MemoryUsage" alarms are in the alarm state.
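The example above corresponds to an alarm rule that joins two ALARM(...) terms with AND. A minimal sketch, assuming boto3-style PutCompositeAlarm arguments; the helper names, the composite alarm name, and the SNS topic ARN are hypothetical, and the two underlying alarms are assumed to already exist under the names used in the example.

```python
def alarm_state_term(alarm_name, state="ALARM"):
    """One rule term, e.g. ALARM("CPUUtilization")."""
    return f'{state}("{alarm_name}")'


def all_in_alarm(alarm_names):
    """Rule that is true only when every listed alarm is in ALARM."""
    if not alarm_names:
        raise ValueError("at least one alarm name is required")
    return " AND ".join(alarm_state_term(name) for name in alarm_names)


def build_composite_alarm_params(name, child_alarm_names, sns_topic_arn):
    """Build keyword arguments for PutCompositeAlarm."""
    return {
        "AlarmName": name,
        "AlarmRule": all_in_alarm(child_alarm_names),
        "AlarmActions": [sns_topic_arn],
        "ActionsEnabled": True,
    }


def create_composite_alarm(params):
    """Send the request; needs boto3 and AWS credentials (not run here)."""
    import boto3
    boto3.client("cloudwatch").put_composite_alarm(**params)


params = build_composite_alarm_params(
    "host-under-pressure",
    ["CPUUtilization", "MemoryUsage"],
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # hypothetical topic
)
print(params["AlarmRule"])  # ALARM("CPUUtilization") AND ALARM("MemoryUsage")
```

Swapping AND for OR in the rule would notify when either alarm fires, which is noisier but catches single-resource problems.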
Alarm Management:
Dashboards: Visualize alarms with color-coded states (gray: INSUFFICIENT_DATA, red: ALARM).
Favorites: Bookmark recently visited or important alarms.
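Testing: Use the CloudWatch console or the AWS CLI (for example, the set-alarm-state command) to temporarily force an alarm into a given state and verify that its actions fire. The alarm returns to its real state at the next evaluation.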
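For testing from code rather than the CLI, the SetAlarmState API does the same thing. The following is a small sketch, assuming boto3; the helper names, the alarm name, and the reason text are hypothetical.

```python
VALID_STATES = {"OK", "ALARM", "INSUFFICIENT_DATA"}


def build_set_alarm_state_params(alarm_name, state,
                                 reason="Manual test of alarm actions"):
    """Build keyword arguments for SetAlarmState."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}")
    return {
        "AlarmName": alarm_name,
        "StateValue": state,
        "StateReason": reason,  # free text recorded in the alarm history
    }


def force_alarm_state(params):
    """Send the request; needs boto3 and AWS credentials (not run here)."""
    import boto3
    boto3.client("cloudwatch").set_alarm_state(**params)


params = build_set_alarm_state_params("cpu-high", "ALARM")
print(params["StateValue"])  # ALARM
```

Because the forced state is temporary, this is suited to checking that notifications and other actions are wired up correctly, not to silencing or holding an alarm.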
Best Practices for Avoiding False Alarms:
Use M out of N configurations to avoid triggering alarms due to transient data fluctuations.
Configure alarms to treat missing data appropriately to prevent false positives.
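Avoid premature transitions to the ALARM state by evaluating more periods (or requiring more breaching data points) rather than fewer, so that a single short-lived spike cannot trigger the alarm.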