☁️
CTHFM: AWS
  • Welcome
  • Getting Started
    • Account Setup
  • AWS CLI
    • AWS CLI Overview
    • Installation
  • AWS Fundamentals
    • AWS Documentation
    • AWS Shared Responsibility Model
    • Organizational Hierarchy
    • AWS Principals
    • IAM Fundamentals
      • IAM Policy Components
      • IAM Documentation References
    • AWS Security Services Overview
    • AWS Core Services
    • AWS Frameworks
    • Regions and Availability Zones
  • SQL
    • SQL Refresher for Threat Hunting
  • Logging Reference
    • Cloudtrail
      • What is Cloudtrail?
      • Setting Up Cloudtrail
      • Cloudtrail Events Structure
      • Filtering and Searching CloudTrail Logs
      • IAM ID Prefixes
      • Additional Resources
      • API References
    • VPCFlow Logs
    • GuardDuty
      • Multi-Account Setup
      • GuardDuty Concepts
      • GuardDuty Finding References
      • S3 Protection
      • Malware Protection
        • EC2 Malware Protection
          • EC2 Protection Resources
          • Monitoring Scans
          • EC2 Malware Protection Events: CloudWatch
        • S3 Malware Protection
          • Enabling S3 Malware Protection
          • After Enabling S3 Malware Protection
          • S3 Malware Resource Plan Status
          • S3 Malware Protection Quotas
      • RDS Protection Enablement
      • Lambda Protection Enablement
      • Trusted IP Lists and Threat Lists in Amazon GuardDuty
      • Remediation Recommendations
      • GuardDuty API Reference
      • GuardDuty Quotas
    • Access Analyzer
      • Setup
      • External Access and Unused Access Analyzer Findings
      • Review Findings
      • Access Analyzer Resources
      • Access Analyzer API Reference
    • AWS Network Firewall
      • Permissions
      • Firewall Log Contents
      • Logging Destinations
      • CloudWatch Firewall Metrics
    • AWS Config
      • Resource Management in AWS Config
      • AWS Config Integrations
      • AWS Config Resources
      • Configuration Item
      • Config Rules
        • Evaluation Modes
  • CloudWatch
    • Amazon CloudWatch
      • CloudWatch Concepts
      • CloudWatch Metrics
        • Filter Pattern Syntax
      • CloudWatch Alarms
        • Alarm Recommendations
      • Subscriptions
      • CloudWatch Agent
      • CloudWatch Insights
        • Supported Logs and Discovered Fields
        • CloudWatch Insights Query Syntax
      • Anomaly Detection
        • Create Anomaly Detector
        • Alarms for Anomaly Detections
      • CloudWatch Filter Syntax
      • CloudWatch Service Quota
  • Athena For Threat Hunting
    • Introduction to Athena
    • Setting Up Athena
    • SQL For Threat Hunters
    • Automated Response
    • Query Best Practices
  • AWS Security Research and Resources
    • AWS Security Blog
    • AWS Goat
    • Cloud Goat
    • Pacu
    • Prowler
    • Scout Suite
  • Threat Hunting in AWS
    • Threat Hunting in AWS
    • Threat Hunting Introduction
    • Threat Hunting Process
      • Hypothesis Generation
      • Investigation
      • Identification
      • Resolution & Follow Up
    • Pyramid of Pain
    • MITRE Att&ck
      • MITRE Att&ck Concepts
      • MITRE Att&CK Data Sources
      • MITRE Att&CK Mitigations
    • MITRE Att&ck: AWS
      • MITRE Att&CK Matrix
      • Amazon Web Services Security Control Mappings
    • AWS Threat Hunting Ideas
      • AWS Threat Hunting Ideas: EC2
      • AWS Threat Hunting Ideas: Lambda
      • AWS Threat Hunting Ideas: SQS
      • AWS Threat Hunting Ideas: SNS
      • AWS Threat Hunting Ideas: RDS
Powered by GitBook
On this page
  • SQL Basics for Threat Hunters
  • Introduction to SQL Syntax in Athena
  • Working with Athena Data Types
  • Writing Queries for Threat Hunting Use Cases
  • Handling Arrays and Structs in Athena
  • Optimizing Athena Queries for Large Datasets
  • Creating Saved Queries for Reusability
  • Practical Exercise: Building a Query from Scratch
  1. Athena For Threat Hunting

SQL For Threat Hunters

SQL Basics for Threat Hunters

This module introduces SQL concepts in the context of Amazon Athena, focusing on querying security logs stored in S3. You’ll learn SQL syntax, how to work with various data types, and best practices for building effective queries to detect suspicious activity. By the end of this module, you’ll be able to run essential queries for threat hunting.

Introduction to SQL Syntax in Athena

Athena supports standard SQL for querying data. Below are the core SQL components you need to know:

  1. SELECT: Retrieves data from a table.

  2. FROM: Specifies the table or dataset.

  3. WHERE: Filters rows based on conditions.

  4. GROUP BY: Groups rows with the same values.

  5. ORDER BY: Sorts the result set.

  6. LIMIT: Restricts the number of returned rows.

Basic Query Structure:

SELECT column1, column2 
FROM table_name 
WHERE condition 
GROUP BY column 
ORDER BY column DESC 
LIMIT 10;

Working with Athena Data Types

Athena can query data in structured (CSV), semi-structured (JSON), and columnar formats (Parquet). Threat hunters frequently encounter JSON objects, arrays, and nested structures.

Common Data Types in Athena:

  • STRING: Text values

  • BIGINT: Large integers

  • ARRAY: A list of values

  • STRUCT: Nested objects (common in JSON logs)

Example: Accessing Nested JSON Data

CloudTrail logs often store user identity information in a nested structure:

SELECT userIdentity.userName, eventName, eventTime 
FROM cloudtrail_logs 
WHERE userIdentity.type = 'IAMUser';

Writing Queries for Threat Hunting Use Cases

Here are practical SQL queries tailored for threat hunting using CloudTrail logs, VPC Flow Logs, and GuardDuty findings.

Query 1: Detect Failed Console Logins

Failed logins can indicate brute-force attempts or misconfigurations.

SELECT eventTime, userIdentity.userName, errorMessage 
FROM cloudtrail_logs 
WHERE eventName = 'ConsoleLogin' 
  AND errorCode IS NOT NULL 
ORDER BY eventTime DESC 
LIMIT 10;

Explanation:

  • Filters CloudTrail logs to show failed login attempts.

  • Orders the results by the most recent events.

Query 2: Track Large Outbound Data Transfers

Monitoring large data transfers is critical for detecting data exfiltration.

SELECT srcAddr, dstAddr, SUM(bytes) AS total_bytes 
FROM vpc_flow_logs 
WHERE action = 'ACCEPT' 
GROUP BY srcAddr, dstAddr 
HAVING SUM(bytes) > 1000000 
ORDER BY total_bytes DESC;

Explanation:

  • Groups data by source and destination IPs and sums the bytes transferred.

  • Filters to show only large data transfers (greater than 1 MB).

Query 3: Identify High-Severity GuardDuty Alerts

Quickly identify the most severe alerts for incident response.

sqlCopy codeSELECT eventType, severity, description, count(*) AS alert_count 
FROM guardduty_logs 
WHERE severity >= 7 
GROUP BY eventType, severity, description 
ORDER BY severity DESC;

Explanation:

  • Retrieves high-severity alerts from GuardDuty.

  • Groups and counts alerts based on type and description.

Handling Arrays and Structs in Athena

When analyzing nested data, such as in CloudTrail or GuardDuty logs, it’s common to work with arrays and structs. Below is an example of how to query nested data.

Example: Extracting Information from Structs

SELECT userIdentity.arn, eventSource, eventTime 
FROM cloudtrail_logs 
WHERE userIdentity.type = 'AssumedRole';

Explanation:

  • Queries the user ARN for actions performed by assumed roles.

Optimizing Athena Queries for Large Datasets

When working with massive datasets, query optimization is essential to reduce costs and improve performance.

Optimization Techniques:

  1. Use SELECT Specific Columns Avoid using SELECT * to reduce the amount of data scanned.

    sqlCopy codeSELECT userName, eventName FROM cloudtrail_logs WHERE eventName = 'CreateUser';
  2. Use WHERE Clauses Filter data as early as possible using WHERE clauses.

  3. Partitioning Data Partition logs by time to make queries faster (e.g., year/month/day).

    Example query using partition:

    SELECT eventTime, eventName 
    FROM cloudtrail_logs 
    WHERE year = 2024 AND month = 10 AND day = 14;
  4. Limit Rows Returned Use LIMIT to reduce the number of rows returned for quick inspections.

Creating Saved Queries for Reusability

In threat hunting, it’s common to reuse queries. You can save queries in Athena for quick access during investigations.

Steps to Save a Query:

  1. Run a query in the Athena Console.

  2. Click Save and give the query a meaningful name (e.g., “Failed Console Logins”).

  3. Reuse the query whenever needed without rewriting SQL.

Practical Exercise: Building a Query from Scratch

Scenario:

You suspect an insider is trying to access resources they shouldn’t. Build a query to:

  • Find all actions taken by IAM users.

  • Focus on high-privilege operations, such as CreateUser.

SELECT userIdentity.userName, eventName, eventTime 
FROM cloudtrail_logs 
WHERE userIdentity.type = 'IAMUser' 
  AND eventName = 'CreateUser';
PreviousSetting Up AthenaNextAutomated Response

Last updated 8 months ago