Chapter 12: Search, Logs, and Observability with OpenSearch

Learning Objectives

Explain how Amazon OpenSearch indexes documents using inverted indexes, shards, and replicas to enable fast search across log corpora.
Build a log ingestion pipeline using Amazon OpenSearch Ingestion (managed Data Prepper), with sources, processors, and sinks defined in YAML.
Use OpenSearch Dashboards to explore logs, configure alerts, detect anomalies, and analyze distributed traces for application performance monitoring.
Apply hot, UltraWarm, and cold tiering through Index State Management (ISM) policies to dramatically reduce log retention costs while preserving queryability.

Every Lambda function, Glue job, Kinesis consumer, and microservice produces operational telemetry — logs, metrics, and traces — that must be searched, alerted on, and retained. This chapter examines the search and observability stack built around Amazon OpenSearch Service and shows how to control the cost curve as log volume grows from megabytes to petabytes.

Chapter Architecture Overview

flowchart LR
    Producers["Apps / Lambda /<br/>Glue / ECS"] --> Ingest{"Ingestion<br/>Path"}
    Ingest --> OSIS["OpenSearch<br/>Ingestion (OSIS)"]
    Ingest --> DP["Self-hosted<br/>Data Prepper"]
    Ingest --> FH["Amazon Data<br/>Firehose"]
    OSIS --> Domain["OpenSearch Domain<br/>(hot tier)"]
    DP --> Domain
    FH --> Domain
    Domain -- "ISM 7d" --> UW["UltraWarm<br/>(S3 + cache)"]
    UW -- "ISM 30d" --> Cold["Cold<br/>(S3 archive)"]
    Cold -- "ISM 90d" --> Delete["Deleted"]
    Domain --> Dash["OpenSearch<br/>Dashboards"]
    Dash --> Alerts["Alerting +<br/>Anomaly Detection"]
    Dash --> Trace["Trace<br/>Analytics"]

Section 1: OpenSearch Fundamentals

Pre-Reading Check — OpenSearch Fundamentals

1. What core data structure makes OpenSearch text search fast?

A B-tree on the document ID column

A forward index from doc IDs to terms

An inverted index from terms to doc IDs

A row-store with bitmap filters

2. Which shard setting is fixed once an index is created?

Replica count

Primary shard count

Refresh interval

Field mappings

3. Why does OpenSearch place a primary shard and its replica on different nodes?

To save disk space

To enable automatic reindexing

To survive a node failure and continue serving traffic

It is purely cosmetic; placement does not matter

4. Roughly what shard size range does AWS recommend for log workloads?

Less than 1 GB per shard

10 to 50 GB per shard

100 to 500 GB per shard

There is no recommended range

5. OpenSearch is a fork of which projects?

Solr and Lucene

Elasticsearch 7.10 and Kibana

CloudSearch and CloudWatch

PostgreSQL full-text search

Inverted Indexes

The core data structure inside OpenSearch is the inverted index. A normal forward index maps a document ID to its content; an inverted index flips that and maps each term to the list of document IDs that contain it. Consider two log lines: "Beauty is in the eye of the beholder" and "Beauty and the beast". The inverted index stores entries like beauty -> [1, 2], beholder -> [1], and beast -> [2]. When you query for "beauty", OpenSearch performs a single dictionary lookup rather than scanning every document — which is what makes search across billions of log lines feel interactive.

Analogy: the index at the back of a textbook. To find every page mentioning "shard," you look up the term and jump to the listed pages rather than reading the whole book.
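
The two-sentence example above can be reproduced in a few lines of Python. This is a minimal sketch, not how Lucene stores postings: the tokenizer (lowercase, split on non-letters) and the function name build_inverted_index are assumptions made for illustration.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Hypothetical analyzer: lowercase the text and split on non-letters.
        for term in re.findall(r"[a-z]+", text.lower()):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {
    1: "Beauty is in the eye of the beholder",
    2: "Beauty and the beast",
}
index = build_inverted_index(docs)

print(index["beauty"])    # [1, 2]
print(index["beholder"])  # [1]
print(index["beast"])     # [2]
```

A query for a term is a single dictionary access on index; the documents themselves are never scanned at query time.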

Animation 1: Inverted Index Lookup

A query for "beauty" looks up the term once and jumps directly to the matching documents — no full scan.

Documents are JSON objects, indexes are collections of documents, and the mapping defines field types. The OpenSearch vocabulary maps cleanly onto a relational mental model:

Relational	OpenSearch
Database	Cluster
Table	Index
Row	Document
Column	Field
Schema	Mapping
Primary Key	`_id` field

Shards: Primaries and Replicas

A single index quickly outgrows a single machine, so OpenSearch partitions each index into shards. A shard is an independent Lucene index. Indexes have primary shards (the authoritative copies of partitioned data) and replica shards (full copies placed on different nodes for fault tolerance and read scaling).

Routing works as follows. When you index a document, OpenSearch hashes the document ID (or a custom routing key), takes the result modulo the primary shard count to pick a primary shard, writes the document there, and then replicates the write to each replica of that shard. With 5 primaries and 1 replica, each document lands on 2 shard copies, and a bulk request that spans every shard touches all 10. A search request goes to one copy of each shard, either the primary or a replica, so only 5 shard copies are queried per request, in parallel.

Crucial constraint: primary shard count is immutable after index creation. Replica count, however, is dynamic — you can scale replicas up before a known traffic spike and back down afterwards.
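
The hash-modulo routing rule also explains why the primary count cannot change in place. The sketch below uses zlib.crc32 purely as a stand-in hash (an assumption; it is not the hash OpenSearch uses internally) and made-up document IDs to show that most documents would map to a different shard if the divisor changed from 5 to 6.

```python
import zlib


def pick_shard(routing_key, num_primaries):
    """Return the primary shard number for a routing key.

    zlib.crc32 is a stand-in hash for illustration only; it is not the
    hash function OpenSearch uses internally.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primaries


# Hypothetical document IDs.
doc_ids = [f"doc-{i}" for i in range(1000)]
with_5 = {d: pick_shard(d, 5) for d in doc_ids}
with_6 = {d: pick_shard(d, 6) for d in doc_ids}

# Count documents whose shard assignment changes when the divisor changes.
moved = sum(1 for d in doc_ids if with_5[d] != with_6[d])
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Because lookups by ID use the same formula, a changed divisor would send reads to the wrong shard for every moved document. Replicas are unaffected because they are copies of a shard, not inputs to the formula.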

Animation 2: Shard Placement and Replica Failover

Six data nodes hosting an index with 3 primaries (P0/P1/P2) and 3 replicas (R0/R1/R2). Watch Node A fail and R0 promote to primary.

Cluster, Node, and Replica Concepts

A cluster is the unit of management. Within it, cluster manager nodes (formerly master) maintain cluster state, data nodes hold shards, and coordinator nodes fan out client requests. Amazon OpenSearch Service abstracts most of this — you pick instance types and counts, AWS picks the role assignments.

Shard sizing has practical bounds. AWS recommends 10-30 GB per shard for search-dominated workloads and up to 50 GB for log workloads. Each open shard consumes file handles, JVM heap, and cluster-state metadata, so AWS suggests fewer than 20 shards per GB of heap and, as a separate ceiling, fewer than roughly 1,000 shards per node. With too few shards, individual shards bloat past 50 GB; with too many, the cluster manager spends its time tracking metadata.
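
As a rough sizing aid, the guidelines above translate into two small calculations. This is a back-of-the-envelope sketch: the 400 GB daily volume and 32 GB heap are hypothetical inputs, and the helper names are invented for illustration.

```python
import math


def primary_shards_for(index_size_gb, target_shard_gb=50.0):
    """Smallest primary count that keeps each shard at or under the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def max_shards_for_heap(heap_gb, shards_per_gb_heap=20):
    """Shard budget for one node under the per-GB-of-heap guideline."""
    return int(heap_gb * shards_per_gb_heap)


daily_index_gb = 400  # hypothetical daily log volume
primaries = primary_shards_for(daily_index_gb)

print(primaries)                # 8 primaries of 50 GB each
print(primaries * 2)            # 16 shard copies with 1 replica
print(max_shards_for_heap(32))  # 640 shards on a node with 32 GB of heap
```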

For log workloads, the best practice is time-based rolling indexes managed by an index template. Daily indexes named logs-2026.05.07, logs-2026.05.08 let you delete old indexes wholesale (cheap) instead of deleting documents from a huge single index (expensive).

PUT /_index_template/logs-template
{
  "index_patterns": ["logs*"],
  "template": {
    "settings": {
      "index": {
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    }
  }
}
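
To make the retention argument concrete, the following sketch generates daily index names in the logs-YYYY.MM.DD convention and selects the ones that fall outside a retention window. The helper names and the 3-day window are assumptions for illustration; in practice the lifecycle is usually automated by policy rather than client code.

```python
from datetime import date, timedelta


def index_name(day, prefix="logs"):
    """Daily index name in the logs-YYYY.MM.DD convention."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indexes(today, existing, keep_days):
    """Names of daily indexes older than the retention window."""
    cutoff = index_name(today - timedelta(days=keep_days))
    # Zero-padded dates sort lexicographically in chronological order.
    return sorted(name for name in existing if name < cutoff)


today = date(2026, 5, 8)
existing = [index_name(today - timedelta(days=n)) for n in range(5)]

print(index_name(today))                              # logs-2026.05.08
print(expired_indexes(today, existing, keep_days=3))  # ['logs-2026.05.04']
```

Each expired name corresponds to one whole-index delete, which removes the shard files outright instead of marking individual documents as deleted.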

OpenSearch vs. Elasticsearch Fork History

OpenSearch began as Elasticsearch and Kibana. In January 2021, Elastic changed the licensing from Apache 2.0 to a dual SSPL/Elastic License model that restricted use by certain managed-service providers. AWS forked the last Apache-2.0 versions and renamed them OpenSearch and OpenSearch Dashboards. OpenSearch 1.0 reached GA in July 2021 under AWS stewardship; governance moved to the Linux Foundation in September 2024.

Query DSL, document model, REST API, and most plugins are compatible with the Elasticsearch 7.10 era.
The codebases have diverged since 2021: Elasticsearch added ES|QL, while OpenSearch added neural search, ML Commons, and Security Analytics.
New AWS deployments should choose OpenSearch unless a legacy app pins a specific Elasticsearch version.

Figure 12.1: Primary and Replica Shard Topology (6-node cluster)

flowchart TB
    subgraph Cluster["OpenSearch Cluster: logs-2026.05.07"]
        direction LR
        subgraph Primaries["Primary Shards"]
            direction LR
            NA["Node A<br/>P0"]
            NB["Node B<br/>P1"]
            NC["Node C<br/>P2"]
        end
        subgraph Replicas["Replica Shards"]
            direction LR
            ND["Node D<br/>R0"]
            NE["Node E<br/>R1"]
            NF["Node F<br/>R2"]
        end
    end
    NA -. replicates to .-> ND
    NB -. replicates to .-> NE
    NC -. replicates to .-> NF
    Client["Indexing<br/>Request"] -->|hash mod 3| NA
    Client -->|hash mod 3| NB
    Client -->|hash mod 3| NC
    Search["Search<br/>Request"] --> NA
    Search --> NE
    Search --> NF

Key Points: OpenSearch Fundamentals

An inverted index maps terms to doc IDs, so a query pays for a near-constant-time term lookup rather than an O(n) scan of documents.
Primary shard count is immutable after index creation; replica count is dynamic.
Primaries and replicas live on different nodes for HA and read scaling.
Target 10-50 GB per shard, fewer than ~20 shards per GB of heap.
Use time-based rolling indexes + index templates for log workloads — deleting an index is cheap; deleting documents is expensive.
OpenSearch is the Apache-2.0 fork of Elasticsearch 7.10 from 2021; new AWS deployments should pick OpenSearch.

Post-Reading Check — OpenSearch Fundamentals

1. What core data structure makes OpenSearch text search fast?

A B-tree on the document ID column

A forward index from doc IDs to terms

An inverted index from terms to doc IDs

A row-store with bitmap filters

2. Which shard setting is fixed once an index is created?

Replica count

Primary shard count

Refresh interval

Field mappings

3. Why does OpenSearch place a primary shard and its replica on different nodes?

To save disk space

To enable automatic reindexing

To survive a node failure and continue serving traffic

It is purely cosmetic; placement does not matter

4. Roughly what shard size range does AWS recommend for log workloads?

Less than 1 GB per shard

10 to 50 GB per shard

100 to 500 GB per shard

There is no recommended range

5. OpenSearch is a fork of which projects?

Solr and Lucene

Elasticsearch 7.10 and Kibana

CloudSearch and CloudWatch

PostgreSQL full-text search

Section 2: Log Ingestion and Visualization

Pre-Reading Check — Log Ingestion and Visualization

1. What is OpenSearch Ingestion (OSIS)?

A self-hosted Logstash distribution

An auto-scaling, AWS-managed Data Prepper service

A Kinesis Data Streams replacement

A CloudWatch Logs subscription filter

2. What are the four logical stages of a Data Prepper pipeline?

producer, broker, consumer, sink

source, buffer, processor, sink

extract, transform, load, archive

map, reduce, shuffle, persist

3. When should you choose Firehose-to-OpenSearch over OSIS?

When you need grok parsing of unstructured logs

When you need conditional routing to multiple indexes

When you have already-structured records and want a simple managed delivery path

When you need on-premises ingestion

4. Which OpenSearch Dashboards view is the SRE day-to-day workspace for ad-hoc log exploration?

Visualize

Discover

Dev Tools

Stack Management

5. The Anomaly Detection plugin uses which algorithm?

Linear regression

k-means clustering

Random Cut Forest (unsupervised)

Static threshold over a moving average

OpenSearch Ingestion Service (OSIS)

Amazon OpenSearch Ingestion (OSIS) is the AWS-managed Data Prepper service. You upload a YAML pipeline definition, choose a capacity range in OpenSearch Compute Units (OCUs), and AWS runs the pipeline as a serverless service that auto-scales between your minimum and maximum OCU bounds.

A pipeline has four logical stages:

Source: where data enters — HTTP push, S3 + SQS, OpenTelemetry endpoints, Kafka, or existing OpenSearch indexes for migrations.
Buffer: holds events between stages; in-memory (default) or persistent for at-least-once durability.
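Processors: transform events. grok parses unstructured log lines, date parses timestamps, the mutate-event processors (such as add_entries and delete_entries) add or remove fields, otel_trace_raw shapes spans, and service_map_stateful builds service graphs. Conditional routing, configured alongside the processors, forks events to different sinks.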
Sink: writes to OpenSearch, OpenSearch Serverless, S3, or another pipeline.

Capacity is measured in OCUs. Stateless pipelines scale to 96 OCUs (384 with persistent buffering); stateful pipelines top out at 48 OCUs (192 with buffering). One OCU handles a few thousand events per second of typical log data.

Animation 3: Data Prepper / OSIS Pipeline Flow

Watch a single event traverse source → buffer → processor → sink with each stage activating in turn.

A worked example: ingest application logs from S3 (notified via SQS), parse with grok, ship to OpenSearch.

# OSIS expects a version declaration at the top of the pipeline definition.
version: "2"
log-pipeline:
  source:
    s3:
      acknowledgments: true
      notification_type: sqs
      compression: gzip
      codec:
        newline:
      sqs:
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/log-events"
      aws:
        region: "us-east-1"
        sts_role_arn: "arn:aws:iam::123456789012:role/osis-pipeline-role"

  processor:
    - grok:
        match:
          message: ['%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}']
    - date:
        match:
          - key: timestamp
            patterns: ["ISO8601"]
        destination: "@timestamp"

  sink:
    - opensearch:
        hosts: ["https://search-prod-abc123.us-east-1.es.amazonaws.com"]
        index: "logs-app-%{yyyy.MM.dd}"
        aws:
          sts_role_arn: "arn:aws:iam::123456789012:role/osis-pipeline-role"
          region: "us-east-1"

The index pattern logs-app-%{yyyy.MM.dd} produces daily rolling indexes; OSIS creates them on demand.
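
To see what the grok and date processors do to a single line, the sketch below emulates them with a regular expression. The regex is a simplified stand-in for the TIMESTAMP_ISO8601 and LOGLEVEL grok patterns (the real definitions accept more variants), and the sample log line and the parse_line helper are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Simplified stand-ins for the grok patterns used in the pipeline; the real
# TIMESTAMP_ISO8601 and LOGLEVEL definitions accept more variants than these.
LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})?)"
    r" (?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL)"
    r" (?P<msg>.*)"
)


def parse_line(line: str) -> Optional[dict]:
    """Emulate grok (named captures) followed by date (normalized @timestamp)."""
    match = LINE.match(line)
    if match is None:
        return None  # an unparsed line; a real pipeline would tag or drop it
    event = match.groupdict()
    parsed = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    event["@timestamp"] = parsed.isoformat()
    return event


event = parse_line("2026-05-07T12:34:56Z ERROR payment declined")
print(event["level"])       # ERROR
print(event["msg"])         # payment declined
print(event["@timestamp"])  # 2026-05-07T12:34:56+00:00
```

Testing a pattern this way against a handful of real log lines before deploying the pipeline is a cheap way to catch lines that would otherwise arrive unparsed.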

Self-hosted Data Prepper

Data Prepper is the open-source upstream that powers OSIS. Run it yourself for on-premises sources, custom Java processors, or colocation with another tool. The YAML syntax is identical to OSIS, easing migration in either direction. A pipeline can chain into another via the pipeline source/sink — useful for fan-in deduplication followed by fan-out routing. Up to 10 sub-pipelines can be chained per file.

Firehose-to-OpenSearch Delivery

When logs already flow through Amazon Data Firehose, you can deliver them straight to OpenSearch without a separate pipeline. Firehose buffers records by size or time, optionally invokes a Lambda transformation, and bulk-indexes the result. Records that fail after retries are written to an S3 backup bucket for replay.
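
The size-or-time buffering behavior can be sketched as follows. This illustrates the general technique only; the class name, limits, and injected clock are assumptions, and the sketch checks the age limit only when a record arrives, whereas a managed service would also flush on a timer.

```python
import time
from typing import Callable, List, Optional


class SizeOrTimeBuffer:
    """Collect records and flush when a byte limit or an age limit is reached."""

    def __init__(self, max_bytes: int, max_age_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.clock = clock
        self.records: List[bytes] = []
        self.size = 0
        self.opened_at = 0.0

    def add(self, record: bytes) -> Optional[List[bytes]]:
        """Add a record; return the flushed batch if a limit was hit, else None."""
        now = self.clock()
        if not self.records:
            self.opened_at = now  # the age limit counts from the first record
        self.records.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_age_s
        return self.flush() if (too_big or too_old) else None

    def flush(self) -> List[bytes]:
        batch, self.records, self.size = self.records, [], 0
        return batch


# A fake clock makes the time-based flush deterministic in this example.
ticks = iter([0.0, 1.0, 2.0, 70.0])
buf = SizeOrTimeBuffer(max_bytes=10, max_age_s=60, clock=lambda: next(ticks))

first = buf.add(b"aaaa")      # below both limits
second = buf.add(b"bbbbbbb")  # 11 bytes buffered: size limit reached
third = buf.add(b"c")         # new batch opened at t=2
fourth = buf.add(b"d")        # 68 s after the batch opened: age limit reached
print(first, second, third, fourth)
```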

Trade-off vs. OSIS: Firehose is simpler but less expressive. Grok parsing requires a Lambda transformer; conditional routing is unavailable (one stream maps to one index pattern). For pass-through, Firehose wins; for parsing and fan-out, OSIS wins.

Need	Choose
Pass-through indexing of structured records	Firehose
Grok parsing of unstructured logs	OSIS or Data Prepper
Distributed-trace ingestion (OTel)	OSIS or Data Prepper
On-premises collector reachability	Self-hosted Data Prepper
At-least-once with persistent buffering	OSIS with persistent buffering
Single AWS-resident, simple, cheap path	Firehose

OpenSearch Dashboards

OpenSearch Dashboards is the Kibana fork. It runs as a Node.js application that talks to an OpenSearch domain and provides a browser-based UI for several core workflows:

Discover: the SRE day-to-day workspace. Pick an index pattern (e.g. logs-app-*), pick a time range, and run queries in DQL (Dashboards Query Language) or Lucene query syntax. Shows raw documents, a histogram of event counts, and per-field breakdowns. Killer feature: time-range comparison, "what is traffic now" vs. "what was traffic at this time last week", in two clicks.
Visualize: line, bar, heatmap, geo-map, tag-cloud charts against bucket and metric aggregations.
Dashboards: composed from saved visualizations; teams pin them during incidents.
Dev Tools: REST playground for raw API calls (GET _cluster/health, PUT /_index_template/...).

For multi-tenant environments, fine-grained access control restricts users to specific index patterns, fields, or documents. Application teams typically own a dashboard tenant; platform teams own a global tenant for cross-cutting dashboards.

Alerting and Anomaly Detection

The Alerting plugin turns saved queries into scheduled monitors with:

Inputs: a query against indexes returning documents or aggregations.
Triggers: per-document, per-bucket, or query-level conditions.
Actions: Slack, Teams, email, SNS, webhook, or PagerDuty notifications.

Example: a monitor that fires when level: ERROR events from any single service exceed 50 in 5 minutes. A per-bucket trigger over a terms aggregation on service.name produces a separate alert per offending service.
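
The per-bucket logic of that example monitor can be expressed in plain Python. This sketch shows only the trigger condition, not the Alerting plugin's API; the event shape, field names, and helper name are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def services_over_threshold(events, now, window=timedelta(minutes=5), threshold=50):
    """Return {service: error_count} for services above the threshold in the window."""
    since = now - window
    # Equivalent of a terms aggregation on service.name over filtered documents.
    counts = Counter(
        e["service.name"]
        for e in events
        if e["level"] == "ERROR" and since <= e["@timestamp"] <= now
    )
    # One entry per offending bucket, mirroring one alert per service.
    return {service: n for service, n in counts.items() if n > threshold}


now = datetime(2026, 5, 7, 12, 0, 0)
events = (
    [{"service.name": "checkout", "level": "ERROR",
      "@timestamp": now - timedelta(seconds=i)} for i in range(60)]
    + [{"service.name": "search", "level": "ERROR",
        "@timestamp": now - timedelta(seconds=i)} for i in range(10)]
    + [{"service.name": "cart", "level": "ERROR",
        "@timestamp": now - timedelta(minutes=10)} for _ in range(80)]
)

print(services_over_threshold(events, now))  # {'checkout': 60}
```

Only checkout fires: search stays under the threshold, and the cart errors fall outside the 5-minute window.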

The Anomaly Detection plugin layers ML on top using Random Cut Forest — an unsupervised ensemble for time-series anomalies. Detectors learn normal patterns, score new buckets, and emit anomaly grades that adapt to seasonality and drift. Production hierarchy: static thresholds for SLO violations (latency p99 > 500 ms, errors > 1%); anomaly alerts for novel failures; composite monitors combining multiple inputs to suppress flapping.

Trace Analytics and APM

OpenSearch supports distributed-trace analytics through OpenTelemetry ingestion. Spans flow from instrumented services (via OTel Collector or AWS Distro for OpenTelemetry) into an OSIS or Data Prepper pipeline. The pipeline runs otel_trace_raw to flatten spans, service_map_stateful to compute a service graph with edge metrics like latency and error rate, and writes to otel-v1-apm-span-* for raw spans and otel-v1-apm-service-map for graph aggregates.

Analogy: traces are flight-tracking radar. Each request is an aircraft; each span is a leg (taxi, climb, cruise, descent). The trace view shows one flight path; the service map is the airport-pair graph aggregating all flights.
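
The service-graph idea can be illustrated by deriving caller-to-callee edges from parent/child span relationships. This is a batch sketch over an in-memory list with a hypothetical span shape; it is not how the pipeline processor is implemented, which works on streaming data.

```python
from collections import defaultdict


def service_edges(spans):
    """Aggregate caller -> callee edges with call count, error rate, and mean latency."""
    by_id = {s["span_id"]: s for s in spans}
    stats = defaultdict(lambda: {"calls": 0, "errors": 0, "total_ms": 0.0})
    for span in spans:
        parent = by_id.get(span.get("parent_span_id"))
        if parent is None or parent["service"] == span["service"]:
            continue  # root span or in-process child: no cross-service edge
        edge = stats[(parent["service"], span["service"])]
        edge["calls"] += 1
        edge["errors"] += int(span["error"])
        edge["total_ms"] += span["duration_ms"]
    return {
        edge: {
            "calls": s["calls"],
            "error_rate": s["errors"] / s["calls"],
            "avg_ms": s["total_ms"] / s["calls"],
        }
        for edge, s in stats.items()
    }


# Hypothetical spans; real spans carry many more attributes.
spans = [
    {"span_id": "a", "parent_span_id": None, "service": "frontend", "duration_ms": 120, "error": False},
    {"span_id": "b", "parent_span_id": "a", "service": "checkout", "duration_ms": 80, "error": False},
    {"span_id": "c", "parent_span_id": "b", "service": "payments", "duration_ms": 40, "error": True},
    {"span_id": "d", "parent_span_id": "a", "service": "checkout", "duration_ms": 100, "error": False},
]

for edge, metrics in service_edges(spans).items():
    print(edge, metrics)
```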

For log-trace correlation, instrument services to inject trace_id and span_id into every log line. With logs and traces in the same domain, a Discover query for a trace ID surfaces both, giving full context in one tool.
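
One way to inject the identifiers is a logging filter that copies the current trace context onto every record. The sketch below uses only the standard library; the context variables stand in for whatever the tracing SDK exposes, and the logger name and ID values are hypothetical.

```python
import contextvars
import io
import json
import logging

# Hypothetical per-request context; a real service would read these values
# from its tracing SDK instead of setting them by hand.
trace_id_var = contextvars.ContextVar("trace_id", default="")
span_id_var = contextvars.ContextVar("span_id", default="")


class TraceContextFilter(logging.Filter):
    """Attach the current trace_id and span_id to every log record."""

    def filter(self, record):
        record.trace_id = trace_id_var.get()
        record.span_id = span_id_var.get()
        return True


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so the pipeline needs no grok step."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "msg": record.getMessage(),
            "trace_id": record.trace_id,
            "span_id": record.span_id,
        })


stream = io.StringIO()  # stands in for stdout or a log file
handler = logging.StreamHandler(stream)
handler.addFilter(TraceContextFilter())
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

trace_id_var.set("4bf92f3577b34da6a3ce929d0e0e4736")
span_id_var.set("00f067aa0ba902b7")
logger.info("payment declined")

print(stream.getvalue().strip())
```

Because the filter sits on the handler, application code keeps calling logger.info as usual and every line carries the IDs needed for the Discover query.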

Figure 12.2: OSIS Source-Buffer-Processor-Sink Pipeline

flowchart LR
    subgraph Sources["Source Plugins"]
        S3["S3 + SQS"]
        HTTP["HTTP Push"]
        OTEL["OTel Endpoint"]
        KAFKA["Kafka"]
    end
    Buffer["Buffer<br/>(in-memory or<br/>persistent)"]
    subgraph Processors["Processor Chain"]
        direction LR
        P1["grok<br/>parse log lines"] --> P2["date<br/>parse @timestamp"] --> P3["mutate<br/>enrich fields"] --> P4["conditional<br/>routing"]
    end
    subgraph Sinks["Sink Plugins"]
        OS["OpenSearch<br/>Domain"]
        OSS["OpenSearch<br/>Serverless"]
        S3OUT["S3 Archive"]
        PIPE["Another<br/>Pipeline"]
    end
    Sources --> Buffer --> Processors --> Sinks

Figure 12.3: OpenTelemetry Trace Ingestion Flow

sequenceDiagram
    participant App as Instrumented App
    participant ADOT as ADOT Collector
    participant OSIS as OSIS Pipeline
    participant Raw as otel-v1-apm-span-*
    participant Map as otel-v1-apm-service-map
    participant Dash as Trace Analytics
    App->>ADOT: Emit spans (trace_id, span_id)
    ADOT->>OSIS: Batch OTLP export
    OSIS->>OSIS: Flatten spans (otel_trace_raw)
    OSIS->>OSIS: Compute edges (service_map_stateful)
    par Fan-out
        OSIS->>Raw: Index raw spans
    and
        OSIS->>Map: Index service-graph aggregates
    end
    Dash->>Raw: Query waterfall by trace_id
    Dash->>Map: Query service map

Key Points: Log Ingestion and Visualization

OSIS = managed Data Prepper, billed in OCUs, auto-scales between min/max bounds.
Pipelines are source → buffer → processor → sink; YAML is portable to self-hosted Data Prepper.
Choose Firehose for pass-through, OSIS for parsing and routing, self-hosted Data Prepper for on-premises or custom-code.
Dashboards: Discover for ad-hoc, Visualize/Dashboards for charts, Dev Tools for REST API.
Alerting: monitors with per-doc / per-bucket / query-level triggers and Slack/SNS/PagerDuty actions.
Anomaly Detection: Random Cut Forest, seasonal-aware, perfect for "every service has its own normal."
Trace Analytics: OTel → OSIS → raw spans + service map; correlate with logs via shared trace IDs.

Post-Reading Check — Log Ingestion and Visualization

1. What is OpenSearch Ingestion (OSIS)?

A self-hosted Logstash distribution

An auto-scaling, AWS-managed Data Prepper service

A Kinesis Data Streams replacement

A CloudWatch Logs subscription filter

2. What are the four logical stages of a Data Prepper pipeline?

producer, broker, consumer, sink

source, buffer, processor, sink

extract, transform, load, archive

map, reduce, shuffle, persist

3. When should you choose Firehose-to-OpenSearch over OSIS?

When you need grok parsing of unstructured logs

When you need conditional routing to multiple indexes

When you have already-structured records and want a simple managed delivery path

When you need on-premises ingestion

4. Which OpenSearch Dashboards view is the SRE day-to-day workspace for ad-hoc log exploration?

Visualize

Discover

Dev Tools

Stack Management

5. The Anomaly Detection plugin uses which algorithm?

Linear regression

k-means clustering

Random Cut Forest (unsupervised)

Static threshold over a moving average

Section 3: Cost and Tiering

Pre-Reading Check — Cost and Tiering

1. Which storage tier is read-only and backed by S3 plus a local SSD/memory cache?

Hot

UltraWarm

Cold

Frozen

2. What must you do before querying a cold-storage index?

Restore it from a snapshot to hot storage

Attach it back to UltraWarm

Re-index every document from S3

Cold queries run with no preparation

3. What does ISM stand for, and what does it automate?

Index Search Manager — query routing

Index State Management — index lifecycle policies

Internal Sharding Method — primary shard placement

Identity Service Manager — access control

4. Why is force_merge to one segment per shard recommended before warm migration?

It encrypts the segment files before upload

It is required by IAM

It removes deleted documents and reduces S3 fetch fan-out on cold cache

It rebuilds the inverted index from scratch

5. What is the main trade-off of OpenSearch Serverless versus a provisioned domain with ISM?

Serverless costs more for spiky workloads

Serverless eliminates cluster sizing but does not support ISM tiering

Serverless requires you to choose hot/warm/cold tiers manually

Serverless cannot run trace analytics

UltraWarm and Cold Storage Tiers

Recent logs need fast queries; quarter-old logs must exist for compliance but rarely get touched. OpenSearch addresses this with three storage tiers and an automation engine.

Hot tier: instance-attached EBS (or NVMe), full IOPS and memory caching. Fastest, sub-second queries. About $0.169/GB-month for EBS plus data-node cost.
UltraWarm: S3-backed with an LRU cache on local SSD and in memory. Migrated indexes become read-only, are force_merged to one segment per shard, and segment files are uploaded to S3. Queries pull segments from S3 into cache on demand. UltraWarm storage is $0.024/GB-month — about 85% cheaper than hot — plus warm node cost.
Cold storage: detaches indexes entirely; only metadata stays in the cluster. You pay only for S3-backed storage (~$0.0125/GB-month in this chapter's estimates). To query a cold index you must attach it back to UltraWarm first. Cold is for once-a-year data: security audits, regulatory archives, long-tail debugging.

Tier	Use Case	Storage	Compute	Query Performance
Hot	Recent (0-7 days)	~$0.169/GB-mo	Full data-node instances	Sub-second
UltraWarm	Historical (7-90 days)	$0.024/GB-mo	$0.238-$2.68/hr per warm node	Interactive (S3 + cache)
Cold	Archive (>90 days)	~$0.0125/GB-mo (S3)	Pay-per-attach	Slow first query, then UltraWarm-like

Worked cost example: 100 TB retained 365 days with 7 days hot, 83 days UltraWarm, 275 days cold.

Hot: 100 TB × (7/365) × $0.169/GB-mo × 12 mo ≈ $3,890/yr (plus instances)
UltraWarm: 100 TB × (83/365) × $0.024/GB-mo × 12 mo ≈ $6,550/yr (plus warm nodes)
Cold: 100 TB × (275/365) × $0.0125/GB-mo × 12 mo ≈ $11,300/yr
Total ≈ $21,700/yr versus $202,800/yr for naive all-hot — about a 9x reduction.
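
The arithmetic above can be checked with a short script. It assumes 1 TB = 1,000 GB and uses the chapter's per-GB prices; like the worked example, it covers storage only and ignores instance and warm-node cost.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes
total_gb = 100 * GB_PER_TB
retention_days = 365

# (days spent in tier, storage price in $/GB-month), taken from the example above.
tiers = {
    "hot": (7, 0.169),
    "ultrawarm": (83, 0.024),
    "cold": (275, 0.0125),
}


def yearly_storage_cost(days_in_tier, price_per_gb_month):
    """Annual cost of the share of data resident in one tier at steady state."""
    resident_gb = total_gb * days_in_tier / retention_days
    return resident_gb * price_per_gb_month * 12


costs = {name: yearly_storage_cost(days, price) for name, (days, price) in tiers.items()}
tiered = sum(costs.values())
all_hot = yearly_storage_cost(retention_days, 0.169)

for name, cost in costs.items():
    print(f"{name:>9}: ${cost:,.0f}/yr")
print(f"   tiered: ${tiered:,.0f}/yr")
print(f"  all-hot: ${all_hot:,.0f}/yr")
print(f"reduction: {all_hot / tiered:.1f}x")
```

Changing the day counts in tiers is a quick way to compare retention policies before committing to one.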

Index State Management (ISM)

Manually moving indexes between tiers does not scale. Index State Management (ISM) is OpenSearch's built-in policy engine. An ISM policy describes a finite-state machine of states (hot, warm, cold, delete), actions performed on state entry, and conditions that trigger transitions.

Animation 4: ISM Hot → UltraWarm → Cold → Delete Lifecycle

An index follows the policy: rolled over after 1d/50GB/50M docs, migrated to warm at 7d, cold at 30d, deleted at 90d.

A complete log-retention policy:

PUT _plugins/_ism/policies/log-lifecycle
{
  "policy": {
    "description": "Log retention: hot->UltraWarm->cold->delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "rollover": {
              "min_size": "50GB",
              "min_index_age": "1d",
              "min_doc_count": 50000000
          }}
        ],
        "transitions": [
          { "state_name": "warm",
            "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "force_merge": { "max_num_segments": 1 } },
          { "warm_migration": {} },
          { "replica_count": { "number_of_replicas": 1 } }
        ],
        "transitions": [
          { "state_name": "cold",
            "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "cold",
        "actions": [
          { "cold_migration": { "timestamp_field": "timestamp" } }
        ],
        "transitions": [
          { "state_name": "delete",
            "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "cold_delete": {} } ]
      }
    ],
    "ism_template": [
      { "index_patterns": ["logs-*"], "priority": 100 }
    ]
  }
}

The ism_template block auto-attaches the policy to any new index matching logs-*. Combined with rolling daily indexes from Firehose or OSIS, the lifecycle runs without operator intervention.
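
The transition conditions in the policy can be modeled as a small state machine keyed on index age. This sketch captures only the min_index_age thresholds from the policy; it ignores rollover, action execution time, and ISM's periodic evaluation schedule, and the names Transition and state_for_age are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    target: str
    min_index_age_days: int


# States and age thresholds copied from the log-lifecycle policy.
POLICY = {
    "hot": Transition("warm", 7),
    "warm": Transition("cold", 30),
    "cold": Transition("delete", 90),
}


def state_for_age(age_days, default_state="hot"):
    """Follow transitions until the index is too young for the next one."""
    state = default_state
    while state in POLICY and age_days >= POLICY[state].min_index_age_days:
        state = POLICY[state].target
    return state


for age in (0, 7, 29, 30, 90):
    print(age, state_for_age(age))
```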

The force_merge before warm migration is a critical optimization. Lucene segments accumulate as documents are written; merging to one segment per shard removes deleted documents and consolidates layout, shrinking storage and speeding up cold-cache queries on UltraWarm. Skipping it leaves hundreds of small segments per shard, each requiring a separate S3 fetch.

Serverless OpenSearch Collections

For workloads where you do not want to size a cluster, Amazon OpenSearch Serverless offers collections — managed, auto-scaling endpoints typed as time-series (logs), search, or vector search. AWS provisions and scales OCUs behind the scenes and stores data on S3 by default.

Key differences from provisioned:

No node sizing. You set min/max OCU bounds; AWS scales between them.
Storage on S3 by default. No UltraWarm or cold tier — storage is already S3 — but you also lose explicit hot/warm control.
Pricing. Per OCU-hour for indexing and search separately, plus storage. Cheaper for spiky/small workloads, often costlier than a well-sized provisioned domain at steady high volume.
Feature subset. Trace analytics works; ISM is unsupported because the storage model differs.
Sink configuration. OSIS pipelines targeting Serverless need serverless: true plus index_type: management_disabled.

Choose Serverless for variable/bursty volumes, new workloads, or vector search. Stay provisioned for steady high-volume ingest where ISM tiering is the cost win, plugin-heavy workloads, or strict latency targets.

Figure 12.4: ISM Lifecycle State Machine

stateDiagram-v2
    [*] --> hot: Index created<br/>(ism_template auto-attach)
    hot --> hot: rollover<br/>(50GB / 1d / 50M docs)
    hot --> warm: min_index_age >= 7d<br/>warm_migration,<br/>force_merge to 1 segment,<br/>replica_count = 1
    warm --> cold: min_index_age >= 30d<br/>cold_migration<br/>(detach to S3)
    cold --> delete: min_index_age >= 90d
    delete --> [*]: cold_delete<br/>(remove metadata)

Key Points: Cost and Tiering

Hot = fast and expensive (~$0.169/GB-mo + instances).
UltraWarm = S3 + LRU cache, read-only, ~85% cheaper ($0.024/GB-mo).
Cold = detached S3 archive (~$0.0125/GB-mo); attach back to UltraWarm before query.
ISM automates the lifecycle via state machine of actions + transitions; ism_template auto-attaches to matching indexes.
force_merge to 1 segment before warm migration is critical for cold-cache query latency on UltraWarm.
Serverless collections remove cluster sizing but lose ISM tiering; pick provisioned for steady high-volume retention economics.
Tiering routinely saves 80-90% on log retention bills.

Post-Reading Check — Cost and Tiering