Documentation
Version-controlled schema migrations for Elasticsearch and OpenSearch — Flyway for search engines.
Getting Started
Installation
npm install -g scaledsearch
This installs two equivalent commands — use whichever you prefer:
scaledsearch migrate apply # full name
ss migrate apply # shorthand
Requirements: Node.js >= 18. No cluster connection is needed for status, diff, validate, or apply --dry-run — those work fully offline.
Quick start — new project
# 1. Initialize ScaledSearch in your project
scaledsearch migrate init
# 2. Create a migration
scaledsearch migrate create "add-products-index"
# 3. Edit the generated YAML
# migrations/V001__add-products-index.yaml
# 4. Preview the changes (offline, no cluster needed)
scaledsearch migrate apply --dry-run
# 5. Apply them
scaledsearch migrate apply
# 6. Check what's applied vs pending
scaledsearch migrate status
init creates a .scaledsearch/config.yaml and a migrations/ directory.
Quick start — existing cluster
Already have indices in production? Capture them as a baseline first, then version forward from there:
# 1. Initialize
scaledsearch migrate init
# 2. Import current cluster state as V000
scaledsearch migrate import
# 3. Start versioning from here
scaledsearch migrate create "add-vector-field"
scaledsearch migrate apply
import snapshots indices, mappings, settings, aliases, index templates, and ingest pipelines into V000__baseline.yaml and marks it as already applied so it never re-executes. System-owned objects are excluded automatically — see the importing guide.
Commands
All commands are subcommands of scaledsearch migrate (or ss migrate).
| Command | Description |
|---|---|
| migrate init | Initialize ScaledSearch in the current directory |
| migrate create <name> | Create a new versioned migration file |
| migrate status | Show applied vs pending migrations |
| migrate apply | Apply pending migrations to the cluster |
| migrate diff | Show detailed pending changes |
| migrate validate | Validate files and simulate their end-state (offline) |
| migrate import | Import an existing cluster as a V000 baseline |
| migrate rollback | Undo the last applied migration |
Commands that work offline (no cluster connection): init, create, status, diff, validate, and apply --dry-run.
migrate apply
scaledsearch migrate apply # apply all pending
scaledsearch migrate apply --dry-run # preview only, offline
scaledsearch migrate apply --target V003 # apply up to and including V003
- --dry-run prints what would run without touching the cluster, and honors --target.
- --target <version> stops after the given version; an unknown or already-applied target produces a friendly error.
- A migration that fails is not recorded as applied, so re-running resumes correctly.
migrate rollback
Undoes the last applied migration by running its rollback: section. Refuses to run when nothing is applied, or when the last migration has no rollback: section defined.
scaledsearch migrate rollback
Operation Types
Every entry under a migration's operations: (or rollback:) list has a type. ScaledSearch supports 15 operation types across 7 categories.
| Category | Operations |
|---|---|
| Index | create_index, delete_index, close_index, open_index |
| Schema | put_mapping, put_settings |
| Data | reindex (async with progress) |
| Alias | add_alias, remove_alias, swap_alias |
| Template | put_template, delete_template |
| Pipeline | put_pipeline, delete_pipeline |
| Generic | api_call (any REST API) |
Index & Schema
- type: create_index
  index: products
  settings: { number_of_shards: 2, number_of_replicas: 1 }
  mappings:
    properties:
      title: { type: text }
- type: put_mapping
  index: products
  body:
    properties:
      in_stock: { type: boolean }
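Some settings, such as analyzers, can only be changed while an index is closed. As a sketch of how the index and schema operations might be combined for that case, the fragment below closes products, updates its analysis settings, and reopens it. The index/body shape of put_settings and the single index key on close_index and open_index are assumptions modeled on put_mapping and delete_index, and the analyzer name folding is purely illustrative.

```yaml
# Hypothetical fragment: the field names for close_index, put_settings, and
# open_index are assumed to mirror put_mapping (index + body).
operations:
  # 1. Analysis settings can only be changed on a closed index
  - type: close_index
    index: products
  # 2. Register a custom analyzer (illustrative name: folding)
  - type: put_settings
    index: products
    body:
      analysis:
        analyzer:
          folding:
            type: custom
            tokenizer: standard
            filter: [lowercase, asciifolding]
  # 3. Reopen the index so it serves traffic again
  - type: open_index
    index: products
```

The index rejects reads and writes between steps 1 and 3, so this is not a zero-downtime change.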
Data — reindex async
Reindex runs asynchronously with real-time progress tracking. No configuration needed. If the CLI disconnects, the reindex keeps running on the cluster.
- type: reindex
  source: products_v1
  dest: products_v2
Applying V003 Migrate to products_v2... 45% (4,500,000/10,000,000 docs) done (42m)
Alias
# Add an alias
- type: add_alias
  index: products_v2
  alias: products
# Atomic swap (remove + add in a single cluster call)
- type: swap_alias
  alias: products
  from: products_v1
  to: products_v2
See the zero-downtime guide for the full alias-swap pattern.
Generic — api_call
An escape hatch for any Elasticsearch/OpenSearch REST API not covered by a dedicated operation:
- type: api_call
  method: PUT
  path: /_cluster/settings
  body:
    persistent:
      cluster.routing.allocation.disk.watermark.high: "90%"
Works with any API: ILM policies, cluster settings, component templates, and more.
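As one illustration, an ILM policy could be registered through api_call. This is a sketch: the policy name products-retention and the 30d/90d thresholds are made up for the example, and the _ilm endpoint exists only on Elasticsearch (OpenSearch manages index lifecycles through its own ISM plugin API).

```yaml
- type: api_call
  method: PUT
  path: /_ilm/policy/products-retention   # hypothetical policy name
  body:
    policy:
      phases:
        hot:
          actions:
            rollover:
              max_age: 30d                # illustrative threshold
        delete:
          min_age: 90d                    # illustrative threshold
          actions:
            delete: {}
```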
Migration File Format
Migrations are YAML files in your migrations/ directory, named with an auto-incrementing version prefix:
migrations/
├── V000__baseline.yaml # optional, created by `import`
├── V001__add-products.yaml
└── V002__add-vector-field.yaml
Structure
description: "Create products index with vector search"
engine: elasticsearch
target_version: ">=8.0"
operations:
  - type: create_index
    index: products
    mappings:
      properties:
        embedding: { type: dense_vector, dims: 768 }
rollback:
  - type: delete_index
    index: products
| Field | Required | Description |
|---|---|---|
| description | recommended | Human-readable summary of the migration |
| engine | optional | elasticsearch or opensearch |
| target_version | optional | Version constraint (e.g. ">=8.0") checked at apply time |
| operations | yes | Ordered list of operations to apply |
| rollback | optional | Ordered list of operations to undo this migration |
Versioning & checksums
Files are applied in ascending version order (V001, V002, …). When a migration is applied, ScaledSearch records a checksum of the file in the history index. On every subsequent run it re-checks that checksum: if an already-applied file has been modified, the run fails loudly rather than silently diverging from what was actually applied to the cluster.
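In practice this means an applied file is frozen: a follow-up change belongs in a new migration rather than in an edit to the old one. As a sketch, assuming V001__add-products.yaml is already applied and an in_stock field is now needed, the change might go into a new file. The file name below is illustrative; migrate create generates the real one.

```yaml
# migrations/V002__add-in-stock-field.yaml  (illustrative name)
description: "Add in_stock flag to products"
operations:
  - type: put_mapping
    index: products
    body:
      properties:
        in_stock: { type: boolean }
```

A field cannot be removed from an existing index's mapping, so this example declares no rollback: section.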
Configuration
migrate init writes .scaledsearch/config.yaml in your project. Commit it to git.
# .scaledsearch/config.yaml
engine: elasticsearch
connection:
  host: http://localhost:9200
migrations:
  location: ./migrations
history:
  index: .scaledsearch_history
| Key | Description |
|---|---|
| engine | elasticsearch or opensearch |
| connection.host | Cluster URL |
| connection.auth | Optional auth block — see below |
| migrations.location | Directory holding migration files |
| history.index | Internal index that records applied migrations |
History index
ScaledSearch tracks what has been applied in an internal index (default .scaledsearch_history). init derives a per-project history index name, so multiple projects pointing at the same cluster keep separate histories. It stores, per migration: the version, a checksum, and the applied timestamp. Failed migrations are not recorded as applied.
Authentication
# Basic auth
connection:
  host: https://my-cluster:9200
  auth:
    type: basic
    username: elastic
    password: changeme

# API key
connection:
  auth:
    type: apikey
    apiKey: your-base64-api-key
Engines
| Engine | Versions | Status |
|---|---|---|
| Elasticsearch | 7.x, 8.x, 9.x | ✓ Verified |
| OpenSearch | 1.x, 2.x, 3.x | ✓ Verified |
| Solr | 8.x, 9.x | Coming soon |
Tested against: ES 7.17, ES 8.17, ES 9.0, OpenSearch 2.19, OpenSearch 3.0. ScaledSearch connects to both Elasticsearch and OpenSearch through the official @elastic/elasticsearch client, which is wire-compatible across ES 7–9 and OpenSearch.
Version constraints
A migration may declare a target_version constraint that is checked at apply time. If the connected cluster doesn't satisfy it, the migration won't be applied — useful for version-gated features like dense_vector.
target_version: ">=8.0"
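For comparison, here is a sketch of the same idea on OpenSearch, where vector fields use knn_vector rather than dense_vector. It assumes that target_version is compared against the OpenSearch version number when engine is opensearch, and that settings are passed to the cluster unchanged. The >=2.0 bound is illustrative, so adjust it to the versions you have actually tested.

```yaml
description: "Create products index with k-NN vector search"
engine: opensearch
target_version: ">=2.0"        # illustrative bound
operations:
  - type: create_index
    index: products
    settings:
      index.knn: true          # enables k-NN search on this index
    mappings:
      properties:
        embedding: { type: knn_vector, dimension: 768 }
rollback:
  - type: delete_index
    index: products
```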
Guide — Zero-Downtime Migrations
Changing a mapping in place is often impossible — many mapping changes require a new index. The standard zero-downtime pattern is: build a new index, reindex into it, then atomically swap an alias so reads/writes never point at a half-built index.
description: "Migrate to products_v2 with zero downtime"
operations:
  # 1. Create the new index with the updated mapping
  - type: create_index
    index: products_v2
    mappings:
      properties:
        embedding: { type: dense_vector, dims: 768 }
  # 2. Reindex existing data (runs async with progress)
  - type: reindex
    source: products_v1
    dest: products_v2
  # 3. Atomically point the `products` alias at the new index
  - type: swap_alias
    alias: products
    from: products_v1
    to: products_v2
# Safe rollback: just swap the alias back. Both indices still exist.
rollback:
  - type: swap_alias
    alias: products
    from: products_v2
    to: products_v1
Why this is safe
- swap_alias is atomic: it removes the old alias target and adds the new one in a single cluster call, so there is no moment where products resolves to nothing.
- Rollback is instant and lossless: because the old index is left in place, the rollback is just the reverse swap. No data is deleted by the migration itself.
- Reads/writes use the alias, never the concrete index name, so clients are unaffected by the swap.
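Once the new index has been verified, the old one can be retired in a later, separate migration. The sketch below reuses the delete_index shape from the migration file format example; the description text is illustrative.

```yaml
description: "Remove products_v1 after the v2 cutover is verified"
operations:
  - type: delete_index
    index: products_v1
# No rollback: section. Deleting an index is destructive and cannot be
# undone by another operation.
```

Keeping the deletion separate means the alias-swap migration stays reversible until this file is applied. With no rollback: section, migrate rollback will refuse to run while this is the last applied migration.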
Guide — Importing an Existing Cluster
If you already have indices in production, you don't have to recreate them as migrations by hand. migrate import snapshots the live cluster into a baseline migration and marks it as already applied — so you can start version-controlling from where you are today.
scaledsearch migrate init
scaledsearch migrate import
This writes migrations/V000__baseline.yaml and records it in the history index as applied (so apply never tries to re-run it).
What gets captured
- Indices, with their mappings and settings
- Aliases, including alias options
- Closed-index state (closed indices are captured as closed)
- Index templates and ingest pipelines
What gets excluded
import deliberately skips engine-owned objects so your baseline is your schema, not the cluster's internal plumbing:
- Leading-dot system indices/templates: universally system-owned in ES and OpenSearch
- Elasticsearch built-ins: APM, Fleet, ML, monitoring, ILM/SLM history, watcher, connectors, behavioral analytics, and @template/@pipeline convention names
- OpenSearch plugin state: .opensearch-*, .opendistro-*, .plugins-*, top_queries-*, .tasks
import refuses to overwrite an existing V000__baseline.yaml.
Guide — Validating Offline
migrate validate does two things, entirely offline (no cluster connection):
- Checks file integrity — that every migration file parses, has the required fields, and uses known operation types.
- Simulates the end-state — it replays your migrations in order against an in-memory model of the cluster, catching ordering and reference problems before you touch a real cluster.
What the simulator catches
- An operation that targets an index which won't exist yet at that point in the sequence
- A reindex whose destination is never created
- Wildcard targets that don't resolve to anything in the simulated state
- An alias swap referencing an index that was already deleted
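As an illustration of the first case, the following migration would be expected to fail validation, assuming no earlier migration creates orders: the put_mapping runs before the create_index it depends on. The index and field names are hypothetical, and the error text is omitted because it is not documented here.

```yaml
description: "Add status field to orders (operations in the wrong order)"
operations:
  # Problem: orders does not exist yet at this point in the sequence
  - type: put_mapping
    index: orders
    body:
      properties:
        status: { type: keyword }
  - type: create_index
    index: orders
    mappings:
      properties:
        id: { type: keyword }
```

Swapping the two operations, or folding status into the create_index mappings, resolves the ordering problem.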
validate is ideal in CI: fast, cluster-free confidence that a pull request's migrations are internally consistent before they're ever applied.
scaledsearch migrate validate # is the whole set consistent?
scaledsearch migrate apply --dry-run # what would the next apply do?
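A minimal CI sketch, assuming GitHub Actions. The workflow file name, job name, and trigger paths are illustrative, and any CI system that can run Node.js 18 or later should work the same way. Because validate is offline, the job needs no cluster credentials.

```yaml
# .github/workflows/validate-migrations.yml  (illustrative path)
name: Validate migrations
on:
  pull_request:
    paths:
      - "migrations/**"
      - ".scaledsearch/**"
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Install the CLI, then check the whole migration set offline
      - run: npm install -g scaledsearch
      - run: scaledsearch migrate validate
```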