docs/user/reference/constants.md

A cheat-sheet table of named constants and tunables with values and area. Layout/ manifest names (MANIFEST_DIR=__manifest, _graph_commits.lance, the removed-legacy _graph_runs.lance/__run__, __schema_apply_lock__), publish internals (PUBLISHER_RETRY_BUDGET=5, INTERNAL_MANIFEST_SCHEMA_VERSION=3, MERGE_STAGE_BATCH_ROWS=8192), maintenance concurrency, runtime caches (graph index cache 8 LRU, Lance memory pool 1 GB), traversal tuning (OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024, MAX_HOPS=6, CSR_BUILD_FACTOR=1.5), HTTP body limits (1 MB / 32 MB), and embedding defaults. Closes with a prose explanation of the Expand dispatch cost model — indexed per-hop BTREE vs whole-graph CSR chosen on frontier/|E|/source-count/hops plus index coverage. Read when you need a specific constant's value, an environment-variable default, or the precise Expand traversal dispatch tuning knobs.

Constants & Tunables (cheat sheet)

Name	Value	Area
`MANIFEST_DIR`	`__manifest`	manifest layout
Commit graph dir	`_graph_commits.lance`	commit graph
Run registry dir (legacy, removed)	`_graph_runs.lance`	inert post-v0.4.0; bytes remain until a prefix-delete primitive lands
Run branch prefix (legacy, removed)	`__run__`	swept off `__manifest` by the internal schema migration; no longer a reserved name
Schema apply lock	`__schema_apply_lock__`	schema apply
Manifest publisher retry budget	`PUBLISHER_RETRY_BUDGET = 5`	manifest publish
Internal manifest schema version	`INTERNAL_MANIFEST_SCHEMA_VERSION = 3`	manifest migrations
Merge stage batch	`MERGE_STAGE_BATCH_ROWS = 8192`	merge execution
Maintenance concurrency	`OMNIGRAPH_MAINTENANCE_CONCURRENCY=8`	optimize/cleanup
Lance blob compaction support	`LANCE_SUPPORTS_BLOB_COMPACTION = false`	optimize
Graph index cache size	`8` (LRU)	runtime cache
Expand indexed-path frontier ceiling	`OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024`	traversal
Expand indexed-path hop ceiling	`OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6`	traversal
Expand CSR-build cost factor	`CSR_BUILD_FACTOR = 1.5`	traversal
Expand mode override	`OMNIGRAPH_TRAVERSAL_MODE` (`indexed`\|`csr`; unset = cost-based auto)	traversal
Default body limit	`1 MB`	HTTP server
Ingest body limit	`32 MB`	HTTP server
Default embed provider/model	`openai-compatible` / `openai/text-embedding-3-large`	engine embedding
OpenAI-direct embed model	`text-embedding-3-large`	engine embedding
Gemini-direct embed model	`gemini-embedding-2`	engine embedding
Embed deadline	`OMNIGRAPH_EMBED_DEADLINE_MS=60000`	engine embedding
Embed timeout	`OMNIGRAPH_EMBED_TIMEOUT_MS=30000`	engine embedding
Embed retries	`OMNIGRAPH_EMBED_RETRY_ATTEMPTS=4`	engine embedding
Embed retry backoff	`OMNIGRAPH_EMBED_RETRY_BACKOFF_MS=200`	engine embedding
LANCE memory pool default	`1 GB` (raised in v0.3.0)	runtime

Expand traversal dispatch. With OMNIGRAPH_TRAVERSAL_MODE unset, the engine chooses the indexed (per-hop BTREE) vs CSR (whole-graph in-memory) path with a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus the index-coverage signal: the indexed path is preferred when its frontier-relative work beats building the CSR (≈ when hops × frontier is a small fraction of the source-vertex set), and CSR is preferred for dense/deep traversals or when the BTREE coverage is degraded and a full scan would be paid per hop. The two ceilings bound the initial dispatch frontier/hops (beyond them CSR is always used); they are not a hard per-hop bound — the cost model estimates total indexed work as ~hops × frontier × fanout, so dense fan-out is priced toward CSR rather than capped mid-traversal. The override flag forces a path (the auto result is identical either way; only the path differs).