Omnigraph Atlas Omnigraph's documentation, bound to its Rust workspace

How a write becomes durable

Follow a mutation from the query executor through in-memory staging, the single manifest-publish CAS fence, and crash recovery — the heart of Omnigraph's atomicity guarantee.

L2 — Multi-dataset coordination via __manifest

Omnigraph is not a single Lance dataset; it is a graph of datasets coordinated through one append-only manifest table.

  • Manifest table: __manifest/ Lance dataset.
  • Layout:
    • nodes/{fnv1a64-hex(type_name)} — one Lance dataset per node type
    • edges/{fnv1a64-hex(edge_type_name)} — one Lance dataset per edge type
    • __manifest/ — the catalog of all sub-tables and their published versions
    • _graph_commits.lance / _graph_commit_actors.lance — the commit graph and its actor map
    • (legacy _graph_runs.lance / _graph_run_actors.lance from pre-v0.4.0 graphs are inert; the run state machine was removed. The internal schema migration sweeps stale __run__* branches on first write-open; the inert dataset bytes themselves remain until a prefix-delete storage primitive lands)
  • Manifest row schema (object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count):
    • object_type: table | table_version | table_tombstone
    • table_key: node:<TypeName> | edge:<EdgeName>
    • table_branch is null for the main lineage and the branch name otherwise
  • Snapshot reconstruction: take the latest visible table_version per (table_key, table_branch), then drop entries covered by a tombstone. A tombstone is a row with object_type = table_tombstone; its own table_version acts as the tombstone version and covers every entry whose table_version is less than or equal to it.
  • Atomic publish: a multi-dataset commit publishes through a single write to __manifest, which makes all of its new sub-table versions visible at once.
  • Row-level CAS on the merge-insert join key: object_id carries an unenforced-primary-key annotation so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same object_id row. Without this annotation, Lance's transparent rebase would admit silent duplicates from racing publishers.
  • Optimistic concurrency control on publish: a publish asserts the manifest's current latest non-tombstoned version for each touched table is exactly what the caller observed; mismatches surface as an ExpectedVersionMismatch manifest conflict naming the table and the expected/actual versions. Concurrent advances surface as a conflict rather than being silently rebased through.
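
The per-type dataset paths in the layout above can be sketched as follows. This is a minimal illustration, not Omnigraph's actual code: it assumes the hash is standard 64-bit FNV-1a over the UTF-8 bytes of the type name, rendered as 16 lowercase zero-padded hex digits. The names fnv1a64, TableKind, and table_path are hypothetical.

```rust
/// 64-bit FNV-1a with the standard offset basis and prime.
/// Assumption: the input is the UTF-8 bytes of the type name.
fn fnv1a64(s: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for byte in s.as_bytes() {
        hash ^= u64::from(*byte); // FNV-1a: xor first, then multiply
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper type: which family of sub-table a path belongs to.
enum TableKind {
    Node,
    Edge,
}

/// Builds `nodes/{hex}` or `edges/{hex}`.
/// Assumption: 16 lowercase, zero-padded hex digits.
fn table_path(kind: TableKind, type_name: &str) -> String {
    let prefix = match kind {
        TableKind::Node => "nodes",
        TableKind::Edge => "edges",
    };
    format!("{}/{:016x}", prefix, fnv1a64(type_name))
}

fn main() {
    println!("{}", table_path(TableKind::Node, "Person"));
    println!("{}", table_path(TableKind::Edge, "Knows"));
}
```

Hashing the type name keeps the directory name fixed-width and free of characters that a storage backend might reject, at the cost of paths that cannot be read back into type names without the manifest.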
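
Snapshot reconstruction and the optimistic publish check described above can be modelled in memory as below. This is a sketch under stated assumptions, not the real implementation: ManifestRow keeps only the columns needed here, tombstones are assumed to match on the same (table_key, table_branch) pair as the entries they cover, branch inheritance is ignored, and the function names and the fields of ExpectedVersionMismatch are hypothetical.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ObjectType {
    Table,
    TableVersion,
    TableTombstone,
}

/// Simplified manifest row; the real schema has more columns.
#[derive(Clone, Debug)]
struct ManifestRow {
    object_type: ObjectType,
    table_key: String,            // "node:<TypeName>" or "edge:<EdgeName>"
    table_version: u64,
    table_branch: Option<String>, // None = main lineage
}

type Lineage = (String, Option<String>);

fn row(object_type: ObjectType, key: &str, version: u64) -> ManifestRow {
    ManifestRow { object_type, table_key: key.to_string(), table_version: version, table_branch: None }
}

/// Highest table_version per (table_key, table_branch) among rows of one kind.
fn max_version_by_lineage(rows: &[ManifestRow], kind: ObjectType) -> HashMap<Lineage, u64> {
    let mut out: HashMap<Lineage, u64> = HashMap::new();
    for r in rows.iter().filter(|r| r.object_type == kind) {
        let slot = out
            .entry((r.table_key.clone(), r.table_branch.clone()))
            .or_insert(r.table_version);
        *slot = (*slot).max(r.table_version);
    }
    out
}

/// Latest version per lineage, minus entries covered by a tombstone
/// (tombstone version >= entry version).
fn reconstruct_snapshot(rows: &[ManifestRow]) -> HashMap<Lineage, u64> {
    let tombstones = max_version_by_lineage(rows, ObjectType::TableTombstone);
    let mut latest = max_version_by_lineage(rows, ObjectType::TableVersion);
    latest.retain(|key, version| tombstones.get(key).map_or(true, |t| *t < *version));
    latest
}

/// Hypothetical shape of the conflict: names the table and both versions.
#[derive(Debug, PartialEq, Eq)]
struct ExpectedVersionMismatch {
    table_key: String,
    expected: Option<u64>,
    actual: Option<u64>,
}

/// Fails on the first touched table whose current version differs from
/// what the caller observed; nothing is rebased.
fn check_expected_versions(
    snapshot: &HashMap<Lineage, u64>,
    branch: Option<&str>,
    expected: &[(&str, Option<u64>)],
) -> Result<(), ExpectedVersionMismatch> {
    for (table_key, want) in expected {
        let key = (table_key.to_string(), branch.map(str::to_string));
        let actual = snapshot.get(&key).copied();
        if actual != *want {
            return Err(ExpectedVersionMismatch {
                table_key: table_key.to_string(),
                expected: *want,
                actual,
            });
        }
    }
    Ok(())
}

fn main() {
    let rows = vec![
        row(ObjectType::TableVersion, "node:Person", 1),
        row(ObjectType::TableVersion, "node:Person", 2),
        row(ObjectType::TableVersion, "edge:Knows", 1),
        row(ObjectType::TableTombstone, "edge:Knows", 1),
    ];
    let snapshot = reconstruct_snapshot(&rows);
    // The caller observed node:Person at version 1, but version 2 is published.
    let result = check_expected_versions(&snapshot, None, &[("node:Person", Some(1))]);
    println!("{:?}", result);
}
```

In this model the check and the manifest write are separate steps. In the system described above the comparison only protects the publish if it is fenced together with the single __manifest write, which is the role the text gives to the row-level CAS on object_id.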