DAG Node Equivalence Checker

An OCaml prototype that determines whether two nodes in an append-only DAG are equivalent — meaning the subgraphs reachable from them (both forward via successors and backward via predecessors) are isomorphic, preserving node values.

Node values are polymorphic — the DAG is parameterised over a VALUE module via a functor.

Prerequisites

OCaml >= 5.0
dune >= 3.0
Graphviz (for PNG rendering; brew install graphviz on macOS)

Quick start

dune build
dune exec test/test_dag.exe       # correctness tests
dune exec bench/bench_dag.exe     # benchmarks

How it works

Data model

The DAG is append-only. Each node has:

A value of type V.t (chosen by the caller via the VALUE module).
A list of successors (edges to existing, lower-id nodes), fixed at creation.
A list of predecessors (accumulated as new nodes point to it).

Two Merkle hashes per node

Every node carries two hashes:

Hash	Captures	Computed when
`hash_fwd`	Value + forward-reachable subgraph (via successors)	Once, at creation
`hash_bwd`	Value + backward-reachable subgraph (via predecessors)	Recomputed on every insertion that affects the node

Both are computed the same way — H(V.hash(value), sort(neighbor_hashes)) — just over different neighbor sets (successors for forward, predecessors for backward).

Equivalence query

equiv(u, v) = (hash_fwd(u) = hash_fwd(v)) && (hash_bwd(u) = hash_bwd(v))

This is O(1).

Deduplication

A node is only inserted if no existing node has the same (hash_fwd, hash_bwd) pair with matching value and successor ids. This means structurally identical nodes are never duplicated, which eliminates the diamond-vs-tree ambiguity that would otherwise make simple Merkle hashing unsound.

Backward hash propagation

When a new node v is added with edges v -> w1, ..., wk:

v becomes a new predecessor of each wi.
hash_bwd(wi) must be recomputed (its predecessor set changed).
Changing hash_bwd(wi) affects any node that has wi as a predecessor — those are wi's successors (lower-id nodes that wi points to).
Propagation follows successor edges (forward in the DAG, toward leaves).
Nodes are processed in descending id order, which is a valid topological order (predecessors always have higher ids, so they're processed first).

   v  (new)            Propagation direction
   |                          |
   v                          v
  w1 ──> x ──> y ──> z      (toward leaves)

Each node is visited at most once per insertion. When we process node n, all its predecessors (higher ids) are already finalized, so the hash is computed from stable inputs. Propagation stops early if recomputation yields the same hash.

Hash function

63-bit FNV-1a. The neighbor hash list is sorted (multiset semantics) and the list length is mixed in so that [], [0], and [0; 0] all hash differently. Node values are hashed via the user-supplied V.hash function (Hashtbl.hash works for any type with structural equality).

Project layout

lib/
  hash.mli / hash.ml       Hash combiner (Hash.hash_node)
  dag.mli  / dag.ml        Core DAG functor, VALUE sig, Int_value default
  dag_viz.mli / dag_viz.ml Text table and Graphviz output (also a functor)
test/
  test_dag.ml              Correctness tests (assert-based)
bench/
  bench_dag.ml             Insertion throughput, propagation cost, query latency

API overview

The VALUE signature

To use the DAG with a custom value type, provide a module matching:

module type VALUE = sig
  type t
  val hash : t -> int       (* Hashtbl.hash works for structural equality *)
  val equal : t -> t -> bool
  val to_string : t -> string
end

Dag.Int_value is provided as a convenience for integer-valued DAGs.

Instantiation

open Subgraph_iso

(* Integer DAG — using the built-in Int_value *)
module G = Dag.Make (Dag.Int_value)
module Viz = Dag_viz.Make (G)

(* Custom value type *)
module My_value = struct
  type t = { label : string; weight : float }
  let hash v = Hashtbl.hash v
  let equal a b = a.label = b.label && a.weight = b.weight
  let to_string v = Printf.sprintf "%s(%.1f)" v.label v.weight
end

module My_dag = Dag.Make (My_value)
module My_viz = Dag_viz.Make (My_dag)

Building a DAG

let g = G.create () in

(* Leaves first, then nodes that point to them *)
let c = G.add_node g ~value:3 ~successors:[] in
let b = G.add_node g ~value:2 ~successors:[c] in
let a = G.add_node g ~value:1 ~successors:[b] in
(* a -> b -> c *)

Querying equivalence

(* Build a second identical chain *)
let f = G.add_node g ~value:3 ~successors:[] in
let e = G.add_node g ~value:2 ~successors:[f] in
let d = G.add_node g ~value:1 ~successors:[e] in

(* Corresponding nodes are equivalent *)
assert (G.equiv g a d);   (* roots *)
assert (G.equiv g b e);   (* middles *)
assert (G.equiv g c f);   (* leaves *)

Inspecting a node

G.value g a              (* 1 *)
G.successors g a         (* [b] *)
G.predecessors g a       (* [] *)
G.hash_fwd g a           (* some int *)
G.hash_bwd g a           (* some int *)

Visualisation

(* Text table to stdout *)
print_string (Viz.to_string g);

(* Graphviz DOT source *)
print_string (Viz.to_dot g);

(* Render directly to PNG (requires graphviz) *)
Viz.render_png g "my_dag.png";

Performance characteristics

Measured on Apple Silicon, OCaml 5.4.0, native code:

Operation	Complexity	10k nodes
`add_node` (wide fan-out)	O(1) amortised	9 ms total
`add_node` (chain, worst case)	O(n) per insert, O(n^2) total	13 s total
Single-insert propagation (chain)	O(n)	2.7 ms
`equiv` query	O(1)	~100 ns

The quadratic chain case is inherent: each insertion into a chain propagates backward hashes through all existing nodes. For the target scale (thousands of nodes, mixed topologies), this is acceptable.

Extending

Wider hashes: swap Hash.hash_node to return int * int (128-bit) for near-zero collision probability.
Lazy propagation: defer backward hash recomputation until equiv is queried (trade insertion speed for query latency).
Persistence: the DAG is currently mutable. Wrap it in a functional interface if you need snapshot/rollback.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
bench		bench
lib		lib
test		test
.gitignore		.gitignore
README.md		README.md
dune-project		dune-project

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DAG Node Equivalence Checker

Prerequisites

Quick start

How it works

Data model

Two Merkle hashes per node

Equivalence query

Deduplication

Backward hash propagation

Hash function

Project layout

API overview

The VALUE signature

Instantiation

Building a DAG

Querying equivalence

Inspecting a node

Visualisation

Performance characteristics

Extending

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DAG Node Equivalence Checker

Prerequisites

Quick start

How it works

Data model

Two Merkle hashes per node

Equivalence query

Deduplication

Backward hash propagation

Hash function

Project layout

API overview

The VALUE signature

Instantiation

Building a DAG

Querying equivalence

Inspecting a node

Visualisation

Performance characteristics

Extending

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages