Blueprint vs. Run

One of Datum's core invariants: building a graph never starts execution.

The invariant

Calls to Source::from_iter, .map(...), .filter(...), .via(...), and .to(sink) are pure data-structure operations. They assemble an immutable description of the computation, called a blueprint, without spawning threads, allocating per-element buffers, or advancing any iterator.

rust
use datum::{Sink, Source};

// Building the Source chain is pure data structure work.
// No thread is spawned, no iterator is advanced.
let _blueprint = Source::from_iter(0_u64..1_000_000)
    .map(|x| x + 1)
    .filter(|x| x % 2 == 0)
    .to(Sink::ignore());
// _blueprint holds a closed graph; execution begins only at .run()

The one-million-element range above is described but not iterated. The _blueprint value holds a closed RunnableGraph — all the wiring is done, but nothing runs until .run().
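To make the "described but not iterated" claim observable, here is a minimal sketch using only the standard library. ToyBlueprint, build, and the pulled counter are hypothetical names invented for this illustration; they model the idea and are not Datum's types or its implementation.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy stand-in for a closed graph (hypothetical; not Datum's RunnableGraph).
/// It stores a recipe for the computation rather than a running stream.
struct ToyBlueprint {
    recipe: Box<dyn Fn() -> Vec<u64>>,
}

impl ToyBlueprint {
    /// The only method that executes the recipe.
    fn run(&self) -> Vec<u64> {
        (self.recipe)()
    }
}

/// Builds a blueprint whose source increments `pulled` for every element it emits.
fn build(pulled: Rc<Cell<u32>>) -> ToyBlueprint {
    ToyBlueprint {
        recipe: Box::new(move || {
            (0_u64..10)
                .inspect(|_| pulled.set(pulled.get() + 1))
                .map(|x| x + 1)
                .filter(|x| x % 2 == 0)
                .collect::<Vec<u64>>()
        }),
    }
}

fn main() {
    let pulled = Rc::new(Cell::new(0));
    let blueprint = build(Rc::clone(&pulled));
    assert_eq!(pulled.get(), 0); // building advanced nothing

    let evens = blueprint.run();
    assert_eq!(pulled.get(), 10); // the run pulled every element
    println!("{:?}", evens);
}
```

The counter stays at zero until run is called, which is the property the invariant describes.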

Reuse

Because blueprints are immutable, the same RunnableGraph can be materialized multiple times. Each call to .run() starts a completely independent stream execution:

rust
use datum::{Keep, Sink, Source};

// `to_mat` builds a RunnableGraph — no execution yet.
let blueprint = Source::from_iter(1_u64..=5)
    .map(|x| x * x)
    .to_mat(Sink::collect(), Keep::right);

// Each call to `run()` starts a completely independent execution.
let run1: Vec<u64> = blueprint.run().unwrap().wait().unwrap();
let run2: Vec<u64> = blueprint.run().unwrap().wait().unwrap();

run1 and run2 are independent computations. They do not share state. The blueprint is not consumed — you can call .run() again after both complete.
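The "no shared state" property can be sketched with a toy model in plain Rust. The ToyBlueprint type below is hypothetical and only illustrates the idea: run takes &self and creates its iterator and accumulator afresh on each call, so a second run cannot observe anything left behind by the first.

```rust
/// Toy blueprint (hypothetical; not Datum's RunnableGraph): squares an
/// inclusive range and emits the running total of the squares.
struct ToyBlueprint {
    first: u64,
    last: u64,
}

impl ToyBlueprint {
    /// Takes `&self`, so running never consumes the blueprint.
    /// The iterator and the accumulator are created fresh on every call.
    fn run(&self) -> Vec<u64> {
        let mut running_total = 0;
        (self.first..=self.last)
            .map(|x| x * x)
            .map(|square| {
                running_total += square;
                running_total
            })
            .collect()
    }
}

fn main() {
    let blueprint = ToyBlueprint { first: 1, last: 5 };
    let run1 = blueprint.run();
    let run2 = blueprint.run();
    // If the runs shared the accumulator, run2 would continue from 55.
    assert_eq!(run1, vec![1, 5, 14, 30, 55]);
    assert_eq!(run1, run2);
}
```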

Why this matters

The blueprint/run separation:

  • Enables testing: you can build a graph in one place and run it in another, with no ordering constraint on when operators are constructed.
  • Enables reuse: a single well-tested blueprint can be run many times (e.g., a standard processing pipeline instantiated per incoming request).
  • Makes construction free of side effects: no resource is acquired, no timer is started, and no thread is claimed until .run() is called. Blueprints are therefore safe to build in any order and on any thread.
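As a rough illustration of the per-request reuse case, the sketch below builds one toy pipeline and runs it from several worker threads. ToyPipeline and handle_requests are hypothetical names, and unlike a Datum blueprint this toy takes its input at run time; it only shows why a side-effect-free, immutable description is easy to share.

```rust
use std::sync::Arc;
use std::thread;

/// Toy pipeline description (hypothetical; not a Datum type): an ordered
/// list of pure stages, with nothing acquired at construction time.
struct ToyPipeline {
    stages: Vec<fn(u64) -> u64>,
}

impl ToyPipeline {
    fn new() -> Self {
        ToyPipeline { stages: Vec::new() }
    }

    /// Returns a description with one more stage; nothing executes here.
    fn map(mut self, stage: fn(u64) -> u64) -> Self {
        self.stages.push(stage);
        self
    }

    /// Execution happens only here, on whichever thread calls it.
    fn run(&self, input: &[u64]) -> Vec<u64> {
        input
            .iter()
            .map(|&x| self.stages.iter().fold(x, |acc, stage| stage(acc)))
            .collect()
    }
}

/// Builds the pipeline once, then runs it once per request on its own thread.
fn handle_requests(requests: Vec<Vec<u64>>) -> Vec<Vec<u64>> {
    let pipeline = Arc::new(ToyPipeline::new().map(|x| x + 1).map(|x| x * 2));
    let handles: Vec<_> = requests
        .into_iter()
        .map(|request| {
            let pipeline = Arc::clone(&pipeline);
            thread::spawn(move || pipeline.run(&request))
        })
        .collect();
    // Join in submission order so results line up with requests.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let results = handle_requests(vec![vec![1, 2], vec![10]]);
    assert_eq!(results, vec![vec![4, 6], vec![22]]);
}
```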

The rules

  1. Operator methods (.map, .filter, .via, .to, .to_mat, etc.) are pure and return new blueprints. They never execute.
  2. The run methods (.run(), .run_with(...), GraphDsl::run_*) are the only points of execution.
  3. A RunnableGraph (the result of .to(...) or .to_mat(...)) can be run multiple times.
  4. A partially-constructed chain (a Source or Flow not yet closed with a Sink) cannot be run — the type system prevents it.
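Rule 4 is an instance of the type-state pattern, which can be sketched with two toy types: an open chain that has operators but no run method, and a closed chain that is the only type exposing run. ToySource, ToyRunnable, and close are hypothetical names chosen for this illustration; Datum's actual types and signatures may differ.

```rust
use std::iter::Map;

/// Open chain (hypothetical toy type): offers operators, but no `run`.
struct ToySource<I> {
    iter: I,
}

/// Closed chain (hypothetical toy type): the only type here with `run`.
struct ToyRunnable<I> {
    iter: I,
}

impl<I: Iterator + Clone> ToySource<I> {
    fn new(iter: I) -> Self {
        ToySource { iter }
    }

    /// Returns a new open chain; std adapters are lazy, so nothing is pulled.
    fn map<B, F>(self, f: F) -> ToySource<Map<I, F>>
    where
        F: FnMut(I::Item) -> B + Clone,
    {
        ToySource { iter: self.iter.map(f) }
    }

    /// Closing the chain is what produces a runnable value.
    fn close(self) -> ToyRunnable<I> {
        ToyRunnable { iter: self.iter }
    }
}

impl<I: Iterator + Clone> ToyRunnable<I> {
    /// Clones the stored iterator so the blueprint itself is never advanced.
    fn run(&self) -> Vec<I::Item> {
        self.iter.clone().collect()
    }
}

fn main() {
    let open = ToySource::new(0_u64..3).map(|x| x + 1);
    // open.run(); // would not compile: `ToySource` has no method named `run`
    let closed = open.close();
    assert_eq!(closed.run(), vec![1, 2, 3]);
    assert_eq!(closed.run(), vec![1, 2, 3]); // still runnable a second time
}
```

Because run is defined only on the closed type, attempting to execute an open chain is a compile error rather than a runtime failure.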

Akka Streams parity

This invariant mirrors Akka Streams. In Akka, RunnableGraph.run(materializer) is the execution boundary. In Datum, RunnableGraph.run() fills the same role, with the materializer implicit to the graph. The Materializer type in Datum is an alias for Runtime: the same object manages both the thread pool and the act of materializing graphs.