Datum🪼 Just "Stream" anything

Rust stream-processing library mirroring Akka/Pekko Streams Typed, built on Ractor actors.

What Datum is

Datum is a Rust stream-processing library whose API mirrors Akka/Pekko Streams Typed. The same Source → Flow → Sink vocabulary, the same GraphDSL junction model, the same blueprint-vs-materialization separation — written in Rust, grounded in Ractor actors, and running on Tokio.

If you have worked with Akka Streams, the Datum API will feel familiar. If you haven't, the Concepts section explains the model from first principles.
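
As a rough illustration of the blueprint-versus-materialization separation, here is a toy sketch in plain Rust. It does not use Datum: the Blueprint type and its source, via, and run_sum methods are hypothetical names invented for this example. It only shows the idea that a pipeline description is an inert, reusable value, and that nothing runs until it is materialized.

```rust
/// Hypothetical toy type, NOT part of Datum: an inert description of a
/// pipeline over the integers in `start..end`.
struct Blueprint {
    start: u32,
    end: u32,
    /// Each stage either transforms an element (`Some`) or drops it (`None`).
    stages: Vec<fn(u32) -> Option<u32>>,
}

impl Blueprint {
    /// Describe a source; no elements are produced yet.
    fn source(start: u32, end: u32) -> Self {
        Blueprint { start, end, stages: Vec::new() }
    }

    /// Append a stage to the description; still nothing runs.
    fn via(mut self, stage: fn(u32) -> Option<u32>) -> Self {
        self.stages.push(stage);
        self
    }

    /// "Materialize" the description: push every element through all stages
    /// and sum whatever survives. Takes `&self`, so the blueprint is reusable.
    fn run_sum(&self) -> u32 {
        (self.start..self.end)
            .filter_map(|x| self.stages.iter().try_fold(x, |acc, stage| stage(acc)))
            .sum()
    }
}

fn main() {
    // Building the blueprint performs no work.
    let blueprint = Blueprint::source(1, 6) // 1, 2, 3, 4, 5
        .via(|x| Some(x * 2)) // 2, 4, 6, 8, 10
        .via(|x| if x % 4 == 0 { Some(x) } else { None }); // 4, 8

    // Each call is a separate run of the same description.
    let first = blueprint.run_sum();
    let second = blueprint.run_sum();
    println!("{first} {second}"); // prints "12 12"
}
```

Because run_sum borrows the blueprint rather than consuming it, the same description can be run any number of times, and each run is independent of the others.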

What it is not

  • Not a Reactive Streams bridge. Datum does not implement Publisher / Subscriber interop. The idiomatic Rust integration surfaces are Tokio futures and Ractor actors.
  • Not a universal parity clone. Datum targets the Akka Streams API shape and behavior, not every internal implementation detail. Where Rust ownership and naming conventions differ from Scala, Datum follows Rust (e.g. filter_map instead of Akka's collect).
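
To make the filter_map naming point concrete: Akka's collect applies a partial function, keeping and transforming only the elements it is defined for. In Rust, collect already means "gather an iterator into a collection", and the standard library calls the filter-and-transform operation filter_map. The sketch below uses only the standard library's Iterator::filter_map; it is an analogy for the naming choice, and the exact signature of Datum's own operator is not shown here.

```rust
/// Keep only the inputs that parse as integers, converting them in the same pass.
/// The function name `parse_valid` is made up for this example.
fn parse_valid(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        // `filter_map` drops elements mapped to `None` and unwraps the `Some` values.
        .filter_map(|s| s.parse::<i32>().ok())
        // In Rust, `collect` gathers the iterator into a collection.
        .collect()
}

fn main() {
    let parsed = parse_valid(&["1", "x", "3"]);
    println!("{parsed:?}"); // prints "[1, 3]"
}
```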

Current status: v0.2.0

v0.2.0 ships:

  • Linear DSL — Source, Flow, Sink, RunnableGraph, Materializer with the full operator set: element ops, async ops, bounds, aggregation, error handling, supervision, and restart/retry.
  • Graph layer — GraphDSL, typed ports (Inlet/Outlet), all core junctions (Broadcast, Balance, Merge, Zip, MergePreferred, MergePrioritized), and a fused executor with a typed-linear fast path (16–46× faster than warmed Akka on sync chains).
  • Actor interop — ActorFlow::ask, ActorSource, ActorSink, and Ractor-backed publish/subscribe.
  • Context-aware streams — SourceWithContext, FlowWithContext.
  • Dynamic streams — UniqueKillSwitch, SharedKillSwitch, MergeHub, BroadcastHub, PartitionHub.
  • Streaming I/O — file I/O, TCP, framing, compression.
  • Substreams — group_by, split_when, split_after, flat_map_concat, flat_map_merge.

See the benchmark result tables and the roadmap for honest per-path performance numbers and the forward plan.

Performance posture

Datum is benchmarked head-to-head against warmed Akka/Pekko Streams across four areas — Source/Flow, materialization, graph/junctions, and actor ask. The harness adds a CPU column deliberately: some wins come from busy-spinning while Akka parks — a real cost that wall-clock numbers hide.
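
The difference between busy-spinning and parking can be sketched with the standard library alone. This is a generic illustration of the two waiting strategies, not a description of how Datum or Akka wait internally; the function names wait_spinning, wait_parked, and run_with are invented for this example. Both strategies finish at about the same wall-clock time, but the spinning thread keeps a core busy the whole time, which is the cost a CPU column exposes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Busy-spin: poll the flag in a tight loop. Low wake-up latency, but the
/// thread burns CPU for as long as it waits.
fn wait_spinning(ready: &AtomicBool) {
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
}

/// Park: let the OS deschedule the thread until it is unparked. Uses almost no
/// CPU while waiting. The loop guards against spurious wake-ups.
fn wait_parked(ready: &AtomicBool) {
    while !ready.load(Ordering::Acquire) {
        thread::park();
    }
}

/// Start a consumer that waits with the given strategy, signal it after a
/// short delay, and return whether it observed the flag as set.
fn run_with(wait: fn(&AtomicBool)) -> bool {
    let ready = Arc::new(AtomicBool::new(false));
    let consumer = {
        let ready = Arc::clone(&ready);
        thread::spawn(move || {
            wait(&ready);
            ready.load(Ordering::Acquire)
        })
    };

    thread::sleep(Duration::from_millis(20)); // simulated producer delay
    ready.store(true, Ordering::Release);
    consumer.thread().unpark(); // wakes a parked thread; harmless for a spinning one
    consumer.join().unwrap()
}

fn main() {
    println!("spinning saw flag: {}", run_with(wait_spinning));
    println!("parked saw flag: {}", run_with(wait_parked));
}
```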

Highlights from v0.2.0:

  • Sync operator chains: 5–8× faster than warmed Akka (typed-linear fused path, no boxing).
  • Graph typed-linear path: 16–46× faster (monomorphized, allocation-free hot path).
  • sink_terminal_head: 56× faster (synchronous micro-source inline drain).
  • Ordered actor ask: roughly parity at parallelism 1 (noisy run to run, ~0.8–1.3×), ~2–3× faster at parallelism 2–4, and roughly parity at parallelism 16, at ~2× the CPU. See roadmap/benchmarks/actor-ask.md.

Not every path wins. The erased-executor fallback (used for graphs the typed path can't specialize) runs at roughly 0.5–0.7× Akka's speed, so it is slower than Akka. See roadmap/benchmarks/ for the full per-scenario table.