In @turing-machine-js/machine I built something I’d wanted for a long time: a state machine can be serialized to a Mermaid diagram and rebuilt. toGraph turns a machine into a graph, toMermaid into diagram text; fromMermaid and fromGraph walk the path back. Diagram and machine became interchangeable: you can draw one, edit it, save it — and restore a working machine.
And right away I wanted a regression test: run a machine through the round-trip and check that what comes out is the same machine. That word “same” is where it all started.
Byte-for-byte? No: node ids are runtime instance ids, and fromGraph assigns them fresh. The same structure? Sometimes yes, sometimes no — depending on how the composition is built. Behaves the same? That’s a different check entirely, one that never touches the graph.
It turned out “the same” isn’t one question but three, and each has its own API to answer it. These three answers I added to the library over years, one at a time, and for a long time never spelled out to myself that they’re three separate boundaries, not one stretchy “equals”. Object, graph, behaviour — three guises of one machine, and on a flat machine they all give the same answer. Composition is where the answers diverge. What follows: what the engine ships for this, where the boundaries run, and how v7 packed two guises of the three into graph form, leaving the third to a separate function, equivalentOn.
What does the engine actually ship?
Before pulling the three guises apart, an inventory.
There are three layers in the engine that the question “is it the same machine?” lands on:
Runtime objects. State is a node in the machine’s graph. It has a name, transitions (for each possible tape symbol — the next state, what to write, where to move), and identity: two different States are two different objects, no matter how similar their transition tables look.
The graph as serialization. State.toGraph(state, tapeBlock) walks the reachable states and returns a Graph — plain data: nodes with names and transitions, indexed by numeric ids. State.fromGraph(graph) takes that data back and returns {start, tapeBlock, states}. On top of the graph there’s toMermaid(graph) and fromMermaid(text), a mapping into Mermaid flowchart syntax and back. The parser is strict: it accepts only what the emitter produces.
Behavioural equivalence. equivalentOn(reference, candidate, cases) in utilities/equivalence.ts — runs both machines on a set of test inputs, compares outputs and per-step snapshots. A separate module, knows nothing about the graph.
Three layers, three “is it the same?” questions. The rest of this post is about how they diverge.
Flat machine: the boring case
On a flat machine — no composition — all three guises coincide.
Object-wise: one State equals another only if it’s the same instance (===). That’s the basic JavaScript semantics; the engine doesn’t add anything.
Graph-wise: take toGraph, run it through JSON and back, rebuild via fromGraph — you get a twin with the same names and transitions. The match isn’t byte-for-byte but up to node-id renaming: node ids are runtime State ids, reassigned on every fromGraph rebuild. There’s a regression test for this in the repo (turing-machine-js#139). It’s not “looks the same” — it’s the same structure.
Behaviour-wise: equivalentOn runs both machines on the test inputs and confirms “yes, the same”.
The boring case. All three guises coincide because there’s no place where they could diverge. Composition is that place.
Composition: where it gets interesting
Composition in this engine looks like this:
const wrapper = bareState.withOverriddenHaltState(overrideState);
wrapper is a new State. It delegates transition lookup to its bareState, but substitutes the halt-exit: instead of the machine halting, it transitions into overrideState.
Conceptually a wrapper is “run the bare as a subroutine, on exit jump to overrideState”. The composition’s name follows a rule: for a bare named A and an override named B, the wrapper gets the name A(B). The how and why of that exact name is its own article, “Seven arrows, 429 trees” (1).
Here is where the three guises start to diverge.
Object-wise: wrapper and bare are two different States. wrapper !== bareState. That’s the consequence of withOverriddenHaltState returning a new instance.
Graph-wise: the wrapper has no transitions of its own. They live on the bare. If you visualise the graph, the wrapper is a delegation node, not a transition node.
Behaviour-wise: wrapper and bare are not equivalent. They have different halt-exits. On any input the trace matches step for step up to the bare’s halt-exit — and diverges exactly at that moment.
A wrapper exists as a separate object precisely so it can hold that substitution. Next question: how does the graph encode it, so that fromGraph can rebuild it?
How composition lands in the graph
I wanted the graph to carry the composition itself — that kills two birds. First, fromGraph rebuilds the machine straight from the structure instead of guessing the composition from a runtime hint. Second, the graph stays readable: the composition shows up on the diagram itself rather than dissolving into a flat list of transitions. In v7 (turing-machine-js#180) a wrapper emits two nodes:
- Wrapper node — a call-site with no transitions, shape
[[wrapper-name]], itsbareStateIdfield pointing at the bare. - Bare node — a regular node with transitions, shape
[bareState-name], placed inside a Mermaid subgraph “callable subtree of NAME”.
In full, for scanToX(eraseHere):
That’s enough for fromGraph to rebuild the wrapper from the structure: here’s the call-site, here’s the bare it points to. The graph encodes the composition rather than hinting at it.
Under the hood (skippable). The halt-exit of each frame is a separate
isHaltMarkernode with id= -frameId; at runtime it collapses into the singletonhaltState, but on the graph it lives separately, otherwise the whole machine “absorbs” into one halt node. When a single bare is reachable from several wrappers, frames are computed by union-find over bare-reachability (the subgraph is labelled “callable subtree of A” or “callable scope: A ∪ B”). The start of the machine is marked by a sentinelidle([idle])with a dottedenterarrow — an anchor the parser reads to find the start state. Transition labels follow a small format[reads] → [writes]/[moves]— one cell per tape.
Is it the same machine after a round-trip?
For a simple composition — one wrapper, one bare, a tree that doesn’t overlap with others — the round-trip is reproducible up to id renaming. The graph keeps nodes, names, transitions; fromGraph reassembles the same machine. The composite name scanToX(eraseHere) does ride along in the graph — it’s the wrapper node’s label — but fromGraph doesn’t read it: it recomputes the composite from the bare and the override, through the single place where mutating a name is permitted, the Symbol accessor STATE_INTERNAL (turing-machine-js#180). So names never accumulate: on every rebuild the name is computed from the structure, not stacked on top of the previous one. That’s exactly what the regression test on simple wrappers checks.
Reproducibility breaks on a shared bare — when one and the same State serves as the bare of several wrappers. Not a contrivance: in library-binary-numbers the minusOne algorithm is built so that invertNumber ends up as the bare of two wrappers, an outer and an inner one (the full graph). One object, two call-sites.
This doesn’t break the structure: toGraph walks the reachable states exactly once, keyed by instance id, so a shared bare is one node, and the two wrappers just point at it with two arrows; fromGraph rebuilds exactly one shared instance for both wrappers. The shared bare itself survives the round-trip. What doesn’t is the reproducibility of the serialization: a second rebuild can give a graph that won’t line up with the original even if you ignore the ids.
The same shape in miniature, on scanToX — one bare under two wrappers:
Under the hood (skippable). Node ids are runtime instance ids;
fromGraphassigns them fresh. “Reproducible” here means: serialize, rebuild, serialize again, and compare the two graphs, ignoring the id values themselves. For a simple wrapper a secondtoGraphgives the same layout up to renaming. For a shared bare, say one shared by two wrappers, the ids can come out in a different order after the rebuild, and nodes are emitted sorted by id — the layout shifts, and ignoring the ids no longer helps.
And here the graph stops being the answer to lean on. equivalentOn, though, gives allAgree: true on both machines: by behaviour — the same. The boundary is simple: reproducibility of the serialization only for simple wrappers, behavioural equivalence always.
Where equivalentOn lives
equivalentOn(reference, candidate, cases) lives in utilities/equivalence.ts. reference and candidate are Runnables ({ state, getTapeBlock }), cases is an array of test inputs; it runs both machines, compares outputs and per-step snapshots, and returns a report { results, allAgree }: the per-step results plus an overall yes/no in allAgree.
What matters is that it doesn’t live in utilities/stateGraph.ts, where toGraph/fromGraph are. These are different modules, different layers of the library. equivalentOn knows nothing about the graph; it doesn’t care whether a wrapper is emitted as a call-site or as a single node, what frames its transitions live in. It only looks at observable behaviour.
That’s its role. When the reproducibility of toGraph/fromGraph is shaky (a bare shared by two wrappers), equivalentOn answers the question the consumer actually has: “after the rebuild — is this the same machine by behaviour?”. The graph is a serialization tool; behaviour is the contract.
The trap is easy to fall into: checking sameness with === on State instances. After a rebuild that gives false — fromGraph always makes fresh instances — even though the behaviour is identical. A false regression out of nowhere. equivalentOn exists precisely so that doesn’t happen — but only if you know it’s there.
Tags: data rides with the graph, identity stays local
A small but symptomatic v7 addition — tags (turing-machine-js#186). State got an API: state.tag('hot', 'subroutine-entry'), state.tags, state.untag(...). Tags are string labels on an instance. They’re for tooling on top of the machine: toMermaid colours and groups nodes by them via classDef, and consumers like post-machine-js use them to mark subroutine entry points. They don’t affect the machine’s semantics.
What’s important about them for the three guises:
- Behaviourally — they don’t count.
equivalentOndoesn’t look at tags. A machine with tags and a machine without them, on the same transitions, are behaviourally the same. - Graph-wise — they survive round-trip.
GraphNode.tagsis serialised;toMermaidemits tags in the node label via<br>and asclassDeffor hash-based colour grouping in the rendering;fromMermaidreads from the label (<br>is the source of truth,classDefis decorative). - Object-wise — they’re local to the instance. Tags live on a specific
State, not on the bare. Two wrappers share one bare, but each has its own set of tags. Tag one, and both the other wrapper and the bare itself stay untagged.
A clean boundary: data describing a graph node rides with the graph; instance identity stays local to a specific construction. Tags land in the middle: behaviour doesn’t see them (equivalentOn is blind to them), but they outlast === identity — that resets on every rebuild, while tags ride with the graph and survive the round-trip.
What I came away with
I didn’t design the three guises all at once — they accumulated one at a time, and only in hindsight did it become clear that they’re different projections of a single entity, not one stretchy “equals”.
The object guise JavaScript gives for free: === and instanceof just work. The behavioural one I got almost for free — equivalentOn is just running two machines on tests. The graph guise I had to design: without it fromGraph couldn’t restore anything from a serialized machine, and it’s precisely for its sake that v7 packed composition into the graph data.
Two things dawned on me late. Bijective naming of compositions isn’t decoration but a precondition: while names are ambiguous, the graph loses one of the ways of checking for “same” before fromGraph even runs. The article (1) is about this same ambiguity. And behavioural equivalence has to live apart from the graph: equivalentOn knows nothing about toGraph, and that’s right — a consumer more often needs the behavioural answer than a byte-for-byte round-trip.
Tags, too, suggested a simple check: what survives the round-trip describes a graph node, what doesn’t describes a specific instance. When I add a field, I look at which guise it lands in — that’s what decides in what sense it makes two machines “the same.”
Code: turing-machine-js (engine with v7 graph utilities). The interactive demo at demo.machines.mellonis.ru draws machines via toMermaid and highlights the current transition.