Three majors, two mistakes: designing a pause API for a Turing-machine interpreter

Two design mistakes in a Turing-machine pause API, and the three major releases it took to get them out.

Screenshot of demo.machines.mellonis.ru running a Turing machine

I spent the last two weeks shipping four breaking major versions of @turing-machine-js/machine — v3, v4, v5, v6 — and the most interesting part wasn’t any single feature. It was watching the same API surface (a pause/breakpoint hook on the run loop) get redesigned twice in three versions, each time because the previous shape exposed something it shouldn’t have.

This post is the post-mortem. If you’re designing pause/step/breakpoint APIs for a generator-loop interpreter, scheduler, or DSL runtime, the mistakes I made are easy to make and worth seeing in someone else’s code first.

The setup

The engine runs a Turing machine. Internally it's a generator: each yield corresponds to one step (one transition firing on one tape symbol). The driver loop looks roughly like this:

async function run({ initialState, onStep }) {
  for (const m of runStepByStep({ initialState })) {
    await onStep?.(m);
    if (m.state.isHalt) return;
  }
}

MachineState (m) is a snapshot per iteration: the state about to execute, the current and next symbols, the moves the heads will make. A consumer (a logger, a UI, a test) can hang off onStep to observe.
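
To make the snapshot shape concrete, here is a toy stand-in for the generator. It is a sketch under assumptions: the rule-table format, the "halt" sentinel, and the field names currentSymbol, nextSymbol, and move are invented for illustration and are not the library's actual MachineState fields.

```javascript
// Toy stand-in for the engine's generator. The rule-table format and the
// snapshot field names are assumptions made for this sketch.
function* runStepByStep({ initialState, table, tape }) {
  let state = initialState;
  let head = 0;
  while (true) {
    const isHalt = state === "halt";
    const symbol = tape[head] ?? "_"; // "_" stands for a blank cell
    const rule = isHalt ? null : table[state][symbol];
    // One yield per step: a snapshot of what is about to execute.
    yield {
      state: { name: state, isHalt },
      currentSymbol: symbol,
      nextSymbol: rule ? rule.write : symbol,
      move: rule ? rule.move : 0,
    };
    if (isHalt) return;
    tape[head] = rule.write;
    head += rule.move;
    state = rule.next;
  }
}

// A one-state machine that flips bits until it reads a blank.
const flipBits = {
  q0: {
    "0": { write: "1", move: 1, next: "q0" },
    "1": { write: "0", move: 1, next: "q0" },
    "_": { write: "_", move: 0, next: "halt" },
  },
};

const tape = ["1", "0"];
const visited = [];
for (const m of runStepByStep({ initialState: "q0", table: flipBits, tape })) {
  visited.push(`${m.state.name}:${m.currentSymbol}`);
}
```

The halt state gets its own yield here, which matches the driver loop above: onStep runs first, and only then does the loop check m.state.isHalt.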

What I wanted to add in v4 was a way to pause execution at chosen points — like a breakpoint in a debugger. Any State should be able to carry a debug config:

myState.debug = { before: [symA], after: [symA] };

...and the run loop should give the consumer a chance to inspect the machine and decide when to resume.

v4: ship it

The v4 shape was straightforward. I made run() async, added an onDebugBreak hook, and routed it like this:

await machine.run({
  initialState,
  onStep: (m) => { /* every step */ },
  onDebugBreak: async (m) => {
    if (m.debugBreak?.before) console.log('before:', m.state.name);
    if (m.debugBreak?.after)  console.log('after:',  m.state.name);
    // hold here until promise resolves
  },
});

The per-state debug field is mutable. state.debug = { before: true } sets a pre-step break; before: [symA] filters by what the head is reading. The hook is awaited, so any consumer can implement “freeze until the human clicks Resume” by simply not resolving the promise.
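
The "freeze until Resume" pattern can be sketched with a small deferred-promise gate. Everything here is hypothetical: createPauseGate, wait, resume, and the fakeRun driver are names invented for this example, and fakeRun only stands in for a run loop that awaits the hook.

```javascript
// Hypothetical helper, not part of @turing-machine-js/machine:
// a gate whose wait() promise stays pending until resume() is called.
function createPauseGate() {
  let release = null;
  return {
    // Return this from the pause hook to hold the run loop.
    wait() {
      return new Promise((resolve) => { release = resolve; });
    },
    // Call this from a "Resume" button handler.
    resume() {
      if (release) {
        const r = release;
        release = null;
        r();
      }
    },
    get isPaused() {
      return release !== null;
    },
  };
}

// Stand-in driver that awaits the hook once per step.
async function fakeRun(steps, onDebugBreak) {
  const log = [];
  for (const step of steps) {
    log.push(`pause:${step}`);
    await onDebugBreak(step);
    log.push(`resumed:${step}`);
  }
  return log;
}
```

A consumer would pass () => gate.wait() as the hook and wire gate.resume() to a button. The engine never needs to know that a human is involved.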

I also wanted breakpoints on halt:

haltState.debug = { before: true };  // pause before exit

That worked. Symbol-list filters on haltState were silent no-ops, because haltState has no head symbol to filter by. Fine, I thought — be permissive in input, the wildcard true is the one that matters.

Two things in this design were wrong. I didn’t notice either until I started writing docs and a UI on top.

Mistake 1: the hook describes the engine, not the consumer

onDebugBreak reads, on paper, like a perfectly reasonable name. The engine fires a “debug break”; you hook into it.

But what does the consumer do in that hook? They pause. They inspect. They wait for input. They resume. The hook isn’t really notifying you that a thing happened — it’s offering you a cooperation point.

The name onDebugBreak carries one specific framing: “debugging”. But the same hook is exactly what you want for a step-through visualization, a slow-motion playback control, an animation tween, a “press space to advance” UI. None of those are debugging. They’re all pausing.

This sounds like a small thing. It’s not, because once consumers see the name they design around it: their code path is called handleDebugBreak, their state machine has a debugging boolean, their UI button says “Stop debugging”. The name leaks into every consumer’s vocabulary.

I wrote an RFC (turing-machine-js#109) and renamed it in v5. Hard rename, no deprecation alias:

  await machine.run({
    initialState,
-   onDebugBreak: (m) => { ... },
+   onPause: (m) => { ... },
  });

The payload field m.debugBreak stayed — it’s metadata describing why this yield fired ({ before: true } or { after: true }), and “break” works fine inside the payload. But the hook name is the consumer’s contract: onPause describes what they do, not what the engine does.

Rule for next time: when naming a hook, name the consumer’s verb, not the engine’s event. Hooks are cooperation points, not event listeners.

Restriction: haltState.debug.after is nonsense

Another v4 problem hid under an instinct to be permissive in input. haltState.debug.after accepted writes silently. It just didn’t fire.

Why? Because the after semantics in this engine mean “fire on the iteration after the one where the filter matched”. That makes perfect sense for normal states — you transition out of state K, the next iter (K+1) runs, and at that yield the consumer is told “by the way, K’s after-filter fired last step”. But halt is terminal. There is no K+1 after halt. The after event has nothing to anchor on.

I’d silently swallowed haltState.debug.after = true in v4. Worse, I’d let { before: true, after: true } through — half the assignment was meaningful, half was a no-op, and nothing told the consumer.

In v5 I made both throw at write time (turing-machine-js#108 part 2):

- haltState.debug = { before: true, after: true }; // v5: throws — 'after' on halt has nothing to anchor on
+ haltState.debug = { before: true };

Rule for next time: if a configuration has no semantics, throw early. Input that silently no-ops looks user-friendly until a consumer spends an afternoon wondering why their UI doesn’t fire on halt. “Be permissive” is a false economy when permissiveness silences a real mistake.
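
A minimal sketch of write-time validation, assuming the check lives in a setter. The HaltState class, its private field, and the error message are illustrative and not the library's internals. The sketch only validates whole-object assignment, and it treats any truthy after value as invalid, which is also an assumption.

```javascript
// Hypothetical sketch: reject a meaningless config when it is written,
// not when it fails to fire.
class HaltState {
  #debug = null;

  get isHalt() {
    return true;
  }

  get debug() {
    return this.#debug;
  }

  set debug(config) {
    // Halt is terminal, so an after-filter has no later iteration to fire on.
    if (config?.after) {
      throw new TypeError(
        "haltState.debug.after has nothing to anchor on: halt is terminal"
      );
    }
    this.#debug = config;
  }
}

const haltState = new HaltState();
haltState.debug = { before: true }; // accepted

let rejected = false;
try {
  haltState.debug = { before: true, after: true };
} catch (err) {
  rejected = err instanceof TypeError;
}
```

Because the setter throws before storing anything, the previously accepted config stays in place after the failed write.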

While I was in there, I also fixed a related bug: the halting iter’s own after filter wasn’t firing either (turing-machine-js#108 part 1). The driver loop exited as soon as state.isHalt became true — and the after-event from the previous iter, which would normally fire on this final yield, got dropped on the floor. v5 added a post-loop drain to fire it.

I’ll come back to this.

And one v5 bonus: a master switch on run() for the whole pause system.

await machine.run({ initialState, debug: false, onPause: ... });

When false, all pause-fires are suppressed regardless of what state.debug says across the graph (turing-machine-js#106). You can A/B “debug mode” without rewriting the graph or clearing every state.debug field. The flag gates only the onPause dispatch; the underlying m.debugBreak payload field still populates on the generator’s yields (it’s a property of the iteration, not of how run() chose to surface it).
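
The split between "payload is populated" and "hook is dispatched" can be sketched as follows. This is a stand-in, not the library's driver: fakeSteps, demo, and the default of debug = true are assumptions made for the example.

```javascript
// Stand-in generator: the debugBreak payload is part of every yield,
// whatever the driver later decides to do with it.
function* fakeSteps() {
  yield { state: { name: "q0", isHalt: false }, debugBreak: { before: true } };
  yield { state: { name: "halt", isHalt: true }, debugBreak: null };
}

// Hypothetical driver: `debug` gates only the onPause dispatch.
// Defaulting it to true is an assumption of this sketch.
async function run({ debug = true, onStep, onPause }) {
  for (const m of fakeSteps()) {
    if (debug && m.debugBreak) await onPause?.(m);
    await onStep?.(m);
    if (m.state.isHalt) return;
  }
}

// Runs once and reports how often onPause fired and what onStep saw.
async function demo(debug) {
  let pauses = 0;
  const payloads = [];
  await run({
    debug,
    onPause: () => { pauses += 1; },
    onStep: (m) => { payloads.push(m.debugBreak); },
  });
  return { pauses, payloads };
}
```

With debug set to false the pause count drops to zero, but the payload seen by onStep still carries the before marker.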

Mistake 2: the substitution dance

This is the one I’m proudest to have caught, because the code worked, the tests passed, and the docs were correct. The shape was just wrong.

In v4 and v5, after K (the after-fire for iteration K) actually fired on iteration K+1’s yield. The driver loop carried a pendingAfterFromPrev flag across yields, and on the next iter’s yield it dispatched the after hook first, then the before hook, then onStep. The hook for K’s after-fire had to see K’s state — not K+1’s. So I substituted the previous yield’s snapshot into the m.state field of K+1’s yield, just for the duration of the after dispatch.
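
The deferred dispatch can be reconstructed as a sketch. This is written from the description above, not copied from the library: the steps array, the before/after booleans, and yieldIndex are stand-ins, and the post-loop drain shown at the end is the v5 addition mentioned earlier.

```javascript
// Reconstruction of the deferred-after dispatch, for illustration only.
// `before`/`after` on each step mean "this iter's filter matched";
// `yieldIndex` records which yield a payload physically came from.
async function runDeferredAfter({ steps, onStep, onPause }) {
  let pendingAfterFromPrev = null; // snapshot of the iter whose after matched
  for (const m of steps) {
    if (pendingAfterFromPrev) {
      // Substitution: this yield belongs to iter K+1, but the hook is
      // shown iter K's state for the duration of the after dispatch.
      await onPause({
        ...m,
        state: pendingAfterFromPrev.state,
        debugBreak: { after: true },
      });
    }
    if (m.before) await onPause({ ...m, debugBreak: { before: true } });
    await onStep(m);
    pendingAfterFromPrev = m.after ? m : null;
  }
  // Post-loop drain: the final iter's after has no following yield to
  // ride on, so it is dispatched here.
  if (pendingAfterFromPrev) {
    await onPause({ ...pendingAfterFromPrev, debugBreak: { after: true } });
  }
}

const log = [];
const done = runDeferredAfter({
  steps: [
    { yieldIndex: 0, state: { name: "K" }, before: false, after: true },
    { yieldIndex: 1, state: { name: "K+1" }, before: true, after: true },
  ],
  onStep: (m) => log.push(`step(${m.state.name})@${m.yieldIndex}`),
  onPause: (m) => {
    const phase = m.debugBreak.after ? "after" : "before";
    log.push(`pause(${phase} ${m.state.name})@${m.yieldIndex}`);
  },
});
```

In this toy, the after-fire for K reports state K but rides on yield 1. That mismatch between the state and the yield is the substitution.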

This worked. But it had three knock-on effects:

The substitution leaked. Consumers wanted access to the un-substituted, “real” iteration state (the one the step hook saw); see turing-machine-js#107. Some users were reading raw MachineState.debugBreak from runStepByStep directly and were surprised that m.state referred to the iter on which the after fired, not the iter that produced it.
The halt case needed a special path. As mentioned: the halting iter’s own after fires after the loop exits. v5 added a post-loop drain to handle it. That drain is its own code path, with its own substitution, separate from the in-loop dispatch.
The generator’s return type widened. Because the post-loop drain needed to hand a final value out of the iterator, runStepByStep’s return type became Generator<MachineState, MachineState | null>, with both a yield type and a return type. The canonical for..of consumer doesn’t see the return value, but anyone reading the type signature now had to understand why there were two slots.
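
The "two slots" point is standard generator behavior and can be shown without the engine. The generator below is a generic example, not runStepByStep.

```javascript
// A generator with both a yield type and a return type.
function* stepsWithReturn() {
  yield "step 1";
  yield "step 2";
  return "drained"; // the second slot in Generator<Y, R>
}

// for...of only sees yielded values; the return value is discarded.
const seen = [];
for (const v of stepsWithReturn()) seen.push(v);

// Manual iteration (or yield*) is needed to observe the return value.
const it = stepsWithReturn();
it.next();
it.next();
const last = it.next(); // { value: "drained", done: true }
```

A consumer who only ever writes for...of can ignore the return slot entirely, which is why widening it mostly cost readers of the type signature rather than callers.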

All of this came from one design choice: dispatching after K on iter K+1’s yield instead of on iter K’s own yield.

In v6 I collapsed it (turing-machine-js#119). The per-iter lifecycle is now plainly before → step → after. The after fire rides on the same yield as the iter that produced it. No substitution. No pendingAfterFromPrev. No post-loop drain. The generator return type narrows back to Generator<MachineState>.

// v6
await machine.run({
  initialState,
  onStep: (m) => { /* unchanged */ },
  onPause: (m) => {
    if (m.debugBreak?.before) console.log('before:', m.state.name);
    if (m.debugBreak?.after)  console.log('after:',  m.state.name);
  },
});

The dispatch order across hooks changed (any test asserting pause(after K-1) → pause(before K) → step(K) had to flip to pause(before K) → step(K) → pause(after K)), but the set of dispatched calls and per-iter semantics are unchanged. Consumers treating the hooks as independent observers see no change at all. And turing-machine-js#107 (“expose un-substituted state”) disappeared as a problem: there’s no substitution to expose around.
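
A minimal sketch of the collapsed lifecycle, under the same caveats as any toy: the steps array, the before/after booleans, and yieldIndex are stand-ins invented for illustration, and dispatching onPause as two separate calls per iter is an assumption.

```javascript
// Sketch of the per-iter lifecycle: before -> step -> after, all on the
// yield of the iter they describe. No carried flag, no drain.
async function runSameYield({ steps, onStep, onPause }) {
  for (const m of steps) {
    if (m.before) await onPause({ ...m, debugBreak: { before: true } });
    await onStep(m);
    if (m.after) await onPause({ ...m, debugBreak: { after: true } });
  }
}

const log = [];
const done = runSameYield({
  steps: [
    { yieldIndex: 0, state: { name: "K" }, before: false, after: true },
    { yieldIndex: 1, state: { name: "K+1" }, before: true, after: true },
  ],
  onStep: (m) => log.push(`step(${m.state.name})@${m.yieldIndex}`),
  onPause: (m) => {
    const phase = m.debugBreak.after ? "after" : "before";
    log.push(`pause(${phase} ${m.state.name})@${m.yieldIndex}`);
  },
});
```

Every payload is an unmodified copy of the yield it rides on, so the state name and the yield index always agree, and the last iter needs no special path.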

Rule for next time: if your hook payload requires substituting state from a different iteration than the one that’s yielding, the dispatch sequence is wrong. Lifecycle phases (before, step, after) belong on the iter they describe. The substitution wasn’t a clever trick — it was a leak telling me the events were on the wrong tick.

What I’d carry to the next pause/breakpoint API

Name hooks for the consumer's verb, not the engine's event. onPause, not onDebugBreak. onProgress, not onChunkReceived. The name leaks into every consumer's mental model — pick the framing you want them to inherit.
Throw on impossible configurations early. “Permissive in input” is fine for shapes that might be meaningful; it's a trap for shapes that can't be. haltState.debug.after = true had no possible semantics. Silently swallowing it cost one user an afternoon before I noticed.
Per-iter phases belong on the iter they describe. If you're tempted to dispatch after K on K+1's yield, ask what payload you have to substitute to make it look right. The substitution is the signal.
A master switch is cheap and worth it. A boolean on the entry point to flip the whole feature off (without rewriting graph-level flags) makes the feature safe to ship to production and toggle in tests. run({ debug: false }) was one of the smaller v5 adds and one of the most-used by downstream consumers.
Breaking changes are worth the major bump. Three majors in two weeks sounds expensive. It isn't, if the lockstep downstream is one repo (in my case @post-machine-js/machine, which bumped its peer dep ^4.0.0 → ^6.0.0 and renamed its own internal __onDebugBreak → __onPause in step). API ergonomics compound across every consumer for the life of the library — pay the migration cost while the API is young.

The v6 shape is the one I should have shipped in v4. I didn't have the vocabulary then — I was thinking “expose the engine's events” instead of “design the consumer's contract”. Writing the docs is what surfaced both mistakes. There's a meta-lesson in there about how docs are the cheapest design review you can run, but that's another post.

Code: turing-machine-js (engine) and post-machine-js (a Post machine built on top, kept in lockstep). The interactive demo at demo.machines.mellonis.ru consumes the v6 onPause hook for its Step / Run / Stop controls.