Temper — an entropy-gated loop runner for AI coding

built under its own discipline

Every number here is verifiable.

Temper is developed under its own gates. You can verify every number here yourself: clone the repo and run one command.

third-party dependencies. A few small files of plain Node.

cat package.json

23/23

golden fixtures gate the harness itself, deterministically

npm test

false-flip rate. The one LLM gate, measured on known cases.

npm run critic-check

Stress-tested on a 2,935-file production codebase.

the problem

Agents are fast. Left alone, they're also messy.

They re-implement what already exists, leave dead code behind, widen the scope of a change, and quietly silence the linter to make an error go away.

Temper puts a deterministic gate between the agent and your git history. Work that introduces new entropy never gets committed. It gets re-prompted with the evidence, or, if a problem stays stuck, escalated to you instead of burning iterations.

Deterministic by default. LLMs only for irreducible judgment.

Every gate is plain, fast code, except the one call a machine can't make reliably: "did this re-implement something that already exists?"
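To make "plain, fast code" concrete, here is a minimal sketch of what a suppression check could look like. It is an illustration only, not Temper's actual guard: the `findNewSuppressions` name and the pattern list are assumptions, and it only inspects lines added in a unified diff.

```javascript
// Hypothetical suppression check; the patterns and function name are
// assumptions for illustration, not Temper's actual guard.
const SUPPRESSION_PATTERNS = [/eslint-disable/, /@ts-ignore/, /@ts-expect-error/];

// Takes unified-diff text and returns the added lines that contain a
// lint or type-check suppression marker.
function findNewSuppressions(diffText) {
  return diffText
    .split("\n")
    // Added lines start with "+"; "+++" is the file header, not content.
    .filter((line) => line.startsWith("+") && !line.startsWith("+++"))
    .map((line) => line.slice(1).trim())
    .filter((line) => SUPPRESSION_PATTERNS.some((re) => re.test(line)));
}

const diff = [
  "--- a/src/widget.js",
  "+++ b/src/widget.js",
  "@@ -1,2 +1,3 @@",
  " const a = 1;",
  "+// eslint-disable-next-line no-unused-vars",
  "+const b = 2;",
].join("\n");

const suppressions = findNewSuppressions(diff);
console.log(suppressions);
```

A check like this needs no model call: it either finds a new suppression in the added lines or it does not, so the same diff always gets the same answer.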

how it works

One Plan in. Either a clean commit, or a question for you.

Plan — you approve

↓

engine implements (claude -p / codex exec)

↓ deterministic gates

scope-lock
protected regions
fallow: dead code · complexity · dupes
suppression guard
your tests
hidden held-out check
reuse-critic

↓

✓ all green → commit
✗ violation → re-prompt, or escalate

The gates run cheapest-first. A violation is shown in full and fed back as root-cause evidence for the next attempt. If the same failure domain recurs, Temper stops and hands it to you, rather than quietly burning through its iteration budget.

Nothing is committed unless every gate is green. What lands in your history is measurably clean.
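The cheapest-first ordering can be sketched as follows. This is a minimal illustration, not Temper's code: the gate shape (`name`, `cost`, `check`), the `runGates` function, and the choice to stop at the first failing gate are all assumptions.

```javascript
// Hypothetical gate shape: { name, cost, check(change) -> array of violation strings }.
// None of these names come from Temper; they only illustrate the ordering.
function runGates(gates, change) {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...gates].sort((a, b) => a.cost - b.cost);
  for (const gate of ordered) {
    const violations = gate.check(change);
    if (violations.length > 0) {
      // Assumption: stop at the first failing gate and report everything it found.
      return { green: false, gate: gate.name, violations };
    }
  }
  return { green: true, gate: null, violations: [] };
}

// Two toy gates, listed out of order on purpose: an expensive test run
// and a cheap path check.
const gates = [
  { name: "tests", cost: 10, check: (c) => (c.testsPass ? [] : ["tests failed"]) },
  {
    name: "scope-lock",
    cost: 1,
    check: (c) =>
      c.files.filter((f) => !f.startsWith("src/")).map((f) => `out of scope: ${f}`),
  },
];

const result = runGates(gates, { files: ["src/a.js", "README.md"], testsPass: false });
console.log(result);
```

In this sketch the cheap path check fails first, so the expensive test gate never runs for a change that was going to be rejected anyway.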

the method

How it gets better results from agents

None of this is novel. These are established practices from teams shipping with AI, and Temper enforces them in code so they hold on every run.

01 Review the plan up front

The cheapest place to catch a bad change is before it exists. Temper has the agent draft a Plan (scope, acceptance, and the assumptions it rests on) for you to approve first. Catching a wrong approach in a one-page Plan is cheaper than catching it in a thousand-line diff.

02 Loop the agent, gate every commit

Let it iterate freely. Let nothing into your history that didn't pass deterministic checks. The loop is the cheap part; the gate (scope, dead code, duplication, your tests) is what makes it safe to leave running.

03 Turn failures into feedback, then escalate

Each violation is fed back as specific, root-cause evidence for the next attempt, not a blind "try again." If one problem keeps recurring unchanged, Temper stops and hands it to you rather than burning its iteration budget.
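A rough sketch of the re-prompt-then-escalate loop, under stated assumptions: `attempt` is a hypothetical callback standing in for one engine run plus its gates, and "the same failure domain twice in a row" is an assumed escalation rule. Temper's real criteria may differ.

```javascript
// Hypothetical loop: attempt(feedback) returns { green, domain, evidence }.
// The names, the default budget, and the "same domain twice in a row"
// rule are assumptions for illustration.
function runLoop(attempt, maxIterations = 5) {
  let feedback = null;
  let lastDomain = null;
  for (let i = 1; i <= maxIterations; i++) {
    const outcome = attempt(feedback);
    if (outcome.green) return { status: "committed", iterations: i };
    if (outcome.domain === lastDomain) {
      // Same failure domain again: hand it to a human instead of retrying.
      return { status: "escalated", iterations: i, domain: outcome.domain };
    }
    lastDomain = outcome.domain;
    // Pass the evidence from this violation into the next attempt.
    feedback = outcome.evidence;
  }
  return { status: "budget-exhausted", iterations: maxIterations };
}

// An attempt that succeeds once it has received feedback.
const fixedAfterFeedback = runLoop((fb) =>
  fb ? { green: true } : { green: false, domain: "scope", evidence: "touched README.md" }
);

// An attempt that keeps failing in the same domain.
const stuck = runLoop(() => ({ green: false, domain: "dead-code", evidence: "unused export" }));

console.log(fixedAfterFeedback, stuck);
```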

04 Keep judgment irreducible

Everything that can be checked is checked by plain, fast code. The one call a machine can't make reliably ("did this re-implement something that already exists?") gets an LLM, with guardrails measured against the rates at which AI judges actually fail.

mode b · the overnight queue

Hand it a night's work. Wake up to a report.

Queue an ordered sequence of Plans and let Temper work through them unattended overnight. If a run is interrupted, it resumes from where it stopped.

Survives the subscription cap. Detects the rate limit, waits for the reset, resumes. The cap, not the clock, is the ceiling.
Branch-isolated, never auto-merged. Runs on its own branch; your base is untouched; you review and merge in the morning.
Sequential, one task at a time. No parallel fan-out, so each phase builds on the last one's reviewed, committed result.
A morning report. What committed, what stopped it and why, and what is left. It is reconstructed from a ledger, so it is intact even if you closed the terminal.
Resumable. Fix whatever stopped it and re-run; it skips what already landed.
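The resume behaviour can be pictured with a small sketch. The ledger format here (one JSON object per line with `phase` and `status` fields) and the `pendingPhases` function are assumptions for illustration; the source says only that a ledger exists and that a re-run skips what already landed.

```javascript
// Hypothetical ledger: one JSON object per line, for example
// {"phase":"Add slugify","status":"committed"}.
// The format and field names are assumptions, not Temper's actual ledger.
function pendingPhases(queue, ledgerText) {
  const committed = new Set(
    ledgerText
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line))
      .filter((entry) => entry.status === "committed")
      .map((entry) => entry.phase)
  );
  // Keep queue order; only phases without a committed entry are left to run.
  return queue.filter((phase) => !committed.has(phase));
}

const queue = ["Add slugify", "uniqueSlug", "Public API"];
const ledger = [
  '{"phase":"Add slugify","status":"committed"}',
  '{"phase":"uniqueSlug","status":"failed"}',
].join("\n");

const remaining = pendingPhases(queue, ledger);
console.log(remaining);
```

Because the remaining work is derived from the ledger rather than from in-memory state, a re-run after a closed terminal or a fixed failure starts from the first phase that has not committed.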

.temper/report.md

# Temper run report

- Queue:   .temper/phases
- Branch:  temper/phases (from main; NOT merged)
- Outcome: all-green

| # | phase       | status    | commit    |
|---|-------------|-----------|-----------|
| 1 | Add slugify | committed | 0d8515ba1 |
| 2 | uniqueSlug  | committed | cce8e3f9d |
| 3 | Public API  | committed | 6c17ebbe8 |

Committed: 3/3 — review temper/phases, then merge.

get started

It runs in your terminal, on the repo you point it at.

bash

# one-time: clone, then put `temper` on your PATH
git clone https://github.com/michaelrowejones/Temper && cd Temper && npm link

# from inside the repo you want to work on:
temper init                       # entry-point-aware gate config
temper plan "add a foo widget"    # draft a Plan from the codebase
$EDITOR ./PLAN.md                 # review + approve it
temper run ./PLAN.md              # run it to a green gate

Temper must run in your own terminal. It cannot run inside a host-managed session, because nested credentials return a 401 (ADR-0003).
