Why diff prompts at all
Small wording changes have outsized effects on model behaviour. "Be concise" and "Respond concisely" can produce noticeably different output styles. Reviewing the exact delta between a working prompt and a proposed change forces you to think about why each line is there.
Pair this tool with an eval set: when you change the prompt, run the eval, then decide whether the change is worth shipping. Over time, the "diff + eval" loop is how prompt quality accumulates.
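As a concrete illustration of that loop, here is a minimal sketch. Every name in it is hypothetical (`score`, `run_eval`, `EVAL_SET` are illustrative, not part of this tool or any real eval framework), and the grader is a stub:

```python
# Sketch of the "diff + eval" loop. `score` is a stub standing in for
# whatever grader you use (model call + rubric, exact match, etc.).

EVAL_SET = [
    {"input": "Summarize the release notes.", "expect": "short"},
    {"input": "Explain the refund policy.", "expect": "short"},
]

def score(prompt: str, case: dict) -> float:
    # Stub: in practice, send prompt + case["input"] to the model
    # and grade the response against case["expect"].
    return 1.0

def run_eval(prompt: str) -> float:
    return sum(score(prompt, c) for c in EVAL_SET) / len(EVAL_SET)

current = "You are a helpful assistant. Be concise."
proposed = "You are a helpful assistant. Respond concisely."

baseline, candidate = run_eval(current), run_eval(proposed)
print(f"baseline={baseline:.2f}  candidate={candidate:.2f}")
# Diff the two prompts, look at the scores, and only then decide to ship.
```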
FAQ
- Why a separate diff tool — can't I use git diff?
- You can. But prompts often live as runtime-assembled strings rather than files in a repo. Pasting two snapshots into one place lets you compare them without committing or creating a throwaway branch.
- Word-level highlighting?
- Not yet. A line-level diff catches the structural changes (an added section, a removed instruction), which is what matters for prompt iteration. Word-level highlighting adds noise without much signal.
- How big can the inputs be?
- The practical limit is a few thousand lines per side, since the LCS algorithm is O(M×N). Even full long-context system prompts (e.g., 100k tokens) stay snappy because their line counts stay modest. See the sketch after this list.
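For readers curious about that quadratic bound, here is a minimal sketch of a line-level LCS diff. It is not this tool's actual implementation; the function name `line_diff` and the `-`/`+`/space output markers are illustrative. The (M+1)×(N+1) dynamic-programming table is exactly where the O(M×N) cost comes from:

```python
def line_diff(old: str, new: str) -> list[str]:
    """Line-level diff via longest common subsequence (LCS)."""
    a, b = old.splitlines(), new.splitlines()
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[i:] and b[j:]; this (M+1) x (N+1)
    # table is the source of the O(M*N) time and memory cost.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table, emitting unchanged ("  "), removed ("- "),
    # and added ("+ ") lines in order.
    out: list[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append("  " + a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            out.append("- " + a[i])
            i += 1
        else:
            out.append("+ " + b[j])
            j += 1
    out.extend("- " + line for line in a[i:])
    out.extend("+ " + line for line in b[j:])
    return out

print("\n".join(line_diff("Be concise.\nUse markdown.",
                          "Respond concisely.\nUse markdown.")))
```

On the two wordings from the intro, this prints `- Be concise.` and `+ Respond concisely.` while the shared line stays unmarked, which is the line-level granularity the FAQ describes.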
Related tools
- Prompt Template Builder
Compose a prompt with named variables and see the rendered output side by side.
- System Prompt Analyzer
Static analysis that catches common prompt anti-patterns and surfaces token counts.
- Few-shot Examples Formatter
Drop in input/output pairs and get them rendered as XML, Q&A, JSON, or markdown few-shot blocks.