Paivot adapts the balanced-team methodology and XP engineering practices that have shipped software for decades to a world where the builders are AI agents. Discovery & Framing, self-contained stories, adversarial review, evidence-based acceptance, durable knowledge, and hard workflow enforcement turn raw model capability into disciplined delivery.
Available now for Claude Code, Codex, OpenCode, and Pi. Pi supports fully local model routing.
The Problem
Left unconstrained, AI coding agents exhibit predictable failure modes that compound over time.
Skip testing entirely or write tests that verify nothing meaningful. Mocks everywhere, no real integration.
Build components that work alone but never integrate. Vertical slices replaced by horizontal layers.
Lose context across sessions and compaction. Make contradictory decisions after forgetting earlier constraints.
Ignore business requirements in favor of technically interesting work. Build what's fun, not what's needed.
Mark work as "done" without proof it actually works. Claim success without running real tests against real services.
Let one agent discover, design, implement, and approve its own work. No productive tension, no adversarial review, no trustworthy acceptance.
How It Works
The dispatcher coordinates the full Paivot choreography: Discovery & Framing, optional specialist challenge loops, backlog creation, adversarial review, execution, milestone validation, and retrospective learning. On Claude Code, Codex, and OpenCode the queue selection, story transitions, merge gates, and recovery path are delegated to pvg so the workflow does not depend on prompt memory alone.
| Traditional Balanced Teams | Paivot |
|---|---|
| Persistent human teams | Ephemeral agents, spawned per task |
| Pair programming | Orchestrated dispatch with PM review |
| Implicit shared context | Self-contained stories with ALL context embedded |
| Trust-based review | Evidence-based delivery with recorded proof |
| Centralized project tracker | nd story contracts plus vault-backed knowledge |
| Organic learning through pairing | Structured evidence, proof, and retro learnings captured deliberately |
| Flexible role boundaries | Strict enforcement (agents lack judgment to flex) |
| Prompt-based reminders | Guarded or native orchestration, depending on platform |
Specialized Agents
Paivot is larger than a developer and a reviewer. The full system includes the dispatcher, discovery roles, specialist challengers, backlog shaping, adversarial review, execution, and retrospective learning.
Coordinates the entire workflow, routes tasks to the right persona, enforces choreography, and never writes code or backlog content itself.
Captures business outcomes through iterative questioning. Owns BUSINESS.md. Asks "what does success look like?"
Captures user needs and DX for all product types: UI, API, CLI, database. Owns DESIGN.md.
Designs system architecture and defines technical constraints. Owns ARCHITECTURE.md.
Adversarially reviews BUSINESS.md for omissions, drift, and ambiguity before the backlog is allowed to form.
Adversarially reviews DESIGN.md so weak UX, API, CLI, or DX assumptions are surfaced before execution.
Adversarially reviews ARCHITECTURE.md to catch feasibility gaps, drift, and hallucinated constraints.
Creates the backlog from D&F documents. Embeds ALL context into self-contained stories so agents need nothing else.
Adversarial reviewer. Challenges the backlog for gaps, missing walking skeletons, non-demoable milestones. Not here to be helpful — here to be thorough.
Ephemeral. Implements one story, runs tests, records proof of passing in delivery notes. Does NOT close stories.
Reviews one delivered story using evidence-based approach. Accepts (closes) or rejects with structured EXPECTED/DELIVERED/GAP/FIX notes.
Harvests learnings from completed epics. Writes actionable knowledge to the vault so future sessions start smarter instead of repeating mistakes.
Specialist challengers are optional but first-class. In Pi, the highest-leverage roles can use stronger hosted or local reasoning models while narrow coding stories can be delegated to smaller local models without weakening the role contract. intake is an operator-facing entry workflow, not a long-lived peer persona, so it is intentionally shown through commands rather than as an agent card.
Enforcement Layer
"The orchestrator cannot be trusted to improvise the process."
LLMs can be persuaded, distracted, or compacted into forgetting discipline. Paivot treats orchestration as a system concern, not just a prompt. Claude Code, Codex, and OpenCode share a deterministic pvg control plane for queue selection, story transitions, merge gating, and recovery; Pi implements the dispatcher natively inside the runtime.
pvg loop next --json decides what happens next; pvg story deliver|accept|reject owns the structural transitions; merges stay blocked until a story is both accepted and closed; and pvg loop recover is the break-glass recovery path after an interruption.
Invalid merges, unsafe vault writes, broken branch choreography, and out-of-sequence workflow actions are blocked before they do damage.
Workflow state survives compaction, session restarts, tool failures, and provider changes because it lives in the workflow system, not the model's context window.
Story contracts record evidence, proof, status transitions, and rejection history so every acceptance or rollback has an explicit reason.
Implementations can assign stronger models to backlog and adversarial roles and smaller models to narrow coding tasks without weakening the delivery contract.
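
The transition and merge-gate rules described in this section can be sketched as a small state machine. This is an illustrative sketch only, not how pvg is implemented: the `Story` class, the status names, the `ALLOWED` table, and the `can_merge` helper are all hypothetical. The source states only that stories are delivered and then accepted or rejected, that acceptance closes a story, and that merges stay blocked until a story is both accepted and closed.

```python
from dataclasses import dataclass, field

# Hypothetical status names; the source only names deliver/accept/reject.
ALLOWED = {
    ("in_progress", "deliver"): "delivered",
    ("delivered", "accept"): "accepted",
    ("delivered", "reject"): "in_progress",
}


@dataclass
class Story:
    """Illustrative story record; not the real nd contract schema."""
    story_id: str
    status: str = "in_progress"
    closed: bool = False
    history: list = field(default_factory=list)

    def apply(self, action: str) -> None:
        """Apply one transition, refusing anything out of sequence."""
        key = (self.status, action)
        if key not in ALLOWED:
            raise ValueError(f"{action!r} is not allowed from {self.status!r}")
        self.history.append(key)
        self.status = ALLOWED[key]
        if action == "accept":
            # Per the role description, accepting a story closes it.
            self.closed = True


def can_merge(story: Story) -> bool:
    """Merge gate: the story must be both accepted and closed."""
    return story.status == "accepted" and story.closed
```

Because the transition table lives in code rather than in a prompt, an out-of-sequence action fails with an error instead of depending on the model remembering the rules.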
Delivery Pipeline
No shortcuts. Verification before review. Evidence before acceptance. Learnings before forgetting.
Mocks in integration tests are an automatic rejection. Only real calls prove functionality, and milestones must be demoable end to end.
| Test type | Mocks allowed? | Requirement |
|---|---|---|
| Unit | OK | 80% coverage |
| Integration | Never | Every story |
| E2E | Never | Milestones |
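
The table above can be read as a simple policy check. The following is a minimal sketch under assumptions: the function name `policy_violations` and its parameters are hypothetical, and how Paivot actually detects mocks or measures coverage is not described in the source.

```python
from typing import List, Optional

# Policy from the table above: mocks are acceptable in unit tests only,
# and unit tests carry an 80% coverage floor.
MOCKS_ALLOWED = {"unit": True, "integration": False, "e2e": False}
UNIT_COVERAGE_FLOOR = 80.0


def policy_violations(test_type: str, uses_mocks: bool,
                      coverage: Optional[float] = None) -> List[str]:
    """Return human-readable violations for one test suite (hypothetical helper)."""
    if test_type not in MOCKS_ALLOWED:
        raise ValueError(f"unknown test type: {test_type!r}")
    problems = []
    if uses_mocks and not MOCKS_ALLOWED[test_type]:
        problems.append(f"{test_type} tests must not use mocks")
    if (test_type == "unit" and coverage is not None
            and coverage < UNIT_COVERAGE_FLOOR):
        problems.append(f"unit coverage {coverage:.0f}% is below 80%")
    return problems
```

An empty list means the suite satisfies the policy; any entry would be grounds for rejection under the rule that mocks in integration tests are never acceptable.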
PM-Acceptor reviews what was proved, not what was promised. Every rejection must include four parts:
| Part | Purpose |
|---|---|
| Expected | Quote the acceptance criterion (AC) |
| Delivered | What the code actually does |
| Gap | Where it falls short |
| Fix | Actionable guidance |
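
One way to make the four-part rejection structure hard to skip is to model it as a record that refuses to exist with an empty part. This is a sketch, not Paivot's actual data model: the `RejectionNote` class and its `render` method are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RejectionNote:
    """The four required parts of a rejection (class name is hypothetical)."""
    expected: str   # quote of the acceptance criterion
    delivered: str  # what the code actually does
    gap: str        # where the delivery falls short
    fix: str        # actionable guidance for the next attempt

    def __post_init__(self) -> None:
        # Refuse to build a rejection with any blank part.
        missing = [f.name for f in fields(self)
                   if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"rejection is missing: {', '.join(missing)}")

    def render(self) -> str:
        """Format the note as EXPECTED/DELIVERED/GAP/FIX lines."""
        return "\n".join(f"{f.name.upper()}: {getattr(self, f.name)}"
                         for f in fields(self))
```

A rejection with a blank gap or fix raises immediately, so a reviewer cannot reject a story without saying what was expected and how to close the gap.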
AI agents do not learn by osmosis. Knowledge has to be captured, stored, and deliberately reintroduced into later work.
| Stage | Actor |
|---|---|
| Record | Developer notes |
| Flag | PM labels stories |
| Harvest | Retro agent |
| Incorporate | Sr. PM (hard-gated) |
Foundation
nd contracts and vault memory keep the system honest
A rigorous methodology needs durable memory. Paivot uses nd for backlog and delivery contracts, plus vault-backed knowledge for decisions, patterns, and retrospective learnings that survive session loss and model swaps.
nd provides a git-native, CLI-first story tracker that agents can actually use. Each story carries status, evidence, proof, dependencies, rejection history, and merge readiness in a durable contract.
vlt provides the persistent knowledge layer: system vault, project vault, and session capture. Decisions, debugging insights, and retro learnings stop being tribal knowledge and become reusable context for the next agent.
Delivery Contract
Developers implement exactly one story, append evidence and proof, and mark it delivered. PM-Acceptor accepts or rejects. Branch merges stay blocked until the story is both accepted and closed.
    ## nd_contract
    status: delivered

    ### evidence
    - npm test
    - git rev-parse HEAD

    ### proof
    - [x] AC #1: export produces valid JSON
That contract is what lets Paivot keep rigor even when the models, providers, or runtimes change underneath it.
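
To illustrate how a reviewer-side check might read such a contract, here is a minimal sketch. It assumes the contract is stored as markdown-like text with one item per line, laid out as in `SAMPLE`; the real nd format may differ, and `parse_contract` and `ready_for_review` are hypothetical helpers, not part of nd or pvg.

```python
import re

# Assumed layout of the contract shown above, one item per line.
SAMPLE = """\
## nd_contract
status: delivered

### evidence
- npm test
- git rev-parse HEAD

### proof
- [x] AC #1: export produces valid JSON
"""


def parse_contract(text: str) -> dict:
    """Parse the assumed layout into status, evidence, and proof items."""
    contract = {"status": None, "evidence": [], "proof": []}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            section = line[4:].strip().lower()
        elif line.startswith("status:"):
            contract["status"] = line.split(":", 1)[1].strip()
        elif section == "evidence" and line.startswith("- "):
            contract["evidence"].append(line[2:])
        elif section == "proof":
            match = re.match(r"- \[( |x|X)\] (.+)", line)
            if match:
                # Store (criterion text, checked?) pairs.
                contract["proof"].append((match.group(2), match.group(1) != " "))
    return contract


def ready_for_review(contract: dict) -> bool:
    """Delivered, with evidence recorded and every proof item checked."""
    return (contract["status"] == "delivered"
            and bool(contract["evidence"])
            and bool(contract["proof"])
            and all(done for _, done in contract["proof"]))
```

Since the check reads only the stored contract text, it gives the same answer regardless of which model or runtime produced the delivery.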
Getting Started
Same methodology, different integration surfaces. Claude Code, Codex, and OpenCode all share pvg as the deterministic control plane; Pi is the native implementation and can run entirely on local models through LM Studio or other OpenAI-compatible endpoints.
Plugin surface with hooks and strong guardrails, backed by the same pvg control plane used by the other hosted runtimes.
    # Prereqs: pvg, vlt, Claude Code
    git clone https://github.com/paivot-ai/paivot-graph.git
    cd paivot-graph && make install
    make seed
Codex-native skills and orchestration prompts, with shared queue control and story transitions delegated to pvg.
    # Install globally
    git clone https://github.com/paivot-ai/paivot-codex.git
    cd paivot-codex && make install-global
    make check-prereqs
OpenCode commands and agent files adapted to its architecture, still backed by nd, vlt, and the same shared pvg control plane. This is the most portable hosted surface and works well with strong OSS coding models too.
    # Bootstrap OpenCode
    git clone https://github.com/RamXX/paivot-opencode.git
    cd paivot-opencode && make install
    make install-project TARGET=/path/to/your-project
Native Paivot runtime with per-role model routing, built-in guardrails, and a benchmark harness for quality, latency, retries, speed, and cost. Can run fully local.
    # Native Pi workflow
    git clone https://github.com/paivot-ai/paivot-pi.git
    cd paivot-pi && cp .env.example .env
    pi /paivot
Platform Support
Paivot is platform-aware, not platform-fragile. The role system, story contracts, review rigor, and knowledge model stay consistent while each runtime gets the integration surface it can actually support. Claude Code, Codex, and OpenCode now converge on the same deterministic pvg workflow core.
Mature plugin workflow with commands, hook integration, vault seeding, and strong unattended execution support, all backed by pvg.
Codex-native skills with the same rigorous backlog, delivery, and acceptance choreography, using pvg for shared queue selection, transitions, and recovery.
OpenCode-adapted dispatcher workflow with nd, vlt, and the same pvg control plane, making it a strong hosted option for top OSS coding models.
Native orchestrator with per-role model routing, benchmark tooling, and the option to run the full methodology entirely on local models.