How SpecBox Engine Works
Complete spec-driven development pipeline with multi-agent orchestration. 108 MCP tools, 13 skills, 12 agents.
3 commands. Verified software. Auditable evidence.
You describe what to build with /prd. The engine plans with /plan. Implements with /implement. Quality gates verify every step. Self-healing fixes errors automatically. You review the PR.
Real example: from idea to PR in 3 steps
How it works: 3 commands, verified software
$ /prd "Restaurant booking system"
Generating PRD...
┌─ US-001: Book table online
│ ├─ UC-001: Select date and time (4 ACs)
│ └─ UC-002: Cancel booking (2 ACs)
└─ Quality Gate: PASS (12/12 ACs are specific and measurable)
$ /plan
Analyzing PRD (4 US, 12 UC, 38 AC)...
┌─ Phase 1: Database schema + RLS
├─ Phase 2: API (12 endpoints)
├─ Phase 3: UI (Stitch designs)
└─ Phase 4: E2E Playwright
Estimated: ~6h with SpecBox. Without: 3-5 days.
$ /implement
[Orchestrator] Creating branch feat/US-001
[AG-03] Migrations: create_bookings_table ✓
[AG-01] Implementing UC-001 from Stitch design...
[AG-04] Tests: 24/24 passing, coverage 87%
[AG-08] Quality Gate: GO ✓
[AG-09] Acceptance: 12/12 AC ACCEPTED
→ PR #47 created, ready for review
The Pipeline
/prd: Generates the Product Requirements Document with User Stories, Use Cases, and Acceptance Criteria. The Definition Quality Gate validates that each criterion is specific, measurable, and testable.
/plan: Analyzes the PRD and generates a technical plan with phases, UI components, and Stitch designs. VEG generates visual directives tailored to the audience.
/implement: Autopilot. Creates the branch, executes the phases sequentially, converts designs to code, runs quality gates between phases, performs acceptance testing, and opens the PR automatically.
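The idea of sequential phases with a quality gate between each one can be pictured with a small sketch. This is not SpecBox's actual implementation: the `Phase` class, the `run_phases` function, and the rule "stop at the first failing gate" are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Phase:
    """Hypothetical phase: a unit of work plus the gate checked after it."""
    name: str
    run: Callable[[], None]
    gate: Callable[[], bool]


def run_phases(phases: List[Phase]) -> List[str]:
    """Run phases in order and stop at the first phase whose gate fails."""
    log: List[str] = []
    for phase in phases:
        phase.run()
        if not phase.gate():
            log.append(f"{phase.name}: NO-GO")
            break  # later phases never start (assumed behavior)
        log.append(f"{phase.name}: GO")
    return log


# Illustrative run: the second gate fails, so the third phase is skipped.
demo = [
    Phase("Database schema", run=lambda: None, gate=lambda: True),
    Phase("API", run=lambda: None, gate=lambda: False),
    Phase("UI", run=lambda: None, gate=lambda: True),
]
print(run_phases(demo))
```

In this sketch a failed gate simply halts the run; the real engine presumably hands the failure to self-healing before giving up.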
Deep Dive — Everything inside
12 Specialized Agents
Each pipeline phase has agents with defined roles. The Orchestrator NEVER writes code — it only coordinates, delegates, and consolidates.
Orchestrator
Main coordinator. NEVER writes code. Plans, delegates, consolidates in Engram.
Feature Generator
Generates complete feature structure per stack (BLoC, App Router, FastAPI).
UI/UX Designer
Interfaces, responsiveness, VEG Motion. Works from Stitch designs.
DB Specialist
Supabase, Neon, Firebase. Migrations, RLS policies, schemas.
QA Validation
Unit, integration, widget tests. Coverage 85%+, edge cases.
n8n Specialist
Automation workflows, triggers, webhooks, error handling.
Design Specialist
Google Stitch MCP, VEG enrichment. Generates and edits UI designs.
Apps Script
Google Apps Script (clasp + TypeScript). Web Apps, Add-ons, Triggers.
Quality Auditor
Independent verification. Lint, coverage, architecture. Issues GO/NO-GO.
Acceptance Tester
Generates .feature + Gherkin step definitions. Captures visual evidence (screenshots, traces).
Acceptance Validator
Independent AC validation. Issues ACCEPTED / CONDITIONAL / REJECTED.
Developer Tester
Processes human feedback from manual testing. Creates GitHub issues, links to AC-XX.
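As a rough illustration of the Quality Auditor's GO/NO-GO decision, the sketch below combines the three checks named above (lint, coverage, architecture). The `AuditInput` fields, the `audit` function, and the reuse of the 85% coverage target from QA Validation as the threshold are all assumptions, not the engine's documented rules.

```python
from dataclasses import dataclass


@dataclass
class AuditInput:
    """Hypothetical summary of one audit run."""
    lint_errors: int
    coverage_pct: float
    architecture_violations: int


def audit(result: AuditInput, min_coverage: float = 85.0) -> str:
    """Return "GO" only if every check passes (assumed all-or-nothing rule)."""
    passed = (
        result.lint_errors == 0
        and result.coverage_pct >= min_coverage
        and result.architecture_violations == 0
    )
    return "GO" if passed else "NO-GO"


# Coverage of 87%, as in the example run, clears the assumed 85% threshold.
print(audit(AuditInput(lint_errors=0, coverage_pct=87.0, architecture_violations=0)))
```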
13 Agent Skills
Auto-discoverable commands that activate when relevant. Each skill is a complete workflow.
/prd: Generates PRD + Work Item
/plan: Technical plan + Stitch + VEG
/implement: End-to-end autopilot
/quality-gate: Adaptive quality gates
/feedback: Manual testing feedback
/explore: Read-only exploration
/adapt-ui: UI component mapping
/optimize-agents: Agentic system audit
/check-designs: Retroactive Stitch compliance
/acceptance-check: Standalone AC validation
/quickstart: Interactive tutorial (<5 min)
/remote: Remote management (iPhone/WhatsApp)
/release: Audit + version bump + push
108 Automation Tools
Unified MCP server. Backend-agnostic: works with Trello, Plane, or locally without external APIs.
Each tool is an atomic operation agents use to manage your project: create PRDs, run tests, move cards, verify quality, generate evidence.
13 modules: engine, plans, quality, skills, features, telemetry, hooks, onboarding, state, spec-driven, migration, stitch, heartbeat.
Quality Gates & Self-Healing
Retry
Automatic retry of the failed step.
Patch
Surgical fix of the detected error.
Rollback
Revert to the last stable checkpoint.
Human Intervention
Escalation to the developer with diagnosis.
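One way to read the four responses above is as an escalation ladder, tried in order until one resolves the failure. The sketch below assumes that reading; the `Level` enum, the `self_heal` function, and the handler signature are hypothetical and not taken from the engine.

```python
from enum import Enum
from typing import Callable, Dict


class Level(Enum):
    RETRY = 1
    PATCH = 2
    ROLLBACK = 3
    HUMAN = 4


def self_heal(handlers: Dict[Level, Callable[[], bool]]) -> Level:
    """Try each automatic level in order; return the level that fixed the failure.

    A handler returns True when its recovery attempt succeeded. If no
    automatic level succeeds, the failure escalates to HUMAN.
    """
    for level in (Level.RETRY, Level.PATCH, Level.ROLLBACK):
        handler = handlers.get(level)
        if handler is not None and handler():
            return level
    return Level.HUMAN


# Illustrative run: the retry fails, the patch succeeds.
outcome = self_heal({Level.RETRY: lambda: False, Level.PATCH: lambda: True})
print(outcome.name)
```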
Pipeline Integrity
Hook-level enforcement that makes it impossible to write code without an active UC.
"The embed-build incident (March 2026): an agent implemented 9 Use Cases without the pipeline, leaving Trello empty with zero traceability. That was the day HARD BLOCKS were born."
spec-guard.sh blocks Write/Edit to src/ without active UC
commit-spec-guard.sh blocks commits to main/master
design-gate.sh blocks UI without prior Stitch designs
Anti-main guard: FATAL ERROR if implementing on main
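The logic behind the first two guards can be sketched in a few lines. The real hooks are shell scripts whose contents are not shown here, so the function names, the way the active UC is passed in, and the messages below are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Optional

PROTECTED_BRANCHES = {"main", "master"}


def check_write(path: str, active_uc: Optional[str]) -> Optional[str]:
    """Return a block reason, or None if the write is allowed.

    Mirrors the spec-guard rule: no Write/Edit under src/ without an active UC.
    """
    if "src" in PurePosixPath(path).parts and not active_uc:
        return f"BLOCKED: {path} is under src/ and no UC is active"
    return None


def check_commit(branch: str) -> Optional[str]:
    """Return a block reason, or None if the commit is allowed.

    Mirrors the commit-spec-guard rule: no commits to main/master.
    """
    if branch in PROTECTED_BRANCHES:
        return f"BLOCKED: direct commits to {branch} are not allowed"
    return None


print(check_write("src/booking/api.py", None))
print(check_write("src/booking/api.py", "UC-001"))
print(check_commit("feat/US-001"))
```

Returning a reason string rather than raising keeps the sketch easy to call from a hook wrapper, which would exit non-zero whenever a reason is present.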
Sala de Maquinas (Engine Room)
Embedded dashboard (React 19 + Vite) showing the state of all your projects: session telemetry, self-healing events, quality baselines, spec-driven boards, acceptance tests, and E2E results. Each user deploys their own instance — no central server.
Multi-Backend
3 interchangeable backends (Trello, Plane, and local) that share the same 25-method interface. Data can be migrated between them in either direction.
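A shared interface is what makes migration in either direction straightforward: a migration routine only needs to read from one backend and write to another. The sketch below shows the pattern with three hypothetical methods; the real interface has 25 methods whose names are not listed here, so `create_card`, `move_card`, `list_cards`, and `LocalBackend` are illustrative assumptions.

```python
from typing import Dict, List, Protocol


class Backend(Protocol):
    """Hypothetical subset of the shared backend interface."""
    def create_card(self, card_id: str, title: str) -> None: ...
    def move_card(self, card_id: str, column: str) -> None: ...
    def list_cards(self) -> List[Dict[str, str]]: ...


class LocalBackend:
    """In-memory stand-in for a backend with no external API."""

    def __init__(self) -> None:
        self._cards: Dict[str, Dict[str, str]] = {}

    def create_card(self, card_id: str, title: str) -> None:
        self._cards[card_id] = {"id": card_id, "title": title, "column": "todo"}

    def move_card(self, card_id: str, column: str) -> None:
        self._cards[card_id]["column"] = column

    def list_cards(self) -> List[Dict[str, str]]:
        return [dict(card) for card in self._cards.values()]


def migrate(source: Backend, target: Backend) -> int:
    """Copy every card from source to target; works in either direction."""
    count = 0
    for card in source.list_cards():
        target.create_card(card["id"], card["title"])
        target.move_card(card["id"], card["column"])
        count += 1
    return count


a, b = LocalBackend(), LocalBackend()
a.create_card("UC-001", "Select date and time")
a.move_card("UC-001", "in_progress")
print(migrate(a, b), b.list_cards())
```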
Infrastructure Services
Integrated patterns for 5 services, each with configuration guides, best practices, and pipeline integration.