Read the source. Install what you trust.
Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
Browse bundles
108 published bundles ready to inspect and install
SOP To Task Parser
Convert natural-language SOPs and runbooks into structured, machine-executable task specifications
Task Difficulty Calibration
Score and bucket tasks by difficulty using baseline agent performance
Edge Case Mining
Extract rare but high-impact failure modes from production logs to create targeted task sets
Curriculum Design
Order tasks by difficulty and introduce new complexity dimensions progressively
Generate Task Variations
Programmatically produce 10K–100K+ task instances from templates, SOPs, and historical logs
Instrument Action Space
Define, constrain, and document the valid action space an agent can take within an environment
Build Stateful Env
Handle environments with persistent state across episodes (databases, file systems, user sessions)
Build Multi Tool Env
Compose environments spanning multiple tools (IDE + terminal + browser + DB) into a single coherent action space
Build CLI Env
Create terminal/shell environments with filesystem state, command history, and outcome verification
Build Codebase Env
Set up repo-level coding environments with test harnesses, linting, and compilation feedback loops
Build API Harness
Wrap real or mock APIs into instrumented RL-ready surfaces with deterministic reset, state capture, and action logging
Build UI Sandbox
Construct browser-based sandboxed environments where agents interact with realistic UI surfaces (forms, dashboards, multi-step wizards)