Read the source. Install what you trust.
Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
Browse bundles
108 published bundles ready to inspect and install
RL Failure Postmortem
Diagnose why an RL training run failed and identify what to change
RL vs. Prompting Decision
Determine when prompt engineering, fine-tuning, or RL is the right approach
RL Cost Estimation
Estimate total cost (compute, data, engineering time) for an RL project
RL Paper Reading
Read and critically evaluate RL research papers, and extract their practical implications
RL Experiment Design
Plan RL experiments: baselines, ablations, compute budgets, success criteria
Document Processing Env
Environments for extraction, classification, and transformation of business documents
Ticket Triage Env
Environments for support ticket routing, prioritization, and resolution
CRM Workflow Env
Environments mimicking CRM operations (Salesforce, HubSpot)
Data Pipeline Env
Environments for building and debugging ETL/ELT pipelines
SQL Generation RL Env
Build environments where agents write SQL, execute it, and get scored on result correctness
UI Task Specification
Formally specify UI tasks with clear start states, goal states, and evaluation criteria
Pixel vs. DOM Action Space
Trade-offs between pixel-level interaction and DOM-level interaction for UI agents