Read the source. Install what you trust.
Each skill bundle packages a reusable agent behavior: a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
Browse bundles
109 published bundles ready to inspect and install
UI Task Specification
Formally specify UI tasks with clear start states, goal states, and evaluation criteria
Pixel vs. DOM Action Space
Trade-offs between pixel-level and DOM-level interaction for UI agents
Browser Env Construction
Build instrumented browser environments with action logging and state capture
Repo Level Coding Env
Build environments where agents navigate and modify entire repositories, not just single files
Test Generation As Reward
Use test pass rates as automatic reward signals for code generation
Code Review Reward Design
Score code changes on correctness, style, security, and performance
Code Completion RL Env
Build environments for training code completion models (à la Cursor's online RL)
Experience Replay Management
Maintain and curate experience replay buffers for continual RL training
Distribution Shift Detection
Detect when the production task distribution has drifted from the training distribution
Catastrophic Forgetting Mitigation
Prevent RL training from destroying previously learned capabilities
Online RL From Production
Set up learning loops where production experience feeds back into training
Capability Regression Testing
Run broad capability evals before and after RL training to catch degradation