Read the source. Install what you trust.
Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
Browse bundles
108 published bundles ready to inspect and install
Browser Env Construction
Build instrumented browser environments with action logging and state capture
Repo Level Coding Env
Build environments where agents navigate and modify entire repositories, not just single files
Test Generation As Reward
Use test pass rates as automatic reward signals for code generation
Code Review Reward Design
Score code changes on correctness, style, security, and performance
Code Completion RL Env
Build environments for training code completion models (à la Cursor's online RL)
Experience Replay Management
Maintain and curate experience replay buffers for continual RL training
Distribution Shift Detection
Detect when the production task distribution has drifted from the training distribution
Catastrophic Forgetting Mitigation
Prevent RL training from destroying previously learned capabilities
Online RL From Production
Set up learning loops where production experience feeds back into training
Capability Regression Testing
Run broad capability evals before and after RL training to catch degradation
Overfitting Detection For RL
Detect when RL training narrows capability (better on trained tasks, worse on everything else)
Domain Transfer Measurement
Quantify how much RL training on coding transfers to other domains, such as data analysis or writing