Read the source. Install what you trust.
Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
Browse bundles
108 published bundles ready to inspect and install
Composite Reward Design
Combine multiple reward signals (correctness, efficiency, style, safety) into a single scalar reward
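A minimal sketch of the pattern this bundle names: a normalized weighted sum over per-signal scores. The signal names and weights here are illustrative, not taken from the bundle itself.

```python
def composite_reward(signals: dict, weights: dict) -> float:
    # Normalized weighted sum: each signal in [0, 1], output in [0, 1].
    total_weight = sum(weights.values())
    return sum(weights[k] * signals[k] for k in weights) / total_weight

# Hypothetical signal values and weighting scheme.
r = composite_reward(
    {"correctness": 1.0, "efficiency": 0.6, "style": 0.8, "safety": 1.0},
    {"correctness": 0.5, "efficiency": 0.2, "style": 0.1, "safety": 0.2},
)
```

Keeping the weights in one dict makes the trade-off between signals explicit and easy to tune per task.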
LLM As Judge Reward
Use a language model to score agent outputs against specifications or rubrics
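One way this can be wired up, sketched with an injected `judge` callable standing in for any LLM call (the prompt shape and `SCORE:` reply convention are assumptions, not the bundle's actual protocol):

```python
def judge_reward(output: str, rubric: str, judge) -> float:
    """Score `output` against `rubric`; `judge(prompt) -> str` is any LLM call."""
    prompt = (
        "Rate the response below against the rubric on a 0-10 scale.\n"
        f"Rubric: {rubric}\nResponse: {output}\n"
        "Reply with exactly one line: SCORE: <n>"
    )
    reply = judge(prompt)
    for line in reply.splitlines():
        if line.strip().upper().startswith("SCORE:"):
            # Clamp to the expected range, then normalize to [0, 1].
            return min(10.0, max(0.0, float(line.split(":", 1)[1]))) / 10.0
    return 0.0  # unparseable verdict -> no credit
```

Injecting the judge as a callable keeps the scoring logic testable with a stub before any real model is attached.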
Graded Rubric Reward
Translate qualitative rubrics into multi-dimensional scoring functions with partial credit
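A small sketch of a rubric translated into a scoring function with partial credit; the criteria, weights, and level names are hypothetical:

```python
RUBRIC = {
    "cites_sources": {"weight": 2.0, "levels": {"none": 0.0, "partial": 0.5, "full": 1.0}},
    "answers_question": {"weight": 3.0, "levels": {"no": 0.0, "partially": 0.5, "yes": 1.0}},
}

def rubric_reward(grades: dict, rubric: dict = RUBRIC) -> float:
    # Weighted average of per-criterion partial-credit scores, in [0, 1].
    total = sum(c["weight"] for c in rubric.values())
    earned = sum(c["weight"] * c["levels"][grades[name]] for name, c in rubric.items())
    return earned / total

score = rubric_reward({"cites_sources": "partial", "answers_question": "yes"})
```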
Binary Outcome Reward
Design pass/fail reward signals (code compiles, test passes, form submitted correctly)
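For the "code compiles" case, a pass/fail signal can be as small as this (using Python's built-in `compile` as the checker; real bundles would target whatever toolchain the task uses):

```python
def compiles_reward(source: str) -> float:
    """1.0 if the candidate snippet parses as valid Python, else 0.0."""
    try:
        compile(source, "<candidate>", "exec")
        return 1.0
    except SyntaxError:
        return 0.0
```

Binary signals like this are cheap and unambiguous, at the cost of giving no gradient between "almost right" and "completely wrong".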
Offline Dataset Curation
Build high-quality static datasets from historical trajectories for offline RL or behavior cloning
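A toy version of one curation step, assuming trajectories are JSON-serializable dicts: deduplicate by content, then carve out a held-out split. The 90/10 ratio is an arbitrary illustration.

```python
import json
import random

def curate(trajectories: list, seed: int = 0):
    """Deduplicate trajectories by content, shuffle, and split train/val 90/10."""
    seen, unique = set(), []
    for traj in trajectories:
        key = json.dumps(traj, sort_keys=True)  # canonical content fingerprint
        if key not in seen:
            seen.add(key)
            unique.append(traj)
    rng = random.Random(seed)  # seeded for reproducible splits
    rng.shuffle(unique)
    cut = max(1, int(0.9 * len(unique)))
    return unique[:cut], unique[cut:]
```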
Trajectory Format Standardization
Convert heterogeneous log formats into a unified trajectory schema (state, action, reward, metadata)
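A sketch of what a unified schema plus one converter might look like; the legacy field names (`obs`, `act`, `score`) are invented for the example:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    # One step of the unified trajectory schema.
    state: dict
    action: str
    reward: float
    metadata: dict = field(default_factory=dict)

def from_legacy(record: dict) -> Step:
    # Map hypothetical legacy keys onto the schema; everything else becomes metadata.
    return Step(
        state=record["obs"],
        action=record["act"],
        reward=float(record.get("score", 0.0)),
        metadata={k: v for k, v in record.items() if k not in ("obs", "act", "score")},
    )
```

Each heterogeneous log format gets its own small converter; downstream tooling only ever sees `Step`.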
Trajectory Anonymization
Strip PII, credentials, and sensitive business data from trajectories while preserving RL-relevant structure
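The core move is substitution with structure-preserving placeholders. A minimal sketch with two illustrative patterns (real bundles would carry a much larger pattern set):

```python
import re

# Replace matches with typed placeholders so trajectory structure survives.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"(?i)(api[_-]?key\s*[:=]\s*)\S+"), r"\1<REDACTED>"),
]

def anonymize(text: str) -> str:
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```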
Trajectory Filtering
Score trajectories by quality and filter out corrupted or incomplete episodes
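A minimal sketch of the filtering pass, assuming episodes are dicts with a `steps` list and a `terminated` flag (both names are illustrative):

```python
def filter_trajectories(episodes: list, min_len: int = 2, min_reward: float = 0.0) -> list:
    """Keep episodes that are complete and meet length/reward thresholds."""
    kept = []
    for ep in episodes:
        if len(ep["steps"]) < min_len:
            continue  # drop truncated episodes
        if not ep.get("terminated", False):
            continue  # drop incomplete episodes
        if sum(s["reward"] for s in ep["steps"]) < min_reward:
            continue  # drop low-quality episodes
        kept.append(ep)
    return kept
```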
Capture Agent Trajectories
Log agent rollouts with full state-action-reward-next_state tuples, tool calls, and timing
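A bare-bones recorder for the tuples described above; the field names are one plausible layout, not the bundle's actual schema:

```python
import time

class TrajectoryRecorder:
    """Accumulate (state, action, reward, next_state) tuples with tool calls and timing."""

    def __init__(self):
        self.steps = []

    def record(self, state, action, reward, next_state, tool_calls=None):
        self.steps.append({
            "state": state,
            "action": action,
            "reward": reward,
            "next_state": next_state,
            "tool_calls": tool_calls or [],
            "ts": time.time(),  # wall-clock timestamp per step
        })

rec = TrajectoryRecorder()
rec.record({"q": "hi"}, "reply", 1.0, {"q": None}, tool_calls=["search"])
```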
Capture Human Trajectories
Instrument production tools to log human expert actions, states, and outcomes as RL-ready trajectories
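One way to instrument an existing tool without touching its body is a logging decorator; `approve_invoice` and its fields are hypothetical stand-ins for a production tool:

```python
import functools
import time

LOG = []  # in a real system this would be a durable sink, not a list

def instrument(tool_fn):
    """Wrap a production tool so each human-invoked call is logged as a trajectory step."""
    @functools.wraps(tool_fn)
    def wrapper(state, *args, **kwargs):
        t0 = time.time()
        outcome = tool_fn(state, *args, **kwargs)
        LOG.append({
            "state": state,
            "action": tool_fn.__name__,
            "outcome": outcome,
            "duration_s": time.time() - t0,
        })
        return outcome
    return wrapper

@instrument
def approve_invoice(state):
    # Hypothetical expert action.
    return {"approved": True}

approve_invoice({"invoice_id": "inv_001"})  # invented id for the example
```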
Multi Step Task Decomposition
Break complex enterprise workflows into subtask chains with intermediate checkpoints
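A toy subtask chain with checkpoints, where each step returns truthy on success and the chain stops at the first failed checkpoint (the subtask names and context keys are invented):

```python
SUBTASKS = [
    ("extract_fields", lambda ctx: ctx.setdefault("fields", {"id": 42}) is not None),
    ("validate_fields", lambda ctx: "id" in ctx["fields"]),
    ("submit", lambda ctx: ctx.setdefault("submitted", True)),
]

def run_with_checkpoints(subtasks, ctx):
    """Run subtasks in order; stop at the first failed checkpoint."""
    completed = []
    for name, step in subtasks:
        if not step(ctx):
            break  # checkpoint failed; later subtasks never run
        completed.append(name)
    return completed
```

The returned list of completed subtasks doubles as an intermediate-progress signal for shaped rewards.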
Synthetic Data Augmentation
Generate realistic variations of workflow data (user inputs, edge cases, adversarial inputs) without real PII
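A minimal sketch of the idea: vary a template record with synthetic, non-PII values. The field names and value ranges are illustrative only.

```python
import random

def augment(template_record: dict, n: int = 3, seed: int = 0) -> list:
    """Generate n synthetic variations of a workflow record with fake values."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    first_names = ["Alex", "Sam", "Jordan", "Casey"]
    variants = []
    for i in range(n):
        rec = dict(template_record)
        rec["user"] = f"{rng.choice(first_names)}_{i}"  # synthetic identity, no real PII
        rec["amount"] = round(rng.uniform(1, 500), 2)   # plausible numeric spread
        variants.append(rec)
    return variants
```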