Skill Bundles

Read the source. Install what you trust.

Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.

Published bundles: 109

Total installs: 2

Average quality: 70/100

Browse bundles

109 published bundles ready to inspect and install

Skill bundle v1.0.0

Domain Specific Eval Design

Build evals for specialized verticals (legal, medical, finance, engineering)

Compatibility not listed

Skill bundle v1.0.0

Eval Contamination Prevention

Ensure training data and eval data don't overlap

Compatibility not listed

Skill bundle v1.0.0

Adversarial Eval Generation

Create evals specifically designed to find failure modes and edge cases

Compatibility not listed

Skill bundle v1.0.0

Eval Saturation Detection

Identify when a model has saturated an eval and needs harder or different benchmarks

Compatibility not listed

Skill bundle v1.0.0

Eval Coverage Analysis

Measure whether your eval suite covers the actual distribution of production tasks

Compatibility not listed

Skill bundle v1.0.0

Build Fuzzy Eval

Design evals for tasks with multiple valid solutions (writing, design, open-ended code)

Compatibility not listed

Skill bundle v1.0.0

Build Deterministic Eval

Create evals with unambiguous, programmatically verifiable correct answers

Compatibility not listed

Skill bundle v1.0.0

Outcome vs. Process Reward Tradeoff

When to reward final results vs. intermediate steps, and how to blend both

Compatibility not listed

Skill bundle v1.0.0

Reward Calibration

Ensure reward functions produce consistent, well-scaled signals across different task types and difficulties

Compatibility not listed

Skill bundle v1.0.0

Human Feedback Collection

Design interfaces and protocols for collecting human preference judgments at scale

Compatibility not listed

Skill bundle v1.0.0

Reward Hacking Detection

Identify when agents exploit reward function loopholes to get high scores without doing the task correctly

Compatibility not listed

Skill bundle v1.0.0

Reward Shaping

Add intermediate reward signals that guide learning without changing the optimal policy

Compatibility not listed
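
The Reward Shaping entry describes adding intermediate rewards without changing the optimal policy. One well-known way to get that guarantee is potential-based shaping, which adds gamma * phi(s') - phi(s) to each step's reward. The sketch below is not taken from the bundle's source: the five-state chain, the potential function, and every name in it are hypothetical, chosen only to illustrate the idea. It compares the plain and shaped discounted returns of two trajectories that start in the same state and end at the goal.

```python
GAMMA = 0.9   # discount factor (assumed value for illustration)
GOAL = 4      # hypothetical goal state on a chain of states 0..4


def potential(state):
    """Hypothetical potential: negative distance to the goal (0 at the goal)."""
    return -abs(GOAL - state)


def discounted_return(states, rewards, shaped=False):
    """Discounted return of one trajectory.

    states has one more entry than rewards: rewards[t] is received on the
    transition states[t] -> states[t + 1]. With shaped=True, the term
    GAMMA * potential(next) - potential(current) is added to every reward.
    """
    total = 0.0
    for t, reward in enumerate(rewards):
        if shaped:
            reward += GAMMA * potential(states[t + 1]) - potential(states[t])
        total += (GAMMA ** t) * reward
    return total


# Two trajectories from state 0 to the goal: one direct, one with a detour.
# The base reward is 1 on reaching the goal and 0 otherwise.
trajectories = {
    "direct": ([0, 1, 2, 3, 4], [0, 0, 0, 1]),
    "detour": ([0, 1, 0, 1, 2, 3, 4], [0, 0, 0, 0, 0, 1]),
}

offsets = {}
for name, (states, rewards) in trajectories.items():
    plain = discounted_return(states, rewards)
    shaped = discounted_return(states, rewards, shaped=True)
    offsets[name] = shaped - plain
    print(f"{name}: plain={plain:.4f} shaped={shaped:.4f}")
```

In this sketch the shaping terms telescope, so both trajectories gain the same offset, -potential(0), which depends only on the start state. Because every trajectory from that start shifts by the same constant, their ranking is unchanged, while the agent still receives a nonzero signal on every step toward the goal.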
