Back to Prompt Library
implementation
Develop Multimodal Response Evaluation Agent
Inspect the original prompt language first, then copy or adapt it once you know how it fits your workflow.
Linked challenge: Build Multimodal Adversarial Benchmarking Agents
Format
Text-first
Lines
1
Sections
1
Linked challenge
Build Multimodal Adversarial Benchmarking Agents
Prompt source
Original prompt text with formatting preserved for inspection.
1 lines
1 sections
No variables
0 checklist items
Develop the 'Model Response Evaluator' agent. This agent will receive multimodal prompts and model responses (e.g., from Ernie 5.0 or Gemini 2.5 Pro) via A2A Protocol. Implement logic using DSPy to critically assess the quality, accuracy, coherence, and safety of the multimodal responses. Define a scoring mechanism and provide a textual justification for the scores. Integrate it to receive input from the 'Adversary Prompt Generator'.
Adaptation plan
Keep the source stable, then change the prompt in a predictable order so the next run is easier to evaluate.
Keep stable
Hold the task contract and output shape stable so generated implementations remain comparable.
Tune next
Update libraries, interfaces, and environment assumptions to match the stack you actually run.
Verify after
Test failure handling, edge cases, and any code paths that depend on hidden context or secrets.