Challenge

Multi-Model Creative Brief Generation Agent

Inspired by Google's Lyria 3 Pro music generation, this challenge focuses on building an advanced multi-model creative assistant using LangChain's LangGraph for complex workflow orchestration. Your system will generate comprehensive creative briefs for music video concepts, requiring a blend of strategic planning and artistic flair. GPT-5 Pro will act as the primary orchestrator, handling high-level creative direction and project management, while Claude 4 Sonnet will be leveraged for specialized tasks like lyrical analysis, mood board descriptions, and script generation. To ensure efficient and cost-effective inference, LocalAI will be used to serve smaller, optimized generative models (e.g., for specific image descriptions or style transfer snippets) on-premise. Hugging Face will manage the deployment, versioning, and inference endpoints for these specialized models, ensuring a robust and scalable architecture. This challenge emphasizes multi-model cooperation, dynamic model routing, and efficient inference management within a creative workflow.

AI DevelopmentHosted by Vera
Status
Always open
Difficulty
Advanced
Points
500
Challenge brief

What you are building

The core problem, expected build, and operating context for this challenge.

Inspired by Google's Lyria 3 Pro music generation, this challenge focuses on building an advanced multi-model creative assistant using LangChain's LangGraph for complex workflow orchestration. Your system will generate comprehensive creative briefs for music video concepts, requiring a blend of strategic planning and artistic flair. GPT-5 Pro will act as the primary orchestrator, handling high-level creative direction and project management, while Claude 4 Sonnet will be leveraged for specialized tasks like lyrical analysis, mood board descriptions, and script generation. To ensure efficient and cost-effective inference, LocalAI will be used to serve smaller, optimized generative models (e.g., for specific image descriptions or style transfer snippets) on-premise. Hugging Face will manage the deployment, versioning, and inference endpoints for these specialized models, ensuring a robust and scalable architecture. This challenge emphasizes multi-model cooperation, dynamic model routing, and efficient inference management within a creative workflow.

Datasets

Shared data for this challenge

Review public datasets and any private uploads tied to your build.

Loading datasets...
Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max Score: 6
Dimensions
6 scoring checks
Binary
6 pass or fail dimensions
Ordinal
0 scaled dimensions
Dimension 1allsectionspresent

AllSectionsPresent

Ensures the generated creative brief contains all requested sections (e.g., concept, visuals, mood_board_desc).

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 2correctmodelrouting

CorrectModelRouting

Verifies that the appropriate model (GPT-5 Pro, Claude 4 Sonnet, LocalAI-served) is selected for at least 80% of test tasks.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 3localaiisused

LocalAIIsUsed

Confirms that at least one task within a full creative brief generation successfully utilized a LocalAI-served model.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 4creativecoherencescore

CreativeCoherenceScore

An average score representing how well different sections of the brief align conceptually and stylistically. • target: 0.8 • range: 0-1

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 5contentdensityscore

ContentDensityScore

Measures the richness and detail provided in key sections of the creative brief. • target: 0.75 • range: 0-1

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 6averageinferencelatency

AverageInferenceLatency

Average time taken for critical inference steps involving LocalAI-served models. • target: 100 • range: 0-500

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Learning goals

What you should walk away with

  • Master LangChain's LangGraph for building directed acyclic graph (DAG) workflows that manage state transitions and agent interactions across multi-step creative processes.

  • Integrate GPT-5 Pro as the 'Creative Director' agent, utilizing its advanced reasoning for conceptualizing overarching themes, target audience, and project goals.

  • Leverage Claude 4 Sonnet as specialist 'Lyric Analyst' and 'Visual Storyteller' agents, focusing on detailed content generation, emotional tone, and visual cues.

  • Implement LocalAI to serve task-specific generative models (e.g., image style generators, short text variations) within agent tools for rapid and controlled inference.

  • Utilize Hugging Face Hub and Endpoints for deploying, versioning, and managing access to various fine-tuned generative models used by LocalAI and other agents.

  • Design dynamic model routing logic within LangChain to intelligently select between GPT-5 Pro, Claude 4 Sonnet, or LocalAI-served models based on task complexity, cost, and desired output style.

  • Build custom LangChain tools that encapsulate calls to LocalAI-served models and Hugging Face inference endpoints, enabling agents to use these resources effectively.

Start from your terminal
$npx -y @versalist/cli start multi-model-creative-brief-generation-agent

[ok] Wrote CHALLENGE.md

[ok] Wrote .versalist.json

[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Docs
Manage API keys
Host and timing
Vera

AI Research & Mentorship

Starts Available now
Evergreen challenge
Your progress

Participation status

You haven't started this challenge yet

Timeline and host

Operating window

Key dates and the organization behind this challenge.

Start date
Available now
Run mode
Evergreen challenge
Explore

Find another challenge

Jump to a random challenge when you want a fresh benchmark or a different problem space.

Useful when you want to pressure-test your workflow on a new dataset, new constraints, or a new evaluation rubric.

Tool Space Recipe

Draft
Action Space
GPT-5 ProModels · Large Language Models
required
LangChainFramework for building LLM applications
Policy Serving
GPT-5
Orchestration
LangChainFramework for building LLM applications
Evaluation
Rubric: 6 dimensions
·AllSectionsPresent(1%)
·CorrectModelRouting(1%)
·LocalAIIsUsed(1%)
·CreativeCoherenceScore(1%)
·ContentDensityScore(1%)
·AverageInferenceLatency(1%)
Gold items: 2 (2 public)

Frequently Asked Questions about Multi-Model Creative Brief Generation Agent