Multi-Model Creative Brief Generation Agent
Inspired by Google's Lyria 3 Pro music generation, this challenge focuses on building an advanced multi-model creative assistant using LangChain's LangGraph for complex workflow orchestration. Your system will generate comprehensive creative briefs for music video concepts, requiring a blend of strategic planning and artistic flair. GPT-5 Pro will act as the primary orchestrator, handling high-level creative direction and project management, while Claude 4 Sonnet will be leveraged for specialized tasks like lyrical analysis, mood board descriptions, and script generation. To ensure efficient and cost-effective inference, LocalAI will be used to serve smaller, optimized generative models (e.g., for specific image descriptions or style transfer snippets) on-premise. Hugging Face will manage the deployment, versioning, and inference endpoints for these specialized models, ensuring a robust and scalable architecture. This challenge emphasizes multi-model cooperation, dynamic model routing, and efficient inference management within a creative workflow.
What you are building
The core problem, expected build, and operating context for this challenge.
Inspired by Google's Lyria 3 Pro music generation, this challenge focuses on building an advanced multi-model creative assistant using LangChain's LangGraph for complex workflow orchestration. Your system will generate comprehensive creative briefs for music video concepts, requiring a blend of strategic planning and artistic flair. GPT-5 Pro will act as the primary orchestrator, handling high-level creative direction and project management, while Claude 4 Sonnet will be leveraged for specialized tasks like lyrical analysis, mood board descriptions, and script generation. To ensure efficient and cost-effective inference, LocalAI will be used to serve smaller, optimized generative models (e.g., for specific image descriptions or style transfer snippets) on-premise. Hugging Face will manage the deployment, versioning, and inference endpoints for these specialized models, ensuring a robust and scalable architecture. This challenge emphasizes multi-model cooperation, dynamic model routing, and efficient inference management within a creative workflow.
Shared data for this challenge
Review public datasets and any private uploads tied to your build.
How submissions are scored
These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.
AllSectionsPresent
Ensures the generated creative brief contains all requested sections (e.g., concept, visuals, mood_board_desc).
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
CorrectModelRouting
Verifies that the appropriate model (GPT-5 Pro, Claude 4 Sonnet, LocalAI-served) is selected for at least 80% of test tasks.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
LocalAIIsUsed
Confirms that at least one task within a full creative brief generation successfully utilized a LocalAI-served model.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
CreativeCoherenceScore
An average score representing how well different sections of the brief align conceptually and stylistically. • target: 0.8 • range: 0-1
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
ContentDensityScore
Measures the richness and detail provided in key sections of the creative brief. • target: 0.75 • range: 0-1
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
AverageInferenceLatency
Average time taken for critical inference steps involving LocalAI-served models. • target: 100 • range: 0-500
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
What you should walk away with
Master LangChain's LangGraph for building directed acyclic graph (DAG) workflows that manage state transitions and agent interactions across multi-step creative processes.
Integrate GPT-5 Pro as the 'Creative Director' agent, utilizing its advanced reasoning for conceptualizing overarching themes, target audience, and project goals.
Leverage Claude 4 Sonnet as specialist 'Lyric Analyst' and 'Visual Storyteller' agents, focusing on detailed content generation, emotional tone, and visual cues.
Implement LocalAI to serve task-specific generative models (e.g., image style generators, short text variations) within agent tools for rapid and controlled inference.
Utilize Hugging Face Hub and Endpoints for deploying, versioning, and managing access to various fine-tuned generative models used by LocalAI and other agents.
Design dynamic model routing logic within LangChain to intelligently select between GPT-5 Pro, Claude 4 Sonnet, or LocalAI-served models based on task complexity, cost, and desired output style.
Build custom LangChain tools that encapsulate calls to LocalAI-served models and Hugging Face inference endpoints, enabling agents to use these resources effectively.
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json
Requires VERSALIST_API_KEY. Works with any MCP-aware editor.
DocsAI Research & Mentorship
Participation status
You haven't started this challenge yet
Operating window
Key dates and the organization behind this challenge.
Find another challenge
Jump to a random challenge when you want a fresh benchmark or a different problem space.