Agent Building
Advanced
Always open

Multi-Agent AI for Dynamic Short-Form Video Generation

Design and implement a multi-agent system using AutoGen that simulates a creative studio for generating short-form video content (e.g., YouTube Shorts). Inspired by YouTube's vision for AI-generated media, this system will coordinate specialized agents like a 'Scriptwriter Agent,' 'Visuals Generator Agent,' and 'Editor Agent' to autonomously conceptualize, script, and outline visual elements for a short video based on a user's prompt. The challenge emphasizes complex multi-agent conversations, conditional workflows, and the integration of diverse AI models for different generation tasks, with human-in-the-loop validation at critical stages. This system should be capable of producing a detailed production plan and asset descriptions, even if it doesn't render the final video.
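As a rough illustration of the flow described above, the sketch below chains three stub functions standing in for the Scriptwriter, Visuals Generator, and Editor agents, with one human approval checkpoint after scripting. It is plain Python rather than AutoGen. The function names, the `ProductionPlan` fields, and the `approve` callback are all assumptions made for illustration; in a real build each stub would be replaced by an LLM-backed agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Scene:
    number: int
    narration: str
    visual: str = ""  # filled in later by the visuals step


@dataclass
class ProductionPlan:
    title: str
    logline: str
    scenes: List[Scene] = field(default_factory=list)
    assets: List[str] = field(default_factory=list)
    approved: bool = False


def scriptwriter(prompt: str) -> ProductionPlan:
    """Stub for the Scriptwriter Agent: drafts title, logline, and scenes."""
    beats = ["Hook", "Payoff", "Call to action"]
    scenes = [Scene(i + 1, f"{beat}: {prompt}") for i, beat in enumerate(beats)]
    return ProductionPlan(
        title=prompt.title(),
        logline=f"A short about {prompt}.",
        scenes=scenes,
    )


def visuals_generator(plan: ProductionPlan) -> ProductionPlan:
    """Stub for the Visuals Generator Agent: describes one visual per scene."""
    for scene in plan.scenes:
        scene.visual = f"Vertical 9:16 shot illustrating scene {scene.number}"
        plan.assets.append(f"scene_{scene.number}_visual")
    return plan


def editor(plan: ProductionPlan) -> ProductionPlan:
    """Stub for the Editor Agent: de-duplicates and orders the asset list."""
    plan.assets = sorted(set(plan.assets))
    return plan


def run_studio(prompt: str, approve: Callable[[ProductionPlan], bool]) -> ProductionPlan:
    """Run script -> human checkpoint -> visuals -> edit."""
    plan = scriptwriter(prompt)
    if not approve(plan):
        return plan  # human rejected the script; stop before visuals
    plan.approved = True
    return editor(visuals_generator(plan))
```

Passing `approve` as a callback keeps the human checkpoint swappable: an interactive prompt during development, a fixed function in automated runs.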

Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 7
Dimensions: 7 scoring checks
Binary: 7 pass-or-fail dimensions
Ordinal: 0 scaled dimensions

Dimension 1: PlanCompleteness

All required components of the production plan (title, logline, script scenes, asset list) are present.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
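Because this check is all-or-nothing, it can help to validate the plan before submitting. The sketch below is one possible pre-submission check. It assumes the plan is a dictionary with the keys `title`, `logline`, `scenes`, and `assets`; those key names are hypothetical, since the challenge text lists the components but not a schema.

```python
from typing import Any, Dict, List

# Hypothetical key names for the four components the rubric lists.
REQUIRED_FIELDS = ("title", "logline", "scenes", "assets")


def missing_components(plan: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in the plan."""
    return [name for name in REQUIRED_FIELDS if not plan.get(name)]


def passes_plan_completeness(plan: Dict[str, Any]) -> bool:
    """Binary check: True only when every required component is present."""
    return not missing_components(plan)
```

Returning the list of missing fields, rather than only a boolean, gives the agents something concrete to fix in the next turn.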

Dimension 2: PromptAdherence

Generated plan aligns with the user's initial prompt and constraints (audience, length, theme).

binary
Weight: 1

Dimension 3: AgentCommunicationFlow

Agents engage in a logical and coherent conversation to reach the content generation goal.

binary
Weight: 1

Dimension 4: HumanFeedbackIntegration

Human feedback is correctly processed and results in appropriate, relevant plan revisions.

binary
Weight: 1

Dimension 5: Creativity_Score

Subjective score for the originality and entertainment value of the generated plan (1-5, 5 being highly creative). Target: 4; range: 1-5.

binary
Weight: 1

Dimension 6: VisualFeasibility_Score

How realistic and detailed the visual descriptions are for actual video production (1-5, 5 being highly feasible). Target: 4; range: 1-5.

binary
Weight: 1

Dimension 7: Iteration_Efficiency

Number of turns required for agents to effectively address human feedback (fewer is better, target around 2-3 turns). Target: 2; range: 1-5.

binary
Weight: 1

Learning goals

What you should walk away with

Master AutoGen's ConversableAgent and UserProxyAgent to create complex multi-agent conversations.

Implement advanced agent roles using Claude Opus 4.1 for creative conceptualization and script generation.

Leverage OpenAI o3 for specialized tasks such as generating visual asset descriptions or catchy titles.

Design and integrate custom tools for agents to simulate external media generation services.

Implement dynamic workflow branching in AutoGen based on agent outputs or human feedback.

Explore how DeepSpeed can optimize fine-tuned models for specific content generation sub-tasks within the agent workflow.

Orchestrate output storage of generated plans and assets using Azure Blob Storage.
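The workflow-branching goal above can be prototyped without any framework. The sketch below routes each piece of human feedback to either a script or a visuals revision step and counts the turns used, with a cap of 3 to match the rubric's 2-3 turn target. It is a minimal sketch under stated assumptions: the keyword lists, handler names, and plan keys are hypothetical, and keyword matching stands in for whatever classification an LLM-backed agent would do. The same routing function could be adapted to AutoGen's speaker selection, depending on the version in use.

```python
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, str]
Handler = Callable[[Plan, str], Plan]

# Hypothetical keyword routes; a real system would likely classify with an LLM.
ROUTES: List[Tuple[Tuple[str, ...], str]] = [
    (("visual", "shot", "color", "image"), "visuals"),
    (("script", "dialogue", "hook", "tone"), "script"),
]
APPROVALS = {"approve", "approved", "lgtm"}


def route_feedback(feedback: str) -> str:
    """Pick the next step: 'visuals', 'script', or 'done'."""
    text = feedback.lower().strip()
    if text in APPROVALS:
        return "done"
    for keywords, target in ROUTES:
        if any(word in text for word in keywords):
            return target
    return "script"  # default branch when nothing matches


def revise_script(plan: Plan, feedback: str) -> Plan:
    """Stub revision: records the feedback against the script."""
    return {**plan, "script": f"{plan['script']} [revised: {feedback}]"}


def revise_visuals(plan: Plan, feedback: str) -> Plan:
    """Stub revision: records the feedback against the visuals."""
    return {**plan, "visuals": f"{plan['visuals']} [revised: {feedback}]"}


HANDLERS: Dict[str, Handler] = {"script": revise_script, "visuals": revise_visuals}


def revision_loop(plan: Plan, feedback_items: List[str], max_turns: int = 3) -> Tuple[Plan, int]:
    """Apply feedback until approval or the turn cap; return plan and turns used."""
    turns = 0
    for feedback in feedback_items:
        target = route_feedback(feedback)
        if target == "done" or turns >= max_turns:
            break
        plan = HANDLERS[target](plan, feedback)
        turns += 1
    return plan, turns
```

Returning the turn count alongside the plan makes it easy to log how many revisions a run needed.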

Start from your terminal
$ npx -y @versalist/cli start multi-agent-ai-for-dynamic-short-form-video-generation
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance

Host: Vera (AI Research & Mentorship)
Starts: Available now
Run mode: Evergreen challenge

Tool Space Recipe

Draft
Evaluation
Rubric: 7 dimensions
· PlanCompleteness (weight 1)
· PromptAdherence (weight 1)
· AgentCommunicationFlow (weight 1)
· HumanFeedbackIntegration (weight 1)
· Creativity_Score (weight 1)
· VisualFeasibility_Score (weight 1)
· Iteration_Efficiency (weight 1)
Gold items: 2 (2 public)
