Multimodal Content Generator for Brand Safety
Create a Google ADK agent that generates multimodal content concepts (e.g., short video scripts, visual descriptions, audio cues) tailored to specific platforms such as YouTube or social media. The agent must adhere to brand safety guidelines and platform content policies. It uses Gemini's multimodal capabilities to self-correct its output, Skyvern to scrape real-time policy updates, and Voiceflow to provide a natural, conversational user interface. The challenge focuses on delivering creative content while ensuring strict compliance.
What you are building
The core problem, expected build, and operating context for this challenge.
What you should walk away with
Master Google ADK for building robust, deployable AI agents that seamlessly integrate with Gemini 2.5 Pro's native multimodal capabilities for generating text, visual descriptions, and audio prompts.
Implement structured output generation within the ADK agent to consistently produce multimodal content concepts in a defined format (e.g., JSON schema for script, visual direction, sound design).
Design a 'Policy Scrutinizer' tool/capability within the ADK agent that uses Skyvern to programmatically access, extract, and analyze real-time content guidelines from a (simulated) platform like YouTube or the BBC iPlayer.
Integrate Voiceflow to create a natural language conversational interface, allowing users to interact with the ADK agent to request content ideas, specify themes, and receive generated multimodal concepts.
Develop sophisticated self-correction mechanisms within the ADK agent, enabling it to refine generated content iteratively based on feedback from brand safety checks and platform policy violations identified by the 'Policy Scrutinizer'.
Use Vertex AI's MLOps features to deploy the ADK agent, monitor its performance, ensure high availability, and manage safety guardrails.
Build a comprehensive testing suite to evaluate the agent's creativity, adherence to specific content requirements, and strict compliance with brand safety and platform policies.
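The structured-output and self-correction goals above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ADK, Gemini, Skyvern, or Voiceflow API: the names ContentConcept, Violation, BANNED_TERMS, check_policy, generate_with_self_correction, and stub_generate are all hypothetical, the policy rules are made up for the example, and the generator is a stub standing in for a model call.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, List


@dataclass
class ContentConcept:
    """One multimodal concept in a fixed shape (hypothetical schema)."""
    platform: str
    script: str
    visual_direction: str
    sound_design: str


@dataclass
class Violation:
    """A policy finding: which field broke which rule."""
    field: str
    rule: str


# Assumption: made-up rules for illustration. In the real build these would
# come from the 'Policy Scrutinizer' (guidelines extracted via Skyvern).
BANNED_TERMS = {
    "miracle cure": "no unverified health claims",
    "guaranteed returns": "no misleading financial claims",
}


def check_policy(concept: ContentConcept) -> List[Violation]:
    """Scan every text field of the concept for banned terms."""
    violations = []
    for field_name, text in asdict(concept).items():
        lowered = text.lower()
        for term, rule in BANNED_TERMS.items():
            if term in lowered:
                violations.append(Violation(field_name, rule))
    return violations


# A generator takes the brief plus the previous round's violations.
Generator = Callable[[str, List[Violation]], ContentConcept]


def generate_with_self_correction(
    brief: str, generate: Generator, max_rounds: int = 3
) -> ContentConcept:
    """Regenerate until the policy check passes or the round budget runs out."""
    feedback: List[Violation] = []
    for _ in range(max_rounds):
        concept = generate(brief, feedback)
        feedback = check_policy(concept)
        if not feedback:
            return concept
    raise RuntimeError(f"still non-compliant after {max_rounds} rounds: {feedback}")


def stub_generate(brief: str, feedback: List[Violation]) -> ContentConcept:
    """Stand-in for a model call: the first draft violates, the revision is clean."""
    claim = "a refreshing break" if feedback else "a miracle cure for tired afternoons"
    return ContentConcept(
        platform="YouTube Shorts",
        script=f"{brief}: {claim}.",
        visual_direction="Close-up of a glass filling in slow motion.",
        sound_design="Soft fizz, then an upbeat synth sting.",
    )


if __name__ == "__main__":
    concept = generate_with_self_correction("Sparkling water launch", stub_generate)
    print(json.dumps(asdict(concept), indent=2))
```

Passing the violations back into the generator keeps the loop independent of how the content is produced, so the stub could later be swapped for a real model call without changing the check or the retry logic. The same check_policy function can also serve as an assertion target in the testing suite.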