planning

Google ADK Agent Definition for Multimodal Content

Inspect the original prompt language first, then copy or adapt it once you know how it fits your workflow.

Linked challenge: Multimodal Content Generation Agent for AI Video Platform

Format: Text-first

Prompt source

Original prompt text with formatting preserved for inspection.

1 line, 1 section, no variables, 0 checklist items

Define the core structure of your Google ADK agent for multimodal content generation. Outline its primary objective, the types of inputs it will accept (e.g., text for trends, image/video for inspiration), and the expected outputs (e.g., video concepts, scripts). Describe how Gemini 2.5 Pro will be used for multimodal reasoning and creative text generation. Provide an initial Python snippet showing how to initialize the ADK client and potentially a basic agent loop.
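As a starting point for the structure this prompt asks for, here is a minimal sketch that records the objective, inputs, and outputs as plain data and runs a basic loop over requests. It deliberately does not import ADK: `AgentSpec`, `build_instruction`, `run_agent_loop`, and `echo_model` are hypothetical names, and the model ID string is an assumption based on the model named in the prompt. In a real setup, the rendered instruction and model name would be handed to the agent definition that ADK provides; check the ADK documentation for the exact constructor and runner.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentSpec:
    """Plain-data description of the agent; all field names are illustrative."""
    name: str
    objective: str
    model: str = "gemini-2.5-pro"  # assumed ID for the model named in the prompt
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def build_instruction(spec: AgentSpec) -> str:
    """Render the spec as a system instruction string."""
    lines = [
        f"Objective: {spec.objective}",
        "Accepted inputs: " + ", ".join(spec.inputs),
        "Expected outputs: " + ", ".join(spec.outputs),
    ]
    return "\n".join(lines)


def run_agent_loop(
    spec: AgentSpec,
    requests: List[str],
    generate: Callable[[str, str], str],
) -> List[str]:
    """Basic loop: send each request, with the instruction, to `generate`.

    `generate(instruction, request)` is a hypothetical stand-in for the real
    model call; swap in your ADK runner or Gemini client here.
    """
    instruction = build_instruction(spec)
    return [generate(instruction, request) for request in requests]


def echo_model(instruction: str, request: str) -> str:
    """Offline stub so the sketch runs without credentials."""
    return f"[concept for: {request}]"


spec = AgentSpec(
    name="multimodal_content_agent",
    objective="Generate short-form video concepts and scripts",
    inputs=["text (trends)", "image/video (inspiration)"],
    outputs=["video concepts", "scripts"],
)

if __name__ == "__main__":
    for reply in run_agent_loop(spec, ["retro gaming memes"], echo_model):
        print(reply)
```

Keeping the spec separate from the model call makes it easier to compare prompt variants later, since only `generate` needs to change when you move from the stub to a live model.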

Adaptation plan

Keep the source stable, then change the prompt in a predictable order so the next run is easier to evaluate.

Keep stable

Preserve the role framing, objective, and reporting structure so comparison runs stay coherent.

Tune next

Swap in your own domain constraints, anomaly thresholds, and examples before you branch variants.

Verify after

Check whether the prompt asks for the right evidence, confidence signal, and escalation path.
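The three steps above can be sketched as a small template helper: the stable framing lives in one template, the tunable parts are named slots, and a simple keyword check stands in for the verification step. This is only an illustration; `BASE_PROMPT`, `REQUIRED_SIGNALS`, `render_variant`, and `missing_signals` are hypothetical names, and a keyword match is a rough proxy for actually reviewing the prompt.

```python
from string import Template

# Stable framing kept identical across variants; only the $slots change.
BASE_PROMPT = Template(
    "Role: $role\n"
    "Objective: $objective\n"
    "Constraints: $constraints\n"
    "Report: list evidence, a confidence level, and when to escalate."
)

# Hypothetical keyword checks for the "verify after" step.
REQUIRED_SIGNALS = ("evidence", "confidence", "escalate")


def render_variant(**slots: str) -> str:
    """Fill the tunable slots; raises KeyError if a slot is missing."""
    return BASE_PROMPT.substitute(**slots)


def missing_signals(prompt: str) -> list:
    """Return the required signals the prompt text does not mention."""
    lowered = prompt.lower()
    return [signal for signal in REQUIRED_SIGNALS if signal not in lowered]


if __name__ == "__main__":
    variant = render_variant(
        role="planner",
        objective="draft video concepts",
        constraints="under 30 seconds",
    )
    print(variant)
    print("missing:", missing_signals(variant))
```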

Prompt diagnostics

Purpose: planning

This prompt is mostly narrative and instruction-driven, so adapt examples and output constraints before you rewrite the structure.

Linked challenge

Multimodal Content Generation Agent for AI Video Platform

Design and build an advanced multimodal agent using Google ADK and Gemini 3 Pro that specializes in generating creative content ideas, scripts, and visual concepts for short-form AI video platforms, similar to Meta's 'Vibes'. The agent should analyze current trends (e.g., popular memes, news topics, user preferences stored in a vector database) and generate novel, engaging video concepts. It should be capable of orchestrating calls to external tools like Stable Diffusion XL for generating visual mood boards or Triton Inference Server for specialized video analysis models. The challenge emphasizes multimodal reasoning, creative generation, and robust workflow orchestration.
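The orchestration part of the challenge can be illustrated with a small tool registry. The sketch below uses offline stubs in place of the vector database, Stable Diffusion XL, and Triton Inference Server, so every function name, return shape, and URI scheme here is an assumption made for illustration and not a real client API.

```python
from typing import Callable, Dict, List


# Hypothetical tool stubs; real versions would call a vector database,
# a Stable Diffusion XL endpoint, and a Triton Inference Server model.
def fetch_trends(topic: str) -> List[str]:
    return [f"{topic} meme", f"{topic} news"]


def generate_mood_board(concept: str) -> str:
    return f"mood-board://{concept.replace(' ', '-')}"


def analyze_video(uri: str) -> Dict[str, str]:
    return {"uri": uri, "pacing": "unknown"}


TOOLS: Dict[str, Callable] = {
    "fetch_trends": fetch_trends,
    "generate_mood_board": generate_mood_board,
    "analyze_video": analyze_video,
}


def call_tool(name: str, *args):
    """Look up a tool by name and fail loudly on unknown names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args)


def plan_concepts(topic: str) -> List[Dict[str, str]]:
    """Fixed pipeline: trends, then one concept and mood board per trend."""
    concepts = []
    for trend in call_tool("fetch_trends", topic):
        concept = f"15s video riffing on {trend}"
        concepts.append({
            "trend": trend,
            "concept": concept,
            "mood_board": call_tool("generate_mood_board", concept),
        })
    return concepts


if __name__ == "__main__":
    for item in plan_concepts("retro gaming"):
        print(item)
```

Routing every call through `call_tool` gives one place to add logging, retries, or timeouts once the stubs are replaced with real services. In an agent framework the model would typically choose which tool to call; the fixed pipeline here only shows the data flow.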
