Assistants & Interfaces
Advanced
Always open

Gemini-powered Voice Navigator Agent

Challenge brief

What you are building

The core problem, expected build, and operating context for this challenge.

Develop a hands-free, multimodal conversational agent using Google's Agent Development Kit (ADK) that integrates with Google Maps for real-time navigational assistance. The agent should leverage Gemini's multimodal capabilities to understand voice commands, provide spoken directions, and offer context-aware information based on the user's location and activity (e.g., walking, cycling). This challenge focuses on building robust, real-time voice interfaces that seamlessly integrate generative AI with location-based services, prioritizing safety and natural interaction.

Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 4
Dimensions: 4 scoring checks
Binary: 4 pass-or-fail dimensions
Ordinal: 0 scaled dimensions

Dimension 1: CorrectToolInvocation

Verifies that the agent correctly invokes relevant Google Maps APIs for navigation and context.

Type: binary
Weight: 1

Binary check: a binary dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
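
One way to keep Maps tool calls easy to verify is to put each call behind a small, typed function that the agent can register as a tool. The sketch below only builds a request body for the Routes API `computeRoutes` method; it does not send it. The names `build_route_request` and `ACTIVITY_TO_TRAVEL_MODE` are hypothetical, and the field names reflect one reading of the Routes API that should be checked against the official reference before use.

```python
from typing import Any, Dict, Tuple

# Endpoint for the Routes API computeRoutes method (verify against the docs).
ROUTES_ENDPOINT = "https://routes.googleapis.com/directions/v2:computeRoutes"

# Assumed mapping from the agent's "activity" label to a Routes API travelMode.
ACTIVITY_TO_TRAVEL_MODE = {
    "walking": "WALK",
    "cycling": "BICYCLE",
}


def build_route_request(
    origin: Tuple[float, float],
    destination_address: str,
    activity: str,
) -> Dict[str, Any]:
    """Build a computeRoutes request body for the user's current activity.

    origin is a (latitude, longitude) pair; destination_address is free text.
    """
    if activity not in ACTIVITY_TO_TRAVEL_MODE:
        raise ValueError(f"unsupported activity: {activity!r}")
    latitude, longitude = origin
    if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
        raise ValueError("origin is not a valid latitude/longitude pair")
    if not destination_address.strip():
        raise ValueError("destination_address must not be empty")
    return {
        "origin": {
            "location": {"latLng": {"latitude": latitude, "longitude": longitude}}
        },
        "destination": {"address": destination_address},
        "travelMode": ACTIVITY_TO_TRAVEL_MODE[activity],
        "languageCode": "en-US",
        "units": "METRIC",
    }
```

Rejecting unknown activities up front, rather than silently falling back to a default mode, makes a wrong tool invocation visible during evaluation instead of producing a plausible but unsuitable route.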

Dimension 2: ContextualRelevance

Checks whether the agent's response is relevant to the user's current location and activity.

Type: binary
Weight: 1

Dimension 3: ResponseLatencyMs

Average time the agent takes to generate a response, in milliseconds. Target: 800 ms. Range: 0-2000 ms.

Type: binary
Weight: 1
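
To track this dimension locally, you can time the agent over a batch of prompts and compare the mean against the target. The sketch below is a minimal harness, assuming that a pass means the average is at or below the 800 ms target; the brief does not state how the binary result is derived, so treat that threshold rule as an assumption. The names `average_latency_ms` and `meets_latency_target` are hypothetical, and `respond` stands in for whatever callable wraps your agent.

```python
import statistics
import time
from typing import Callable, List

TARGET_MS = 800.0  # target stated in the rubric


def average_latency_ms(respond: Callable[[str], str], prompts: List[str]) -> float:
    """Return the mean wall-clock time per response, in milliseconds."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    samples = []
    for prompt in prompts:
        start = time.perf_counter()
        respond(prompt)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.fmean(samples)


def meets_latency_target(avg_ms: float, target_ms: float = TARGET_MS) -> bool:
    """Assumed pass rule: the average must not exceed the target."""
    return avg_ms <= target_ms
```

For a voice agent, `respond` would ideally cover the full path from the end of the user's speech to the first audio byte of the reply, since that is the delay the user actually perceives.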

Dimension 4: ConversationalFluencyScore

A subjective score on how natural and helpful the conversation feels. Target: 4. Range: 1-5.

Type: binary
Weight: 1
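
The arithmetic behind the maximum score of 4 can be sketched as a weighted sum over pass-or-fail results. This is an illustration of the rubric as described here, not the evaluator's actual implementation; the `Dimension` class, `RUBRIC` list, and `total_score` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Dimension:
    name: str
    weight: float = 1.0  # every dimension in this rubric has weight 1


RUBRIC = [
    Dimension("CorrectToolInvocation"),
    Dimension("ContextualRelevance"),
    Dimension("ResponseLatencyMs"),
    Dimension("ConversationalFluencyScore"),
]

MAX_SCORE = sum(d.weight for d in RUBRIC)


def total_score(results: Dict[str, bool]) -> float:
    """Sum the weights of passed dimensions; a missing result counts as a fail."""
    return sum(d.weight for d in RUBRIC if results.get(d.name, False))
```

Because every check is binary with equal weight, each dimension is worth a quarter of the total, and a run that misses the latency target loses as much as one that calls the wrong tool.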

Learning goals

What you should walk away with

Master Google ADK for orchestrating agent workflows, managing state, and integrating tools with Gemini.

Implement real-time voice input and output using Google Cloud Speech-to-Text and Text-to-Speech APIs.

Utilize Gemini 1.5 Pro's multimodal capabilities to process visual cues (simulated) and generate contextually rich responses.

Integrate with Google Maps Platform APIs to fetch real-time location, route, and point-of-interest data.

Design safety-critical conversational flows for cyclists and pedestrians, including hazard warnings and emergency assistance.

Deploy and manage the ADK agent on Google Cloud Vertex AI, ensuring scalability and low-latency inference.

Start from your terminal
$ npx -y @versalist/cli start gemini-powered-voice-navigator-agent
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance

Host: Vera, AI Research & Mentorship
Starts: Available now
Run mode: Evergreen challenge
Tool Space Recipe (Draft)

Evaluation
Rubric: 4 dimensions, each with weight 1
· CorrectToolInvocation
· ContextualRelevance
· ResponseLatencyMs
· ConversationalFluencyScore
Gold items: 2 (2 public)