Assistants & Interfaces
Advanced
Always open

AI-Assisted Flight Operations Agent

In this challenge you build an AI-powered flight assistant with the Vercel AI SDK. The assistant provides real-time guidance, performs safety checks, and assists with complex operational procedures through a conversational, voice-enabled interface. You will integrate Claude Sonnet 4 for multi-step reasoning and Llama 3 (via Hugging Face Inference Endpoints) for quick, localized responses. The system uses a low-code automation platform such as Hyperbolic to connect to simulated flight control systems, and OpenTelemetry to trace AI interactions and system state in support of safety and compliance in critical aerospace operations.


Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 4
Dimensions: 4 scoring checks
Binary: 4 pass-or-fail dimensions
Ordinal: 0 scaled dimensions

Dimension 1: CorrectToolExecution

The assistant must execute the correct sequence of tools based on the voice command and flight state.

Type: binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
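
One way to reason about this dimension locally is to record which tools run for a command and compare them with the order you expect. The sketch below is illustrative only: the tool names come from the learning goals, while the `FlightState` shape, the fuel threshold, and the keyword-based `planTools` function (a stand-in for the model's own tool selection) are assumptions rather than part of the challenge.

```typescript
// Hypothetical simulated flight state; field names are assumptions for illustration.
interface FlightState {
  fuelPercent: number;
  autopilotEngaged: boolean;
}

type ToolName = "check_fuel_level" | "initiate_autopilot_sequence";

interface ToolResult {
  tool: ToolName;
  ok: boolean;
  detail: string;
}

// Assumed safety threshold, not taken from the challenge.
const MIN_FUEL_PERCENT_FOR_AUTOPILOT = 20;

// Each tool reads or updates the simulated state and reports what happened.
const tools: Record<ToolName, (state: FlightState) => ToolResult> = {
  check_fuel_level: (state) => ({
    tool: "check_fuel_level",
    ok: state.fuelPercent >= MIN_FUEL_PERCENT_FOR_AUTOPILOT,
    detail: `Fuel at ${state.fuelPercent}%`,
  }),
  initiate_autopilot_sequence: (state) => {
    state.autopilotEngaged = true;
    return { tool: "initiate_autopilot_sequence", ok: true, detail: "Autopilot engaged" };
  },
};

// Stand-in for the model's tool selection: a keyword match on the transcript.
function planTools(transcript: string): ToolName[] {
  const text = transcript.toLowerCase();
  if (/autopilot/.test(text)) {
    // Safety check first, then the action.
    return ["check_fuel_level", "initiate_autopilot_sequence"];
  }
  if (/fuel/.test(text)) {
    return ["check_fuel_level"];
  }
  return [];
}

// Runs the planned tools in order and stops at the first failed check.
function runCommand(transcript: string, state: FlightState): ToolResult[] {
  const results: ToolResult[] = [];
  for (const name of planTools(transcript)) {
    const result = tools[name](state);
    results.push(result);
    if (!result.ok) break;
  }
  return results;
}

// True when the executed tools match the expected sequence exactly.
function matchesExpectedSequence(results: ToolResult[], expected: ToolName[]): boolean {
  return (
    results.length === expected.length &&
    results.every((result, index) => result.tool === expected[index])
  );
}
```

In this sketch, "Engage autopilot" with `fuelPercent` at 65 runs both tools in order, while the same command at 10 stops after the fuel check and leaves the autopilot disengaged. How the evaluator actually derives the expected sequence is not described in the brief.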

Dimension 2: ContextualResponseAccuracy

The 'agent_response' must be contextually appropriate and informative based on the simulated actions.

Type: binary
Weight: 1

Dimension 3: ResponseLatency

Time taken from voice command processing to agent response; lower is better.
Target: 300
Range: 0-1000

Type: binary
Weight: 1
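
To track the ResponseLatency dimension during development, it can help to time the path from transcript to response and compare it with the rubric's target. The sketch below is a minimal, hypothetical helper: the values 300 and 1000 come from the rubric, but treating them as milliseconds is an assumption (the rubric states no unit), and `timed` is an illustrative name rather than an AI SDK function.

```typescript
// Target and range values come from the rubric; treating them as milliseconds is an assumption.
const LATENCY_TARGET = 300;
const LATENCY_RANGE_MAX = 1000;

interface TimedResult<T> {
  value: T;
  elapsed: number;
  withinTarget: boolean;
  withinRange: boolean;
}

// Times a synchronous handler. The clock is injectable so tests can use a fake one.
function timed<T>(handler: () => T, now: () => number = Date.now): TimedResult<T> {
  const start = now();
  const value = handler();
  const elapsed = now() - start;
  return {
    value,
    elapsed,
    withinTarget: elapsed <= LATENCY_TARGET,
    withinRange: elapsed >= 0 && elapsed <= LATENCY_RANGE_MAX,
  };
}
```

A real handler would usually be asynchronous, in which case the same pattern applies with the handler awaited between the two clock reads. Injecting the clock keeps the check deterministic in tests.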

Dimension 4: ObservabilityCompleteness

Score based on the completeness and correctness of OpenTelemetry traces for the interaction; higher is better.
Target: 90
Range: 0-100

Type: binary
Weight: 1
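
The brief does not say how trace completeness is computed. As a rough local proxy, one could check that every expected span was recorded and ended. In the sketch below, the span names are hypothetical, `RecordedSpan` is an in-memory stand-in rather than the OpenTelemetry SDK's own span type, and the percentage formula is an assumption, not the evaluator's method.

```typescript
// In-memory stand-in for an exported span; not the OpenTelemetry SDK's own type.
interface RecordedSpan {
  name: string;
  startTime: number;
  endTime?: number;
  attributes: { [key: string]: string | number | boolean };
}

// Hypothetical span names that one interaction is expected to produce.
const REQUIRED_SPANS: string[] = [
  "voice.transcribe",
  "agent.decide",
  "tool.execute",
  "agent.respond",
];

// Returns 0-100: the share of required spans that are present and ended.
function traceCompleteness(
  spans: RecordedSpan[],
  required: string[] = REQUIRED_SPANS
): number {
  if (required.length === 0) return 100;
  let complete = 0;
  for (const name of required) {
    const ended = spans.some(
      (span) => span.name === name && span.endTime !== undefined
    );
    if (ended) complete += 1;
  }
  return Math.round((complete / required.length) * 100);
}
```

Under these assumptions, a trace where three of the four spans have ended scores 75, which would fall below the rubric's target of 90. A fuller check might also validate attributes and parent-child links between spans.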

Learning goals

What you should walk away with

Master the Vercel AI SDK for building streaming, tool-using conversational interfaces in TypeScript/JavaScript.

Implement voice input and output using Fixie (or a similar Web Speech API integration) to create a natural, hands-free interaction experience.

Design intelligent tool calls within the AI SDK for interacting with a simulated flight control system (e.g., 'check_fuel_level', 'initiate_autopilot_sequence').

Integrate Claude Sonnet 4 for critical reasoning tasks and complex procedure interpretation, leveraging its strong safety and reliability features.

Utilize Llama 3 via Hugging Face Inference Endpoints for quicker, context-specific responses or simpler control commands.

Connect the AI SDK application to a simulated flight system via Hyperbolic (or a mock API gateway) for triggering external actions and fetching real-time data.

Implement OpenTelemetry tracing and logging within the AI SDK application to monitor user interactions, agent decisions, and tool executions for auditing and debugging flight-critical operations.
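
The model split described in the goals above (Claude Sonnet 4 for critical reasoning, Llama 3 for quicker and simpler requests) can be sketched as a small routing function that runs before any model call. Everything here is an assumption for illustration: the model labels are plain strings rather than provider model IDs, and the keyword patterns and word-count cutoff are hypothetical heuristics you would tune against your own examples.

```typescript
// Labels for the two models named in the brief; these are not provider model IDs.
type ModelLabel = "claude-sonnet-4" | "llama-3";

interface RoutedRequest {
  transcript: string;
  model: ModelLabel;
  reason: string;
}

// Hypothetical patterns that mark a request as safety-critical or procedural.
const CRITICAL_PATTERNS: RegExp[] = [
  /emergency/i,
  /procedure/i,
  /checklist/i,
  /autopilot/i,
  /failure/i,
];

// Assumed cutoff: requests longer than this are treated as complex.
const MAX_SIMPLE_WORDS = 8;

// Chooses a model label for a transcript and records why it was chosen.
function routeRequest(transcript: string): RoutedRequest {
  const trimmed = transcript.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  const critical = CRITICAL_PATTERNS.some((pattern) => pattern.test(trimmed));

  if (critical) {
    return { transcript: trimmed, model: "claude-sonnet-4", reason: "critical keyword" };
  }
  if (wordCount > MAX_SIMPLE_WORDS) {
    return { transcript: trimmed, model: "claude-sonnet-4", reason: "long request" };
  }
  return { transcript: trimmed, model: "llama-3", reason: "short, non-critical request" };
}
```

Returning the `reason` alongside the model label makes the routing decision easy to attach to a trace span as an attribute, which supports the auditing goal above. In a real build, the chosen label would map to whichever provider client your application configures.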

Start from your terminal

$ npx -y @versalist/cli start ai-assisted-flight-operations-agent
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance

Host: Vera (AI Research & Mentorship)
Starts: Available now
Run mode: Evergreen challenge

Tool Space Recipe (Draft)

Evaluation
Rubric: 4 dimensions
· CorrectToolExecution (1%)
· ContextualResponseAccuracy (1%)
· ResponseLatency (1%)
· ObservabilityCompleteness (1%)
Gold items: 1 (1 public)
