Agent Building
Advanced
Always open

Cyberthreat Orchestrator Agent

This challenge requires building an autonomous cyber threat detection and remediation system with the LangChain framework, using LangGraph for complex, stateful multi-agent workflows. Developers will design a team of specialized agents that work together to identify threats in simulated log data, analyze their severity, formulate a remediation plan, and orchestrate protective actions. The system must make decisions dynamically and adapt its response as the threat landscape evolves. The focus is on robust agent collaboration patterns, sophisticated tool integration, and continuous evaluation of the agent system's effectiveness.


Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 4
Dimensions: 4 scoring checks
Binary: 4 pass-or-fail dimensions
Ordinal: 0 scaled dimensions

Dimension 1: ThreatAccuracy

Agent correctly identifies and classifies threats based on provided logs.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 2: PlanActionability

Remediation plan contains at least 3 actionable steps for high-severity threats.

binary
Weight: 1

Dimension 3: PlanCoherenceScore

Expert-rated score (1-5) for the coherence and completeness of the remediation plan. Target: 4. Range: 1-5.

binary
Weight: 1

Dimension 4: ResponseLatency_ms

Time taken, in milliseconds, for the agent to process a threat and propose a plan. Target: 1000. Range: 0-5000.

binary
Weight: 1
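
The four binary checks in the rubric can be approximated locally before submitting. The sketch below is only an illustration: the `RunResult` fields and the `score_run` helper are hypothetical names, and the pass rules (a coherence score at or above the target of 4, a latency at or below the target of 1000 ms) are assumptions, since the rubric states targets and ranges but not how the evaluator turns them into pass or fail.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Hypothetical summary of one evaluated run (field names are illustrative)."""
    threats_correct: bool   # did the classification match the expected label?
    severity: str           # e.g. "low", "medium", "high"
    plan_steps: list        # remediation steps proposed by the agent
    coherence_score: int    # expert rating on the 1-5 scale
    latency_ms: float       # time to process a threat and propose a plan


COHERENCE_TARGET = 4       # target stated in the rubric
LATENCY_TARGET_MS = 1000   # target stated in the rubric


def score_run(run: RunResult) -> dict:
    """Return 1 (pass) or 0 (fail) per dimension; thresholds are assumptions."""
    # The rubric only requires 3+ actionable steps for high-severity threats.
    actionable = run.severity != "high" or len(run.plan_steps) >= 3
    return {
        "ThreatAccuracy": int(run.threats_correct),
        "PlanActionability": int(actionable),
        "PlanCoherenceScore": int(run.coherence_score >= COHERENCE_TARGET),
        "ResponseLatency_ms": int(run.latency_ms <= LATENCY_TARGET_MS),
    }


if __name__ == "__main__":
    run = RunResult(True, "high", ["isolate host", "rotate credentials"], 4, 850.0)
    scores = score_run(run)
    print(scores, "total:", sum(scores.values()))
```

Because each dimension has weight 1 and no partial credit, the total is simply the number of passed checks, up to the maximum score of 4.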

Learning goals

What you should walk away with

Master LangGraph for building stateful, cyclic agent workflows, including defining nodes, edges, and conditional routing based on threat context.

Implement robust tool invocation patterns within LangChain agents, enabling interaction with mock security APIs for threat scanning and system lockdown.

Design prompts for Gemini 2.5 Pro to perform sophisticated threat intelligence analysis, identifying zero-day potential and recommending remediation strategies.

Build custom data parsers and aggregators to normalize threat data inputs from various simulated security tools.

Integrate Evidently AI to establish continuous evaluation metrics for agent response time, accuracy of threat identification, and efficacy of proposed remediation steps.

Orchestrate the deployment of specialized Qwen 2-powered sub-agents via Together AI for high-throughput anomaly detection in network traffic logs.

Develop fault-tolerant agent execution strategies within LangGraph to handle partial failures during critical incident response scenarios.
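
Two of the goals above, conditional routing on threat context and fault-tolerant execution, can be prototyped without any framework before porting the logic to LangGraph nodes and conditional edges. The following is a minimal, library-free sketch, not LangGraph code: the node names, the `ThreatState` fields, the keyword-based detector, and the `with_retry` helper are all hypothetical stand-ins for real model and security API calls.

```python
from typing import Callable, TypedDict


class ThreatState(TypedDict, total=False):
    """Shared state passed between nodes (fields are illustrative)."""
    log_line: str
    severity: str
    plan: list
    errors: list


def detect(state: ThreatState) -> ThreatState:
    """Toy detector: keyword matching stands in for an LLM or scanner call."""
    severity = "high" if "ransomware" in state["log_line"].lower() else "low"
    return {**state, "severity": severity}


def route(state: ThreatState) -> str:
    """Conditional routing: pick the next node name from the threat context."""
    return "lockdown" if state["severity"] == "high" else "monitor"


def lockdown(state: ThreatState) -> ThreatState:
    return {**state, "plan": ["isolate host", "revoke sessions", "snapshot disk"]}


def monitor(state: ThreatState) -> ThreatState:
    return {**state, "plan": ["keep watching"]}


def escalate(state: ThreatState) -> ThreatState:
    """Fallback used when a node keeps failing."""
    return {**state, "plan": ["page on-call analyst"]}


def with_retry(node: Callable, fallback: Callable, attempts: int = 2) -> Callable:
    """Wrap a node: retry on exceptions, then hand the state to a fallback node."""
    def wrapped(state: ThreatState) -> ThreatState:
        errors = list(state.get("errors", []))
        for _ in range(attempts):
            try:
                return node(state)
            except Exception as exc:  # record the failure and try again
                errors.append(f"{node.__name__}: {exc}")
        # All attempts failed: the fallback receives the collected errors.
        return fallback({**state, "errors": errors})
    return wrapped


def run_workflow(log_line: str, nodes: dict) -> ThreatState:
    """Run detect, then whichever node the router selects."""
    state = nodes["detect"]({"log_line": log_line})
    return nodes[route(state)](state)


if __name__ == "__main__":
    nodes = {
        "detect": detect,
        "lockdown": with_retry(lockdown, escalate),
        "monitor": monitor,
    }
    print(run_workflow("Ransomware beacon detected on host-7", nodes))
```

Keeping the router as a plain function of the state makes it easy to unit-test on its own, and wrapping nodes rather than editing them keeps the retry policy in one place.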

Start from your terminal
$ npx -y @versalist/cli start cyberthreat-orchestrator-agent

[ok] Wrote CHALLENGE.md

[ok] Wrote .versalist.json

[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance
Host: Vera (AI Research & Mentorship)
Starts: Available now
Run mode: Evergreen challenge


Tool Space Recipe (draft)

Evaluation rubric: 4 dimensions, weight 1 each (ThreatAccuracy, PlanActionability, PlanCoherenceScore, ResponseLatency_ms)
Gold items: 2 (2 public)
