Agent Building
Advanced
Always open

Multi-Agent System for Internal Security Anomaly Detection

This challenge asks you to build a multi-agent system with AutoGen that detects potential data leaks and anomalous behavior. Participants design and implement a collaborative team of AI agents that monitors internal communication logs and system access records, then cross-references this data with external news feeds or public information. The system identifies patterns and anomalies that might indicate security incidents or insider threats. The core of the challenge is orchestrating agents with distinct roles, such as 'Log Monitor', 'News Analyst', 'Incident Investigator', and 'Reporting Agent'. These agents communicate and collaborate autonomously, using o4-mini for reasoning and dedicated tools to interact with simulated data sources. The goal is a proactive security monitoring system that identifies subtle indicators of risk and presents a consolidated, actionable report.
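The role handoff described above can be sketched in plain Python before any framework is involved. This is a minimal illustration, not the challenge's reference design: the `CaseFile` and `Finding` types, the 1000-record threshold, and the keyword match on "leak" are all hypothetical stand-ins for what an AutoGen team backed by o4-mini would decide through conversation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Finding:
    source: str  # role that produced the finding
    detail: str  # human-readable observation


@dataclass
class CaseFile:
    """Shared context handed from one role to the next (hypothetical type)."""
    access_events: list[dict]
    headlines: list[str]
    findings: list[Finding] = field(default_factory=list)


def log_monitor(case: CaseFile) -> CaseFile:
    # Assumed rule: reading more than 1000 records in one event is unusual.
    for event in case.access_events:
        if event["records_read"] > 1000:
            case.findings.append(Finding(
                "Log Monitor",
                f"{event['user']} read {event['records_read']} records "
                f"from {event['resource']}",
            ))
    return case


def news_analyst(case: CaseFile) -> CaseFile:
    # Assumed rule: a headline mentioning "leak" is relevant external context.
    for headline in case.headlines:
        if "leak" in headline.lower():
            case.findings.append(Finding("News Analyst", f"external report: {headline}"))
    return case


def incident_investigator(case: CaseFile) -> CaseFile:
    # Correlate: escalate only when internal and external signals both exist.
    sources = {finding.source for finding in case.findings}
    if {"Log Monitor", "News Analyst"} <= sources:
        case.findings.append(Finding(
            "Incident Investigator",
            "internal bulk access coincides with an external leak report",
        ))
    return case


def reporting_agent(case: CaseFile) -> str:
    lines = [f"- [{finding.source}] {finding.detail}" for finding in case.findings]
    return "\n".join(lines) if lines else "No findings."


def run_pipeline(case: CaseFile) -> str:
    # Fixed order here; an agent framework would negotiate turn-taking instead.
    for step in (log_monitor, news_analyst, incident_investigator):
        case = step(case)
    return reporting_agent(case)


if __name__ == "__main__":
    demo = CaseFile(
        access_events=[{"user": "u1", "resource": "crm", "records_read": 5000}],
        headlines=["Customer data leak reported at ExampleCorp"],
    )
    print(run_pipeline(demo))
```

Keeping the shared context in one object makes it easy to see which role contributed which finding, which is the same property the consolidated report needs.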


Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 3
Dimensions: 3 scoring checks
Binary: 3 pass-or-fail dimensions
Ordinal: 0 scaled dimensions

Dimension 1: correct_anomaly_identification

Correct Anomaly Identification

The 'anomaly_detected' flag must be true for positive cases and false for negative cases.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 2: report_clarity

Report Clarity

The 'report_summary' must clearly describe the anomaly and contributing factors.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 3: confidence_score

Confidence Score

The reported confidence in the anomaly detection (target: 0.8; range: 0 to 1).

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
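Because all three dimensions are pass or fail, it can help to check a result's shape before submitting it. The sketch below validates the two fields the rubric names, 'anomaly_detected' and 'report_summary', plus a confidence value. The key name `confidence_score` is an assumption borrowed from the dimension identifier, and `validate_result` is a hypothetical helper; the actual submission schema may differ.

```python
from __future__ import annotations

import json


def validate_result(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks valid."""
    problems = []

    if not isinstance(result.get("anomaly_detected"), bool):
        problems.append("anomaly_detected must be a boolean")

    summary = result.get("report_summary")
    if not isinstance(summary, str) or not summary.strip():
        problems.append("report_summary must be a non-empty string")

    # Assumed key name; bool is excluded because it is a subclass of int.
    confidence = result.get("confidence_score")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        problems.append("confidence_score must be a number")
    elif not 0.0 <= confidence <= 1.0:
        problems.append("confidence_score must lie in the range 0 to 1")

    return problems


if __name__ == "__main__":
    candidate = {
        "anomaly_detected": True,
        "report_summary": "User u1 exported 5000 CRM records outside business hours.",
        "confidence_score": 0.8,
    }
    print(validate_result(candidate))
    print(json.dumps(candidate, indent=2))
```

This only checks structure. Whether the summary is clear enough to pass the Report Clarity dimension is a judgment the evaluator makes.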

Learning goals

What you should walk away with

Master AutoGen for building complex, conversational multi-agent systems with shared context and human-in-the-loop capabilities.

Implement role-based agent collaboration patterns in AutoGen, defining specialized agents like Log Monitor, News Analyst, and Incident Investigator.

Integrate o4-mini models into AutoGen agents for advanced reasoning, natural language processing, and pattern recognition tasks.

Design and implement custom tools for AutoGen agents to interact with simulated internal access logs, email archives, and external news APIs.

Utilize FLAML within AutoGen workflows for automated hyperparameter tuning and efficient resource management for agent-based tasks.

Develop reporting mechanisms using All Hands AI for summarizing security incidents and communicating findings to human operators.

Apply CodeRabbit principles for ensuring code quality and best practices in the AutoGen agent codebase, emphasizing maintainability and security.

Explore Neurolink patterns for designing resilient and adaptive agent systems capable of handling dynamic security threat landscapes.
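For the custom-tools goal above, a tool can start as an ordinary typed Python function with a docstring, which is the general shape agent frameworks such as AutoGen expect before registration. The sketch below is an assumption-laden example: the `SIMULATED_ACCESS_LOG` records, their field names, and the 08:00 to 18:00 business-hours window are invented for illustration, and the registration call is omitted because it varies between AutoGen versions.

```python
from __future__ import annotations

from datetime import datetime

# Hypothetical simulated data source; the real challenge data may look different.
SIMULATED_ACCESS_LOG = [
    {"user": "alice", "resource": "hr-db", "timestamp": "2025-01-14T10:15:00"},
    {"user": "bob", "resource": "finance-share", "timestamp": "2025-01-14T02:47:00"},
    {"user": "carol", "resource": "wiki", "timestamp": "2025-01-14T17:59:00"},
]


def find_off_hours_access(start_hour: int = 8, end_hour: int = 18) -> list[dict]:
    """Return simulated access events that fall outside business hours.

    An event counts as in-hours when start_hour <= hour < end_hour.
    """
    flagged = []
    for event in SIMULATED_ACCESS_LOG:
        hour = datetime.fromisoformat(event["timestamp"]).hour
        if not start_hour <= hour < end_hour:
            flagged.append(event)
    return flagged


if __name__ == "__main__":
    for event in find_off_hours_access():
        print(f"{event['user']} accessed {event['resource']} at {event['timestamp']}")
```

Returning plain dictionaries keeps the tool output easy for a model to read and easy to serialize into the final report.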

Start from your terminal
$ npx -y @versalist/cli start multi-agent-system-for-internal-security-anomaly-detection

[ok] Wrote CHALLENGE.md

[ok] Wrote .versalist.json

[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance

Host: Vera (AI Research & Mentorship)
Starts: Available now
Run mode: Evergreen challenge

Tool Space Recipe

Draft
Evaluation
Rubric: 3 dimensions
· Correct Anomaly Identification (1%)
· Report Clarity (1%)
· Confidence Score (1%)
Gold items: 1 (1 public)
