Agent Building
Advanced
Always open

Build MCP-Enabled Social Media Policy Enforcement Agents

Companies are stepping up social media monitoring because employee posts can create reputational risk. In this challenge you build a multi-agent system that autonomously monitors public social media feeds for potential violations of employee policy. The system uses a graph-based workflow to analyze content, interpret company policies retrieved via RAG, and flag potential issues, adapting its reasoning budget to the sensitivity of the content or the severity of the potential violation. You will design and implement a LangGraph-based agent network in which a 'Monitor Agent' feeds data to a 'Policy Interpretation Agent' and a 'Risk Assessment Agent'. The agents communicate through a defined protocol, connect to external enterprise policy databases via MCP for real-time policy lookup, and dynamically adjust their processing depth (thinking budget) to balance efficiency and accuracy. The system must produce actionable insights and prioritize alerts for human review, supporting compliance while minimizing false positives.
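The brief asks for a defined protocol between agents and for alerts prioritized for human review. The sketch below shows one way those two pieces could look, using only the Python standard library. Every name in it (`PolicyFinding`, `RiskAlert`, `review_queue`, the clause identifiers, the 0.6 confidence cutoff, and the severity-times-confidence score) is a hypothetical choice made for illustration and is not specified by the challenge.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class PolicyFinding:
    """Message emitted by the Policy Interpretation Agent (hypothetical schema)."""
    post_id: str
    policy_clause: str  # identifier of the retrieved policy clause
    rationale: str


@dataclass(frozen=True)
class RiskAlert:
    """Message emitted by the Risk Assessment Agent (hypothetical schema)."""
    finding: PolicyFinding
    severity: Severity
    confidence: float  # 0.0 to 1.0

    @property
    def priority(self) -> float:
        # Assumed scoring rule: severity weighted by confidence.
        return self.severity * self.confidence


def review_queue(alerts, min_confidence=0.6):
    """Drop low-confidence alerts, then order the rest by descending priority."""
    kept = [a for a in alerts if a.confidence >= min_confidence]
    return sorted(kept, key=lambda a: a.priority, reverse=True)


alerts = [
    RiskAlert(PolicyFinding("p1", "CONF-2", "mentions unreleased product"), Severity.HIGH, 0.9),
    RiskAlert(PolicyFinding("p2", "TONE-1", "sarcastic remark"), Severity.LOW, 0.4),
    RiskAlert(PolicyFinding("p3", "CONF-2", "vague roadmap hint"), Severity.MEDIUM, 0.7),
]
queue = review_queue(alerts)
print([a.finding.post_id for a in queue])
```

Frozen dataclasses keep each message immutable once an agent has emitted it, which makes handoffs easier to audit. The confidence cutoff is one simple lever for reducing false positives; a real build would tune it against labeled examples.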

Status: Always open
Difficulty: Advanced
Points: 500
Challenge at a glance
Host: Vera, AI Research & Mentorship
Starts: Available now
Run mode: Evergreen challenge

Learning goals

What you should walk away with

Master LangGraph for building complex, stateful directed-graph workflows for multi-agent systems, including node and edge definitions and state management.

Implement MCP-enabled tool integration using Claude Opus 4.5 to securely connect agents to simulated enterprise HR policy databases and social media APIs.

Design and deploy 'Monitor', 'Policy Interpretation', and 'Risk Assessment' agents using Claude Opus 4.1 and Mistral Nemo, defining their roles, responsibilities, and communication protocols.

Build extended thinking pipelines where agents dynamically adjust their reasoning budget based on the input's complexity or the detected risk level, optimizing for both speed and accuracy.

Integrate LlamaIndex for robust RAG over diverse policy documents, allowing agents to retrieve and synthesize relevant policy clauses in real time for nuanced interpretation.

Orchestrate the flow of information and decisions across agents within LangGraph, ensuring effective handoffs and collaborative problem-solving for policy compliance.
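To make the goals above more concrete, here is a minimal plain-Python sketch of a stateful three-node workflow with conditional routing and a sensitivity-based thinking budget. It deliberately avoids importing LangGraph so it runs anywhere; in an actual build, each function would presumably be registered as a graph node and the routing functions as conditional edges. The budget values, the keyword list, the clause text, and all function names are assumptions for illustration, and `interpret_policy` is only a stand-in for a real RAG or MCP lookup.

```python
from typing import List, Optional, TypedDict


class PipelineState(TypedDict, total=False):
    post: str
    sensitivity: str      # "low", "medium", or "high"
    thinking_budget: int  # tokens allotted to downstream reasoning
    clauses: List[str]
    flagged: bool


# Hypothetical budgets; tune against real latency and accuracy measurements.
BUDGETS = {"low": 1024, "medium": 4096, "high": 16384}
# Hypothetical trigger terms; a real Monitor Agent would use a model, not keywords.
SENSITIVE_TERMS = ("confidential", "unreleased", "lawsuit")


def monitor(state: PipelineState) -> dict:
    """Estimate sensitivity and pick the matching thinking budget."""
    text = state["post"].lower()
    hits = sum(term in text for term in SENSITIVE_TERMS)
    sensitivity = "high" if hits >= 2 else "medium" if hits == 1 else "low"
    return {"sensitivity": sensitivity, "thinking_budget": BUDGETS[sensitivity]}


def interpret_policy(state: PipelineState) -> dict:
    """Stand-in for policy retrieval; returns a fixed, made-up clause."""
    return {"clauses": ["CONF-2: do not disclose unreleased products"]}


def assess_risk(state: PipelineState) -> dict:
    """Flag only high-sensitivity posts that matched at least one clause."""
    return {"flagged": state["sensitivity"] == "high" and bool(state.get("clauses"))}


def route_after_monitor(state: PipelineState) -> Optional[str]:
    """Skip the expensive agents entirely for low-sensitivity posts."""
    return None if state["sensitivity"] == "low" else "interpret_policy"


NODES = {"monitor": monitor, "interpret_policy": interpret_policy, "assess_risk": assess_risk}
ROUTES = {
    "monitor": route_after_monitor,
    "interpret_policy": lambda s: "assess_risk",
    "assess_risk": lambda s: None,
}


def run(post: str) -> PipelineState:
    """Walk the graph from 'monitor', merging each node's update into the state."""
    state: PipelineState = {"post": post}
    node: Optional[str] = "monitor"
    while node is not None:
        state.update(NODES[node](state))
        node = ROUTES[node](state)
    return state


result = run("Our unreleased phone is still confidential, but wow.")
print(result["thinking_budget"], result["flagged"])
```

Each node returns only the keys it changes and the runner merges them into shared state, which mirrors the partial-update style that graph frameworks commonly use. Routing low-sensitivity posts straight to the end is the cheapest form of budget adaptation: no reasoning tokens are spent on content that never reaches the downstream agents.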


