Multi-Agent System for Commercial Real Estate Analysis
What you are building
The core problem, expected build, and operating context for this challenge.
Develop a sophisticated multi-agent system using AutoGen to act as an expert commercial real estate (CRE) analyst for institutional investors. The system will comprise specialized agents (e.g., a 'Data Fetcher', an 'Economic Analyst', a 'Valuation Specialist', a 'Report Generator') that collaborate autonomously to research, analyze, and synthesize insights on CRE investment opportunities. This challenge emphasizes complex agent-to-agent communication, tool orchestration, and the generation of comprehensive, data-driven reports. The agents should be able to query external (simulated) CRE data APIs, perform financial modeling, and provide reasoned investment recommendations. Focus on designing robust communication protocols between agents to resolve conflicts and refine analyses. The solution must also incorporate an observability framework to trace the multi-agent deliberation process and provide a user-friendly interface for querying the system and visualizing its output.
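One possible shape for the conflict-resolution step mentioned above is sketched here in plain Python, without any AutoGen dependency. It assumes two or more agents each report a cap-rate estimate and that an orchestrator either accepts the mean or asks the outlying agents to revise. The `Estimate` message shape, the `reconcile` function, and the 10% tolerance are all hypothetical choices for illustration, not part of the challenge specification.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Estimate:
    """One agent's position on a shared quantity (hypothetical message shape)."""
    agent: str
    cap_rate: float  # decimal form, e.g. 0.06 for 6%
    rationale: str


def relative_spread(estimates: list[Estimate]) -> float:
    """Gap between the highest and lowest estimate, relative to their mean."""
    rates = [e.cap_rate for e in estimates]
    return (max(rates) - min(rates)) / mean(rates)


def reconcile(estimates: list[Estimate], tolerance: float = 0.10) -> dict:
    """Accept the mean when agents roughly agree; otherwise request another round."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    spread = relative_spread(estimates)
    if spread <= tolerance:
        agreed = mean(e.cap_rate for e in estimates)
        return {"status": "agreed", "cap_rate": agreed, "spread": spread}
    # Disagreement: name the lowest and highest estimators so the
    # orchestrator can ask them for revised rationales.
    ordered = sorted(estimates, key=lambda e: e.cap_rate)
    return {
        "status": "revise",
        "ask": [ordered[0].agent, ordered[-1].agent],
        "spread": spread,
    }


close = [
    Estimate("Economic Analyst", 0.060, "regional office comps"),
    Estimate("Valuation Specialist", 0.062, "recent transactions"),
]
far = [
    Estimate("Economic Analyst", 0.050, "rate cuts expected"),
    Estimate("Valuation Specialist", 0.070, "rising vacancy"),
]
agreed_result = reconcile(close)   # status "agreed"
revise_result = reconcile(far)     # status "revise"
```

Capping the number of "revise" rounds would be one way to keep the message count bounded, since collaboration efficiency is one of the scored dimensions.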
How submissions are scored
These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.
AutoGenAgentsInitialization
Verify all AutoGen agents (UserProxy, Assistant, specialized agents) can be initialized.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
CustomToolRegistration
Confirm custom tools for CRE data access are correctly registered and callable by agents.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
LangSmithIntegration
Ensure LangSmith can successfully trace an AutoGen conversation.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Report Accuracy Score
Accuracy of key financial figures and investment recommendations in the generated report. Target: 0.88 (range: 0-1).
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Report Coherence and Completeness
Score reflecting the logical flow, clarity, and inclusion of all required report sections. Target: 0.9 (range: 0-1).
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Agent Collaboration Efficiency (Messages)
Number of messages exchanged during a standard analysis task (lower is better). Target: 20 (range: 5-50).
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Time to Generate Report (minutes)
End-to-end time from query submission to final report generation. Target: 3 (range: 0-10).
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
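For local iteration, the four numeric dimensions above could be checked with a small helper like the one below. This is a sketch under two assumptions that the brief does not confirm: that "satisfies the requirement" means meeting the stated target, and that report time, like message count, is better when lower. The metric names and the `check_run` function are hypothetical and do not reflect how the actual evaluator works.

```python
# Targets copied from the scoring dimensions; the "min"/"max" directions
# for accuracy, coherence, and time are assumptions.
TARGETS = {
    "report_accuracy": (0.88, "min"),     # score must be at least the target
    "report_coherence": (0.90, "min"),
    "messages_exchanged": (20, "max"),    # count must be at most the target
    "minutes_to_report": (3, "max"),
}


def check_run(metrics: dict[str, float]) -> dict[str, bool]:
    """Return pass/fail per dimension; no partial credit, mirroring the brief."""
    results = {}
    for name, (target, direction) in TARGETS.items():
        value = metrics[name]
        if direction == "min":
            results[name] = value >= target
        else:
            results[name] = value <= target
    return results


sample_run = {
    "report_accuracy": 0.91,
    "report_coherence": 0.89,
    "messages_exchanged": 18,
    "minutes_to_report": 2.5,
}
sample_result = check_run(sample_run)
```

In this sample run only the coherence score misses its target, so three of the four numeric dimensions would pass under the stated assumptions.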
What you should walk away with
Master AutoGen's multi-agent conversational patterns, including agent configuration, role assignment, and inter-agent communication protocols.
Leverage Claude Opus 4.1's advanced reasoning capabilities for complex financial analysis, valuation modeling, and nuanced investment recommendation generation.
Develop custom AutoGen tools for agents to query simulated commercial real estate databases, market data APIs, and economic indicators.
Implement collaborative problem-solving strategies within AutoGen, where agents refine tasks, provide feedback, and reach consensus on investment decisions.
Integrate LangSmith for end-to-end tracing of the multi-agent deliberation, visualizing message flows, tool calls, and LLM interactions for debugging and optimization.
Design data ingestion and preprocessing workflows using Prefect to prepare raw CRE data for agent consumption and analysis.
Build a simple Plotly Dash dashboard for institutional investors to submit queries, view real-time agent activity (via LangSmith), and review generated investment reports.
Formulate sophisticated prompt engineering for Claude Opus 4.1 to ensure structured, detailed, and accurate CRE reports, including executive summaries and risk assessments.
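As a starting point for the custom data tools and valuation modeling listed above, the sketch below pairs a simulated data lookup with a direct-capitalization valuation (value = net operating income / cap rate). It is plain Python with type hints and docstrings so that it could later be wrapped as an agent tool; the registration step is omitted because AutoGen's tool API differs between versions. The record, the function names, and the simplified NOI (gross income minus operating expenses, ignoring vacancy and reserves) are all illustrative assumptions.

```python
from typing import TypedDict


class PropertyFinancials(TypedDict):
    property_id: str
    gross_income: float        # annual, in dollars
    operating_expenses: float  # annual, in dollars
    market_cap_rate: float     # decimal form, e.g. 0.06 for 6%


# Made-up record standing in for an external CRE data API.
_SIMULATED_DB: dict[str, PropertyFinancials] = {
    "OFFICE-001": {
        "property_id": "OFFICE-001",
        "gross_income": 2_400_000.0,
        "operating_expenses": 900_000.0,
        "market_cap_rate": 0.06,
    },
}


def get_property_financials(property_id: str) -> PropertyFinancials:
    """Return simulated annual financials for a property; raise KeyError if unknown."""
    try:
        return _SIMULATED_DB[property_id]
    except KeyError:
        raise KeyError(f"no simulated record for {property_id!r}") from None


def direct_cap_value(noi: float, cap_rate: float) -> float:
    """Direct capitalization: value = net operating income / cap rate."""
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return noi / cap_rate


def value_property(property_id: str) -> float:
    """Fetch the simulated record and value it by direct capitalization."""
    record = get_property_financials(property_id)
    # Simplified NOI: vacancy, reserves, and other adjustments are ignored here.
    noi = record["gross_income"] - record["operating_expenses"]
    return direct_cap_value(noi, record["market_cap_rate"])


estimated_value = value_property("OFFICE-001")  # roughly 25 million dollars
```

Keeping the lookup and the arithmetic in separate functions lets a 'Data Fetcher' agent own the first and a 'Valuation Specialist' own the second, which matches the role split described in the brief.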