AI Development
Advanced
Always open

Build a Human-AI Collaborative 'Thought Partner'

This challenge asks you to build an advanced human-AI 'Thought Partner': an Agentkit-based multi-agent team that assists humans with complex, open-ended tasks (e.g., strategic planning, creative problem-solving, market analysis) by offering deep insights, generating novel ideas, and synthesizing information. The system uses adaptive thinking budgets to balance real-time responsiveness against exhaustive research. The team consists of specialized agents (e.g., a 'Researcher', an 'Analyst', a 'Synthesizer', and a 'Human Interface Agent') orchestrated by Agentkit. The 'Human Interface Agent' (powered by Claude Opus 4.1) is the primary point of contact: it interprets human intent and adaptively allocates tasks to the internal expert agents. MCP (Model Context Protocol) handles tool integration with common enterprise systems and knowledge bases, so the agents can 'reach out' for data relevant to the human's query.
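The delegation pattern described above can be sketched in plain Python. This is a minimal illustration, not Agentkit's API: the agent functions, the `HumanInterfaceAgent` class, and the keyword-based routing are all hypothetical stand-ins for model calls and MCP tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical expert agents: a real build would wrap model calls and MCP tools.
AgentFn = Callable[[str], str]


def researcher(task: str) -> str:
    return f"[research] sources gathered for: {task}"


def analyst(task: str) -> str:
    return f"[analysis] key drivers identified for: {task}"


def synthesizer(task: str) -> str:
    return f"[synthesis] summary of: {task}"


@dataclass
class HumanInterfaceAgent:
    """Primary point of contact; routes work to internal expert agents."""

    experts: Dict[str, AgentFn]

    def plan(self, query: str) -> List[str]:
        # Toy intent heuristic (assumption): keywords decide which experts run.
        q = query.lower()
        steps: List[str] = []
        if any(word in q for word in ("market", "data", "find", "research")):
            steps.append("researcher")
        if any(word in q for word in ("why", "compare", "analy", "market")):
            steps.append("analyst")
        steps.append("synthesizer")  # every request ends with a synthesis step
        return steps

    def handle(self, query: str) -> str:
        # Run the planned experts in order and join their notes.
        notes = [self.experts[name](query) for name in self.plan(query)]
        return "\n".join(notes)


team = HumanInterfaceAgent(
    {"researcher": researcher, "analyst": analyst, "synthesizer": synthesizer}
)

if __name__ == "__main__":
    print(team.handle("Analyze the market for home batteries"))
```

In a real build, the keyword heuristic in `plan` would likely be replaced by the interface model's own judgment of intent, with the same dictionary-of-experts shape kept for delegation.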

Status: Always open
Difficulty: Advanced
Points: 500
Challenge at a glance
Host and timing
Vera

AI Research & Mentorship

Starts: Available now
Evergreen challenge
Learning goals

What you should walk away with

Orchestrate Agentkit role-based agent teams with specialized agents (e.g., 'Researcher', 'Analyst', 'Synthesizer', 'Human Interface') for complex problem-solving and task delegation.

Master Claude Opus 4.1's capabilities for nuanced understanding of human intent, complex conversational flow, and adaptive response generation within the 'Human Interface Agent'.

Implement adaptive thinking budgets where agents dynamically adjust their processing time and resource allocation based on task complexity, urgency, and available information.

Design and deploy MCP-enabled tool integration to allow agents to seamlessly interact with enterprise APIs (e.g., CRM, data warehouses, internal knowledge bases, communication platforms).

Develop hybrid instant/deep reasoning systems, enabling the 'Human Interface Agent' to provide quick, high-level answers while simultaneously initiating deeper background research via other agents.

Build a feedback loop mechanism for the human user to provide continuous feedback, allowing the AI 'Thought Partner' to adapt its collaboration style and knowledge over time.

Utilize RAG techniques for the 'Researcher' agent to retrieve and synthesize information from diverse sources, including internal documents and external web searches.
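Two of the goals above, adaptive thinking budgets and hybrid instant/deep reasoning, can be sketched together with `asyncio`. This is a minimal sketch under stated assumptions: the `Budget` fields, the scaling formula in `allocate_budget`, and the `quick_answer` and `deep_research` coroutines are hypothetical placeholders, not part of Agentkit, MCP, or any model API.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class Budget:
    max_steps: int    # reasoning or research steps the deep pass may take
    timeout_s: float  # wall-clock limit for the deep pass


def allocate_budget(complexity: float, urgency: float) -> Budget:
    """Map complexity and urgency (each clamped to [0, 1]) to a budget.

    Hypothetical heuristic: complexity raises the budget, urgency shrinks it.
    """
    complexity = min(max(complexity, 0.0), 1.0)
    urgency = min(max(urgency, 0.0), 1.0)
    scale = complexity * (1.0 - 0.8 * urgency)
    return Budget(max_steps=1 + round(9 * scale), timeout_s=0.05 + 0.45 * scale)


async def quick_answer(query: str) -> str:
    # Placeholder for a fast, low-budget model call.
    return f"Quick take on '{query}'"


async def deep_research(query: str, budget: Budget) -> str:
    steps_done = 0
    for _ in range(budget.max_steps):
        await asyncio.sleep(0)  # placeholder for a tool or model call
        steps_done += 1
    return f"Deep report on '{query}' after {steps_done} steps"


async def respond(query: str, complexity: float, urgency: float) -> List[str]:
    budget = allocate_budget(complexity, urgency)
    # Start the deep pass in the background, then answer quickly.
    deep_task = asyncio.create_task(deep_research(query, budget))
    messages = [await quick_answer(query)]
    try:
        messages.append(await asyncio.wait_for(deep_task, timeout=budget.timeout_s))
    except asyncio.TimeoutError:
        messages.append("Deep research exceeded its budget.")
    return messages


if __name__ == "__main__":
    for line in asyncio.run(respond("enter the EU market?", 0.9, 0.2)):
        print(line)
```

The quick answer is returned first while the deep pass runs as a background task bounded by the allocated timeout. A production system would probably stream the quick answer to the user immediately rather than collect both messages in one list.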
