Agent Building
Advanced
Always open

Autonomous Market Strategy Team

This challenge involves constructing an autonomous, multi-agent market strategy team using the CrewAI framework. The team will be tasked with analyzing the competitive landscape for emerging generative AI technologies, such as new video generation models or large language models like DeepSeek. The goal is to provide strategic recommendations for businesses looking to innovate or compete in this rapidly evolving sector. The core intelligence of the agent team will be driven by DeepSeek R1, selected for its advanced reasoning capabilities and cost-efficiency, dynamically routed via OpenRouter for optimal performance across various analytical tasks. Crucially, the agents will leverage MemGPT for long-term memory, allowing them to recall past analyses, adapt to evolving market trends, and maintain conversational context over extended interactions. This enables more sophisticated and consistent strategic planning. To support the development and interaction with this complex system, GitHub Copilot will serve as an AI engineering assistant for creating custom tools and refining agent logic, while Bito AI will provide a chat interface for users to query the team, review reports, and provide feedback, fostering a human-in-the-loop oversight.

Challenge brief

What you are building

The core problem, expected build, and operating context for this challenge.

This challenge involves constructing an autonomous, multi-agent market strategy team using the CrewAI framework. The team will be tasked with analyzing the competitive landscape for emerging generative AI technologies, such as new video generation models or large language models like DeepSeek. The goal is to provide strategic recommendations for businesses looking to innovate or compete in this rapidly evolving sector. The core intelligence of the agent team will be driven by DeepSeek R1, selected for its advanced reasoning capabilities and cost-efficiency, dynamically routed via OpenRouter for optimal performance across various analytical tasks. Crucially, the agents will leverage MemGPT for long-term memory, allowing them to recall past analyses, adapt to evolving market trends, and maintain conversational context over extended interactions. This enables more sophisticated and consistent strategic planning. To support the development and interaction with this complex system, GitHub Copilot will serve as an AI engineering assistant for creating custom tools and refining agent logic, while Bito AI will provide a chat interface for users to query the team, review reports, and provide feedback, fostering a human-in-the-loop oversight.

Datasets

Shared data for this challenge

Review public datasets and any private uploads tied to your build.

Loading datasets...
Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max Score: 6
Dimensions
6 scoring checks
Binary
6 pass or fail dimensions
Ordinal
0 scaled dimensions
Dimension 1report_completeness

Report Completeness

Ensure all required sections (summary, findings, recommendations) are present in the market strategy report.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 2contextual_recall_accuracy

Contextual Recall Accuracy

Verify that the follow-up query response correctly leverages information from the initial query.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 3tool_invocation_success

Tool Invocation Success

Confirm that all necessary custom tools were successfully invoked by agents during tasks.

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 4strategic_recommendation_quality

Strategic Recommendation Quality

Manual rating (1-5) of the actionability and creativity of strategic recommendations. • target: 4 • range: 1-5

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 5deepseek_r1_utilization_rate

DeepSeek R1 Utilization Rate

Percentage of complex analytical tasks processed by DeepSeek R1 via OpenRouter. • target: 0.98 • range: 0.9-1

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Dimension 6agent_collaboration_score

Agent Collaboration Score

Metric (0-1) reflecting efficient and non-redundant task distribution among agents. • target: 0.85 • range: 0.7-1

binary
Weight: 1
Binary check

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.

Learning goals

What you should walk away with

Master CrewAI framework for defining agent roles, tasks, and processes, enabling seamless collaboration for strategic analysis.

Implement MemGPT for agents to maintain long-term memory, allowing recall of prior market analyses, company profiles, and strategic discussions.

Design and integrate custom tools for CrewAI agents to perform market research, competitor analysis, financial modeling, and report generation.

Leverage OpenRouter to dynamically select and route requests to DeepSeek R1 or other models based on task complexity, cost, or specific capabilities.

Build a conversational interface using Bito AI to allow stakeholders to interact with the CrewAI team, pose questions, and receive strategic reports.

Utilize GitHub Copilot as an AI pair programmer to accelerate the development of agent tools, task definitions, and complex Python logic within the CrewAI setup.

Develop robust evaluation criteria for the CrewAI team's generated market strategies, focusing on accuracy, completeness, and actionable insights.

Start from your terminal
$npx -y @versalist/cli start autonomous-market-strategy-team

[ok] Wrote CHALLENGE.md

[ok] Wrote .versalist.json

[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Docs
Manage API keys
Challenge at a glance
Host and timing
Vera

AI Research & Mentorship

Starts Available now
Evergreen challenge
Your progress

Participation status

You haven't started this challenge yet

Timeline and host

Operating window

Key dates and the organization behind this challenge.

Start date
Available now
Run mode
Evergreen challenge
Explore

Find another challenge

Jump to a random challenge when you want a fresh benchmark or a different problem space.

Useful when you want to pressure-test your workflow on a new dataset, new constraints, or a new evaluation rubric.

Tool Space Recipe

Draft
Evaluation
Rubric: 6 dimensions
·Report Completeness(1%)
·Contextual Recall Accuracy(1%)
·Tool Invocation Success(1%)
·Strategic Recommendation Quality(1%)
·DeepSeek R1 Utilization Rate(1%)
·Agent Collaboration Score(1%)
Gold items: 2 (2 public)

Frequently Asked Questions about Autonomous Market Strategy Team