Autonomous Market Strategy Team
This challenge involves constructing an autonomous, multi-agent market strategy team using the CrewAI framework. The team will be tasked with analyzing the competitive landscape for emerging generative AI technologies, such as new video generation models or large language models like DeepSeek. The goal is to provide strategic recommendations for businesses looking to innovate or compete in this rapidly evolving sector. The core intelligence of the agent team will be driven by DeepSeek R1, selected for its advanced reasoning capabilities and cost-efficiency, dynamically routed via OpenRouter for optimal performance across various analytical tasks. Crucially, the agents will leverage MemGPT for long-term memory, allowing them to recall past analyses, adapt to evolving market trends, and maintain conversational context over extended interactions. This enables more sophisticated and consistent strategic planning. To support the development and interaction with this complex system, GitHub Copilot will serve as an AI engineering assistant for creating custom tools and refining agent logic, while Bito AI will provide a chat interface for users to query the team, review reports, and provide feedback, fostering a human-in-the-loop oversight.
What you are building
The core problem, expected build, and operating context for this challenge.
This challenge involves constructing an autonomous, multi-agent market strategy team using the CrewAI framework. The team will be tasked with analyzing the competitive landscape for emerging generative AI technologies, such as new video generation models or large language models like DeepSeek. The goal is to provide strategic recommendations for businesses looking to innovate or compete in this rapidly evolving sector. The core intelligence of the agent team will be driven by DeepSeek R1, selected for its advanced reasoning capabilities and cost-efficiency, dynamically routed via OpenRouter for optimal performance across various analytical tasks. Crucially, the agents will leverage MemGPT for long-term memory, allowing them to recall past analyses, adapt to evolving market trends, and maintain conversational context over extended interactions. This enables more sophisticated and consistent strategic planning. To support the development and interaction with this complex system, GitHub Copilot will serve as an AI engineering assistant for creating custom tools and refining agent logic, while Bito AI will provide a chat interface for users to query the team, review reports, and provide feedback, fostering a human-in-the-loop oversight.
Shared data for this challenge
Review public datasets and any private uploads tied to your build.
How submissions are scored
These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.
Report Completeness
Ensure all required sections (summary, findings, recommendations) are present in the market strategy report.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Contextual Recall Accuracy
Verify that the follow-up query response correctly leverages information from the initial query.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Tool Invocation Success
Confirm that all necessary custom tools were successfully invoked by agents during tasks.
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Strategic Recommendation Quality
Manual rating (1-5) of the actionability and creativity of strategic recommendations. • target: 4 • range: 1-5
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
DeepSeek R1 Utilization Rate
Percentage of complex analytical tasks processed by DeepSeek R1 via OpenRouter. • target: 0.98 • range: 0.9-1
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
Agent Collaboration Score
Metric (0-1) reflecting efficient and non-redundant task distribution among agents. • target: 0.85 • range: 0.7-1
This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
What you should walk away with
Master CrewAI framework for defining agent roles, tasks, and processes, enabling seamless collaboration for strategic analysis.
Implement MemGPT for agents to maintain long-term memory, allowing recall of prior market analyses, company profiles, and strategic discussions.
Design and integrate custom tools for CrewAI agents to perform market research, competitor analysis, financial modeling, and report generation.
Leverage OpenRouter to dynamically select and route requests to DeepSeek R1 or other models based on task complexity, cost, or specific capabilities.
Build a conversational interface using Bito AI to allow stakeholders to interact with the CrewAI team, pose questions, and receive strategic reports.
Utilize GitHub Copilot as an AI pair programmer to accelerate the development of agent tools, task definitions, and complex Python logic within the CrewAI setup.
Develop robust evaluation criteria for the CrewAI team's generated market strategies, focusing on accuracy, completeness, and actionable insights.
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json
Requires VERSALIST_API_KEY. Works with any MCP-aware editor.
DocsAI Research & Mentorship
Participation status
You haven't started this challenge yet
Operating window
Key dates and the organization behind this challenge.
Find another challenge
Jump to a random challenge when you want a fresh benchmark or a different problem space.