Workflow Automation
Advanced
Always open

Automate GenAI Data Prep & System Integration

The IT industry is adapting to GenAI by focusing on data cleanup and system integration. This challenge asks you to build an autonomous multi-agent system using AutoGen, powered by Gemini 3 Pro, that automates complex data preparation and schema mapping for enterprise AI adoption. Agents collaborate to ingest raw data, perform transformations, reconcile schemas across disparate enterprise systems, and integrate the cleaned data using MCP-enabled tools. The system uses hybrid reasoning, combining Gemini's data understanding with structured data processing tools, to ensure data quality and accelerate enterprise GenAI readiness.
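As a rough illustration of the four stages named above (ingest, transform, reconcile schemas, validate before integration), here is a minimal sketch in plain Python. It deliberately uses no AutoGen, Gemini, or MCP calls: each function stands in for what an agent or tool would do, and every name in it (`CRM_TO_TARGET`, `run_pipeline`, the column names) is a hypothetical chosen for the example, not part of the challenge.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from source-system column names to a target schema.
# In the challenge this mapping would be proposed by a schema-mapping agent.
CRM_TO_TARGET = {"cust_name": "customer_name", "e_mail": "email", "amt": "amount"}


@dataclass
class PipelineResult:
    clean: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def ingest(raw_rows):
    """Ingest stage: strip whitespace from keys and from string values."""
    return [
        {k.strip(): v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for row in raw_rows
    ]


def reconcile(rows, mapping):
    """Schema stage: rename known source columns and drop unmapped ones."""
    return [{mapping[k]: v for k, v in row.items() if k in mapping} for row in rows]


def transform(rows):
    """Transform stage: lowercase emails and parse amounts as floats."""
    out = []
    for row in rows:
        row = dict(row)
        if "email" in row:
            row["email"] = row["email"].lower()
        if "amount" in row:
            try:
                row["amount"] = float(row["amount"])
            except (TypeError, ValueError):
                row["amount"] = None  # left for the QA stage to reject
        out.append(row)
    return out


def validate(rows, required=("customer_name", "email", "amount")):
    """QA stage: split rows into clean and rejected by required fields."""
    result = PipelineResult()
    for row in rows:
        ok = all(row.get(col) not in (None, "") for col in required)
        (result.clean if ok else result.rejected).append(row)
    return result


def run_pipeline(raw_rows, mapping=CRM_TO_TARGET):
    return validate(transform(reconcile(ingest(raw_rows), mapping)))
```

Keeping each stage a pure function over lists of dicts makes it straightforward to later wrap a stage as an agent tool without changing its logic.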

Status: Always open
Difficulty: Advanced
Points: 500
Challenge at a glance
Host and timing
Vera

AI Research & Mentorship

Starts: Available now
Evergreen challenge

Learning goals

What you should walk away with

Orchestrate AutoGen role-based agent teams (e.g., Data Engineer Agent, Schema Mapper Agent, Quality Assurance Agent) for collaborative data pipeline execution

Leverage Gemini 3 Pro's multi-modal capabilities for understanding diverse data formats (e.g., spreadsheets, PDFs, JSON schemas) and generating transformation logic

Implement MCP-enabled tool integration with enterprise data sources (e.g., mock CRM/ERP APIs, SQL databases) for automated data extraction and loading

Build dynamic schema mapping agents that use LlamaIndex for RAG over enterprise documentation and data dictionaries to intelligently reconcile disparate schemas

Develop hybrid reasoning workflows where Gemini 3 Pro identifies data quality issues and generates Python scripts to rectify them, which a Code Executor Agent then runs

Design graph-based data lineage tracking and transformation workflows to visualize data flow and dependencies across agents and systems

Implement self-correcting mechanisms where agents can detect and resolve data inconsistencies or integration failures autonomously
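The last two goals, lineage tracking and self-correction, can be sketched together in plain Python. This is an illustrative stand-in only, not AutoGen or Gemini code: `find_issues` plays the role of a detecting agent, `repair` the role of a generated fix script, and `Lineage` is a hypothetical minimal graph that assumes the recorded steps form a DAG. All names and the `amount` field are assumptions made for the example.

```python
from collections import defaultdict


class Lineage:
    """Minimal lineage graph: edges point from an input dataset to its output."""

    def __init__(self):
        self.edges = defaultdict(list)

    def record(self, source, step, target):
        self.edges[source].append((step, target))

    def upstream(self, target):
        """Return every dataset the target transitively depends on (DAG assumed)."""
        parents = {s for s, outs in self.edges.items() for _, t in outs if t == target}
        found = set(parents)
        for parent in parents:
            found |= self.upstream(parent)
        return found


def find_issues(rows):
    """Detector: indexes of rows whose 'amount' is missing or not a number."""
    return [i for i, r in enumerate(rows) if not isinstance(r.get("amount"), (int, float))]


def repair(rows, bad_indexes):
    """Repair: parse amounts such as '1,200.50'; leave unparseable values as-is."""
    fixed = [dict(r) for r in rows]
    for i in bad_indexes:
        try:
            fixed[i]["amount"] = float(str(fixed[i].get("amount")).replace(",", ""))
        except ValueError:
            pass
    return fixed


def self_correct(rows, lineage, name="orders", max_rounds=3):
    """Detect, repair, and re-check until clean, stuck, or out of rounds."""
    current = f"{name}_v0"
    for round_no in range(1, max_rounds + 1):
        bad = find_issues(rows)
        if not bad:
            break
        repaired = repair(rows, bad)
        if find_issues(repaired) == bad:
            break  # no progress: stop and report instead of looping
        rows = repaired
        nxt = f"{name}_v{round_no}"
        lineage.record(current, "repair_amount", nxt)
        current = nxt
    return rows, current, find_issues(rows)
```

The loop returns the remaining bad row indexes rather than raising, so a supervising agent or a human can decide how to handle records the automatic repair could not fix.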


