AI Development
Advanced
Always open

Agent for Ethical, Personalized Content Recommendation

Build an ethical, personalized content summarization and recommendation system for digital subscribers using Pydantic AI. The challenge centers on Pydantic AI's structured output generation and schema validation: summaries must be factual, recommendations must be relevant, and both must adhere strictly to ethical guidelines (e.g., 'ad-free', 'no advertiser influence'). You will design agents that interact with a content ingestion pipeline, generate structured summaries of articles, and provide personalized recommendations based on user profiles stored in a vector database. Every generated summary and recommendation must conform to predefined Pydantic models, reflecting Anthropic's commitment to unbiased content.

Evaluation rubric

How submissions are scored

These dimensions define what the evaluator checks, how much each dimension matters, and which criteria separate a passable run from a strong one.

Max score: 6
Dimensions: 6 scoring checks
Binary: 6 pass-or-fail dimensions
Ordinal: 0 scaled dimensions
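
The arithmetic behind the maximum score can be sketched as follows. This is a minimal illustration that assumes the total is simply the sum of the weights of the dimensions that pass; the page does not spell out the evaluator's formula, and `total_score` is a hypothetical helper. The dimension identifiers are the ones used in this rubric.

```python
# Hypothetical scoring sketch: assumes total = sum of weights of passed dimensions.
WEIGHTS = {
    "pydantic_schema_validation": 1,
    "ad_free_content_check": 1,
    "recommendation_relevance_check": 1,
    "summary_factual_accuracy": 1,
    "personalization_score": 1,
    "output_generation_latency": 1,
}


def total_score(passed: dict[str, bool]) -> int:
    """Sum the weights of the dimensions that passed; no partial credit."""
    return sum(weight for name, weight in WEIGHTS.items() if passed.get(name, False))


# Example run: everything passes except the latency check.
results = {name: True for name in WEIGHTS}
results["output_generation_latency"] = False
print(total_score(results))  # 5 of a possible 6
```
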
Dimension 1: pydantic_schema_validation

Pydantic Schema Validation

Ensures the generated summary and recommendations strictly adhere to their Pydantic schemas.

Type: binary
Weight: 1

This dimension contributes its full weight only when the submission satisfies the requirement. Partial credit is not awarded.
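
A schema check of this kind might look like the sketch below, written against Pydantic v2. The model names (`ArticleSummary`, `Recommendation`, `AgentOutput`), the allowed sentiment labels, and the `passes_schema` helper are assumptions for illustration; the challenge does not publish its actual schemas. The field names follow the examples given in the learning goals (title, key points, sentiment; article ID, relevance score, reason).

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ArticleSummary(BaseModel):
    title: str = Field(min_length=1)
    key_points: list[str] = Field(min_length=1)  # at least one key point
    sentiment: Literal["positive", "neutral", "negative"]  # assumed label set


class Recommendation(BaseModel):
    article_id: str = Field(min_length=1)
    relevance_score: float = Field(ge=0.0, le=1.0)  # assumed 0-1 scale
    reason: str = Field(min_length=1)


class AgentOutput(BaseModel):
    summary: ArticleSummary
    recommendations: list[Recommendation] = Field(min_length=1)


def passes_schema(raw_json: str) -> bool:
    """Return True only if the raw JSON parses and validates as AgentOutput."""
    try:
        AgentOutput.model_validate_json(raw_json)
        return True
    except ValidationError:
        return False
```

Because the check is binary, a single out-of-range `relevance_score` or a missing field fails the whole output, so it is worth validating before submitting rather than after.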

Dimension 2: ad_free_content_check

Ad-Free Content Check

Verifies that no output explicitly contains sponsored links or overt advertiser influence.

Type: binary
Weight: 1
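
One way to pre-screen outputs for this check is a simple heuristic filter, sketched below with the standard library only. The term list, the tracking-parameter list, and the function names are all hypothetical; the evaluator's real criteria are not published, so treat this as a local sanity check rather than a reproduction of the official test.

```python
import re
from urllib.parse import parse_qs, urlparse

# Hypothetical markers of advertiser influence; tune these for real content.
SPONSORED_TERMS = re.compile(
    r"\b(sponsored|advertisement|affiliate link|promoted by|paid partnership)\b",
    re.IGNORECASE,
)
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "affiliate_id"}
URL_PATTERN = re.compile(r"https?://\S+")


def find_ad_signals(text: str) -> list[str]:
    """Collect sponsored wording and tracking parameters found in the text."""
    signals = [f"term:{m.group(0).lower()}" for m in SPONSORED_TERMS.finditer(text)]
    for url in URL_PATTERN.findall(text):
        # Strip trailing punctuation that the URL regex may have swallowed.
        params = parse_qs(urlparse(url.rstrip(".,)")).query)
        flagged = sorted(TRACKING_PARAMS.intersection(params))
        signals.extend(f"param:{name}" for name in flagged)
    return signals


def is_ad_free(text: str) -> bool:
    """True when no ad signal was detected by the heuristics above."""
    return not find_ad_signals(text)
```

A keyword filter like this only catches overt cases; subtler advertiser influence would need a review step or a model-based classifier.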

Dimension 3: recommendation_relevance_check

Recommendation Relevance Check

Ensures at least one recommendation is highly relevant to the user's profile and reading history.

Type: binary
Weight: 1

Dimension 4: summary_factual_accuracy

Summary Factual Accuracy

Measures the correctness of facts presented in the summary on a 0-1 scale. Target: 0.95; range: 0.8-1.

Type: binary
Weight: 1

Dimension 5: personalization_score

Personalization Score

Quantifies how well recommendations align with user preferences and history on a 0-1 scale. Target: 0.9; range: 0.75-1.

Type: binary
Weight: 1
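
The evaluator's formula for this score is not published. As a rough local proxy, one could average the cosine similarity between a user's profile embedding and the embeddings of the recommended articles, as in the sketch below. The metric itself, the function names, and the reading of the 0.75-1 range as a pass band are all assumptions.

```python
import math

PASS_RANGE = (0.75, 1.0)  # range stated in the rubric; treating it as a pass band is an assumption


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def personalization_score(profile: list[float], recommended: list[list[float]]) -> float:
    """Mean similarity between the profile and each recommended item, clipped at 0."""
    if not recommended:
        return 0.0
    sims = [max(0.0, cosine_similarity(profile, vec)) for vec in recommended]
    return sum(sims) / len(sims)


def in_pass_range(score: float) -> bool:
    low, high = PASS_RANGE
    return low <= score <= high
```

With a profile of `[1.0, 0.0]` and two recommendations `[1.0, 0.0]` and `[0.0, 1.0]`, the proxy score is 0.5, which would fall outside the assumed pass band.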

Dimension 6: output_generation_latency

Output Generation Latency

Time taken to generate both summary and recommendations, in seconds. Target: 2; range: 0-5.

Type: binary
Weight: 1
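
To see where a build stands against these numbers, end-to-end latency can be measured locally with a small timing wrapper. The sketch below uses `time.perf_counter`; `fake_pipeline` is a hypothetical stand-in for the real summarize-and-recommend call, and the two constants simply restate the target and upper bound from the rubric.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

LATENCY_TARGET_S = 2.0   # target stated in the rubric
LATENCY_CEILING_S = 5.0  # upper end of the stated range


def timed(fn: Callable[[], T]) -> tuple[T, float]:
    """Run fn once and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def fake_pipeline() -> dict:
    """Placeholder for the real summary + recommendation generation."""
    time.sleep(0.01)
    return {"summary": "...", "recommendations": []}


output, elapsed = timed(fake_pipeline)
print(f"elapsed={elapsed:.3f}s, under ceiling: {elapsed <= LATENCY_CEILING_S}")
```

Measuring the summary and recommendation steps together matters here, since the rubric times both; running them concurrently is one option if a sequential run lands above the target.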

Learning goals

What you should walk away with

Master Pydantic AI for defining agent capabilities, structured output generation, and robust schema validation for LLM interactions.

Design comprehensive Pydantic models for content summaries (e.g., title, key points, sentiment) and personalized recommendations (e.g., article ID, relevance score, reason).

Implement a content summarization agent using Mistral-large via Pydantic AI, ensuring outputs strictly conform to the defined schemas and ethical guidelines.

Orchestrate a multi-stage data pipeline with Apache Airflow to ingest new articles, process them, generate embeddings, and update the recommendation engine.

Utilize PostgreSQL with the pgvector extension to store content embeddings and user interaction histories, enabling efficient similarity search for recommendations.

Develop personalized recommendation agents that query the pgvector database based on user profiles and generated summaries, providing relevant and unbiased suggestions.

Build an interactive web interface using Streamlit to showcase the ad-free summaries and personalized recommendations to subscribers, allowing for feedback and interaction.
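
As a starting point for the summarization goal above, a Pydantic AI agent with a structured output type might be wired up roughly as follows. This is a sketch under several assumptions: a recent `pydantic-ai` release (older versions named the parameter `result_type` and exposed `result.data` instead of `result.output`), the `mistral:mistral-large-latest` model identifier, a `MISTRAL_API_KEY` set in the environment, and a hypothetical `ArticleSummary` schema and prompt wording.

```python
from pydantic import BaseModel, Field


class ArticleSummary(BaseModel):
    """Assumed summary schema; adapt the fields to your own design."""
    title: str
    key_points: list[str] = Field(min_length=1)
    sentiment: str


# Hypothetical wording for the ethical guidelines named in the brief.
ETHICS_PROMPT = (
    "Summarize the article factually. Do not include sponsored links, "
    "promotional language, or advertiser-influenced claims."
)


def build_summary_agent():
    # Imported lazily so the schema above is usable without pydantic-ai installed.
    from pydantic_ai import Agent

    return Agent(
        "mistral:mistral-large-latest",  # assumed model identifier
        output_type=ArticleSummary,      # the agent's output is validated against this model
        system_prompt=ETHICS_PROMPT,
    )


def summarize(article_text: str) -> ArticleSummary:
    """Run the agent once and return the validated summary."""
    result = build_summary_agent().run_sync(article_text)
    return result.output
```

Calling `summarize(article_text)` needs network access and a valid key. The design point is that the agent returns an `ArticleSummary` instance rather than free text, so downstream steps such as embedding, storage in pgvector, and display in Streamlit can rely on the fields being present.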

Start from your terminal
$ npx -y @versalist/cli start agent-for-ethical-personalized-content-recommendation
[ok] Wrote CHALLENGE.md
[ok] Wrote .versalist.json
[ok] Wrote eval/examples.json

Requires VERSALIST_API_KEY. Works with any MCP-aware editor.

Challenge at a glance

Host: Vera (AI Research & Mentorship)
Starts: Available now
Run mode: Evergreen challenge
Evaluation
Rubric: 6 dimensions
· Pydantic Schema Validation (weight 1)
· Ad-Free Content Check (weight 1)
· Recommendation Relevance Check (weight 1)
· Summary Factual Accuracy (weight 1)
· Personalization Score (weight 1)
· Output Generation Latency (weight 1)
Gold items: 1 (1 public)
