Design LlamaIndex Data Connectors

Implementation Challenge

Prompt Content

Using LlamaIndex, design and implement a data ingestion pipeline that can pull data from a simulated legal document repository (e.g., local PDF files for filings), a news API (simulated with local JSON files), and a public company website (simulated via web scraping of a local HTML file).

Use `SimpleDirectoryReader`, a web page reader such as `SimpleWebPageReader`, or custom `BaseReader` implementations to load these documents, then create a `VectorStoreIndex` backed by `PineconeVectorStore` for efficient retrieval. Provide Python code snippets for initializing LlamaIndex, defining your readers, and setting up the index.
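As a rough illustration of the kind of answer this prompt is after, here is a minimal sketch, not a definitive implementation. The class names `NewsJSONReader` and `LocalHTMLReader`, the helper `build_index`, the JSON layout (a list of objects with `title` and `body` keys), and the `PINECONE_API_KEY` environment variable are all assumptions made for the example. If LlamaIndex is not installed, small stand-in classes replace `Document` and `BaseReader` so the reader logic can still be run locally.

```python
import json
from html.parser import HTMLParser
from pathlib import Path

try:
    from llama_index.core import Document
    from llama_index.core.readers.base import BaseReader
except ImportError:
    # Stand-ins (assumption) so the reader logic runs without LlamaIndex.
    class Document:
        def __init__(self, text, metadata=None):
            self.text = text
            self.metadata = metadata or {}

    class BaseReader:
        pass


class NewsJSONReader(BaseReader):
    """Hypothetical reader: one Document per article in each *.json file.

    Assumes each file holds a list of {"title": ..., "body": ...} objects.
    """

    def load_data(self, directory):
        docs = []
        for path in sorted(Path(directory).glob("*.json")):
            for article in json.loads(path.read_text(encoding="utf-8")):
                docs.append(Document(
                    text=article["body"],
                    metadata={"source": "news", "title": article["title"],
                              "file": path.name},
                ))
        return docs


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


class LocalHTMLReader(BaseReader):
    """Hypothetical reader: turns one local HTML file into one Document."""

    def load_data(self, file):
        parser = _TextExtractor()
        parser.feed(Path(file).read_text(encoding="utf-8"))
        return [Document(text=" ".join(parser.parts),
                         metadata={"source": "website", "file": Path(file).name})]


def build_index(documents, index_name):
    """Not called in this sketch: needs llama-index, the Pinecone
    integration package, an existing Pinecone index, and API keys."""
    import os
    from llama_index.core import StorageContext, VectorStoreIndex
    from llama_index.vector_stores.pinecone import PineconeVectorStore
    from pinecone import Pinecone

    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    vector_store = PineconeVectorStore(pinecone_index=pc.Index(index_name))
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    return VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

PDF filings could be loaded separately with `SimpleDirectoryReader(input_dir=..., required_exts=[".pdf"]).load_data()` and concatenated with the documents from the two readers above before calling `build_index`. Tagging each document with a `source` metadata field is one way to keep filings, news, and website content distinguishable at retrieval time.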

Usage Tips

Copy the prompt and paste it into your preferred AI tool, such as Claude, ChatGPT, or Gemini.

Customize any placeholder values to match your specific requirements and context.

For best results, provide clear examples and test different variations.