The Problem
Your current RAG pipeline is a rigid sequence: retrieve, then answer. Every query, no matter how trivial, goes through the vector store. Ask "What is 2 + 2?" and it retrieves irrelevant enterprise documents before answering. Ask a complex question and it retrieves once, gets a partial result, and gives an incomplete answer, because it has no way to search again with a better query. The pipeline lacks agency: the ability to decide when to retrieve, what query to use, and whether the results are good enough. Your job is to build an agentic RAG system where the agent makes these decisions dynamically.
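To make the starting point concrete, here is a minimal sketch of the rigid pipeline described above. The functions toy_retrieve and toy_generate are hypothetical stand-ins for a vector store and an LLM call; they are not part of any real library and only illustrate the control flow.

```python
from typing import Callable, List


def toy_retrieve(query: str) -> List[str]:
    """Hypothetical vector search: always returns the same top document."""
    return ["Enterprise onboarding guide, section 3."]


def toy_generate(question: str, context: List[str]) -> str:
    """Hypothetical LLM call: only reports how much context it was handed."""
    return f"answer to {question!r} using {len(context)} document(s)"


def rigid_rag(
    question: str,
    retrieve: Callable[[str], List[str]] = toy_retrieve,
    generate: Callable[[str, List[str]], str] = toy_generate,
) -> str:
    # Step 1 always runs, and always with the raw user question.
    docs = retrieve(question)
    # Step 2 always runs on whatever step 1 returned: no relevance check,
    # no second search.
    return generate(question, docs)


print(rigid_rag("What is 2 + 2?"))
```

Even the arithmetic question is answered with a retrieved document in context, which is the behavior the task asks you to remove.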
Examples
Example 1
User input: What is the enterprise API rate limit?
Current (bad) output: Retrieves documents and answers, but the pipeline cannot evaluate whether the retrieval was sufficient; it always answers from whatever the first search yields.
Expected (good) output: The agent decides this question needs retrieval, searches the knowledge base, finds the rate limit document, and answers: The enterprise API rate limit is 10,000 requests per minute.
Example 2
User input: What is 2 + 2?
Current (bad) output: Retrieves irrelevant enterprise documents, then tries to answer "2 + 2" from them, which wastes time and risks contaminating the answer with irrelevant context.
Expected (good) output: The agent recognizes this is a simple math question, skips retrieval entirely, and answers: 4.
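The routing decision in Examples 1 and 2 could be sketched as follows. This is a rule-based stand-in, assuming a regular expression is enough to spot plain arithmetic; a real agent would more likely ask an LLM to classify the query. The names ARITHMETIC and route are hypothetical.

```python
import re

# Matches inputs made only of digits, whitespace, and arithmetic symbols,
# optionally preceded by "what is", "calculate", or "compute".
ARITHMETIC = re.compile(
    r"^\s*(what is|calculate|compute)?\s*[\d\s+\-*/().]+\??\s*$",
    re.IGNORECASE,
)


def route(question: str) -> str:
    """Return "direct" for plain arithmetic, otherwise "retrieve"."""
    if ARITHMETIC.match(question):
        return "direct"
    return "retrieve"


for q in ["What is 2 + 2?", "What is the enterprise API rate limit?"]:
    print(q, "->", route(q))
```

Defaulting to "retrieve" when the pattern does not match is a deliberate choice in this sketch: an unnecessary search costs time, while a skipped search can cost correctness.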
Example 3
User input: What security certifications does the company have, and in which data regions can I store data?
Current (bad) output: Retrieves one document that answers only one half of the question and misses the other half.
Expected (good) output: The agent retrieves, evaluates the results, sees it needs more information, retrieves again with a refined query, and combines: The company holds SOC 2 Type II certification (renewed January 2024). Data residency options include US-East, EU-West, and AP-Southeast regions.
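The retrieve, evaluate, refine cycle from Example 3 might look like the sketch below. It assumes a toy in-memory knowledge base built from the two facts in the example, a top-1 word-overlap search in place of a vector store, and a keyword check in place of an LLM sufficiency judgment. KNOWLEDGE_BASE, ASPECTS, search, and answer_with_refinement are all hypothetical names.

```python
from typing import Dict, List, Tuple

# Toy knowledge base: topic -> document text (facts taken from Example 3).
KNOWLEDGE_BASE: Dict[str, str] = {
    "security certifications":
        "The company holds SOC 2 Type II certification (renewed January 2024).",
    "data regions":
        "Data residency options include US-East, EU-West, and AP-Southeast regions.",
}

# Aspects the answer must cover, each with a keyword that signals coverage.
ASPECTS: Dict[str, str] = {
    "security certifications": "certification",
    "data regions": "region",
}

QUESTION = ("What security certifications does the company have, "
            "and in which data regions can I store data?")


def search(query: str) -> List[str]:
    """Top-1 search by word overlap with the topic; ties go to the first entry."""
    words = set(query.lower().split())
    scored = [(len(words & set(topic.split())), text)
              for topic, text in KNOWLEDGE_BASE.items()]
    best_score, best_text = max(scored, key=lambda pair: pair[0])
    return [best_text] if best_score > 0 else []


def answer_with_refinement(question: str, aspects: Dict[str, str],
                           max_rounds: int = 3) -> Tuple[str, List[str]]:
    docs: List[str] = []
    queries: List[str] = []
    query = question
    for _ in range(max_rounds):
        queries.append(query)
        for doc in search(query):
            if doc not in docs:
                docs.append(doc)
        # Sufficiency check: which aspects are still not mentioned anywhere?
        missing = [name for name, keyword in aspects.items()
                   if not any(keyword in d.lower() for d in docs)]
        if not missing:
            break
        # Refinement: target the first uncovered aspect with a narrower query.
        query = missing[0]
    return " ".join(docs), queries


answer, queries = answer_with_refinement(QUESTION, ASPECTS)
print(queries)
print(answer)
```

The max_rounds cap keeps the loop from running forever when the knowledge base simply does not contain the missing information.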
Your Task
Build an agentic RAG system where the agent:
- Classifies incoming queries to decide whether retrieval is needed.
- Formulates its own search queries (not just passing the raw user question).
- Evaluates retrieved results and decides if they're sufficient or if another retrieval round is needed.
- Answers simple questions directly without unnecessary retrieval.
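One possible way to organize the four responsibilities above is a small class whose decisions are injected as callables, so each one can later be backed by an LLM call. This is a structural sketch only, not a reference solution; the class name AgenticRAG, its field names, and the lambda stubs in the usage section are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgenticRAG:
    # Each callable is one decision the agent owns.
    needs_retrieval: Callable[[str], bool]
    formulate_query: Callable[[str, List[str]], str]  # (question, docs so far)
    search: Callable[[str], List[str]]
    is_sufficient: Callable[[str, List[str]], bool]
    generate: Callable[[str, List[str]], str]
    max_rounds: int = 3
    trace: List[str] = field(default_factory=list)

    def run(self, question: str) -> str:
        self.trace.clear()
        docs: List[str] = []
        if not self.needs_retrieval(question):
            # Simple question: skip the knowledge base entirely.
            self.trace.append("direct")
            return self.generate(question, docs)
        for _ in range(self.max_rounds):
            # The agent writes its own query instead of forwarding raw input.
            query = self.formulate_query(question, docs)
            self.trace.append(f"search:{query}")
            for doc in self.search(query):
                if doc not in docs:
                    docs.append(doc)
            if self.is_sufficient(question, docs):
                break
        return self.generate(question, docs)


# Usage with toy stubs standing in for LLM and vector-store calls.
agent = AgenticRAG(
    needs_retrieval=lambda q: "rate limit" in q.lower(),
    formulate_query=lambda q, docs: "enterprise API rate limit",
    search=lambda query: ["The enterprise API rate limit is 10,000 requests per minute."],
    is_sufficient=lambda q, docs: len(docs) > 0,
    generate=lambda q, docs: " ".join(docs) if docs else "(answered directly)",
)
print(agent.run("What is the enterprise API rate limit?"), agent.trace)
print(agent.run("What is 2 + 2?"), agent.trace)
```

Recording a trace of decisions makes it easy to verify that a simple question never touched the search function and that a knowledge question did.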
Evaluation
Submissions are checked for the following:
- Decides when to retrieve: The agent intelligently decides whether a query requires document retrieval or can be answered directly.
- Formulates search queries: The agent generates appropriate search queries rather than passing raw user input.
- Evaluates retrieval sufficiency: The agent checks whether retrieved results are sufficient and can retrieve again if needed.
- Routes queries correctly: Knowledge questions go to retrieval while simple questions are answered directly.