#57. Multi-Index RAG

Hard · RAG

The Problem

Your application serves two audiences: end users who ask FAQ-style questions ("How do I reset my password?") and developers who ask technical questions ("What is the API endpoint for password reset?"). Both sets of documents are dumped into a single vector store index. When an end user asks how to reset their password, the retriever returns a mix of the FAQ answer and raw API endpoint documentation—confusing the user. When a developer asks for the API endpoint, they get the FAQ answer instead of the technical spec. Your job is to split the documents into separate indexes and route queries to the correct one based on query type.

Examples

Example 1

User input: How do I reset my password?

Current (bad) output: A confusing mix of "Go to Settings > Security > Reset Password" and "POST /api/v2/auth/reset-password - Requires email field" — the user doesn't need API details.

Expected (good) output: Go to Settings > Security > Reset Password. (Answer from FAQ index only.)

Example 2

User input: What is the API endpoint for password reset?

Current (bad) output: Returns the FAQ "Go to Settings" answer instead of the API specification.

Expected (good) output: POST /api/v2/auth/reset-password — Requires email field. Returns 200 with reset token. (Answer from technical index only.)

Example 3

User input: What payment methods do you accept?

Current (bad) output: Might return billing API documentation alongside the actual FAQ answer about Visa, Mastercard, and PayPal.

Expected (good) output: We accept Visa, Mastercard, and PayPal.

Your Task

Restructure the RAG pipeline to:

Store FAQ documents and technical documents in separate vector store indexes.
Add a routing step that classifies each query as FAQ or technical before retrieval.
Search only the appropriate index based on the query classification.
Return clean, relevant answers from the correct document set.
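
The four steps above can be sketched without any external services. This is a minimal illustration under stated assumptions, not a reference solution: `classify_query` uses a hypothetical keyword list where a real pipeline would more likely ask the LLM to label the query, and `make_retriever` is a toy word-overlap ranker standing in for one vector store index per document set (in the starter code that would be something like `FAISS.from_documents(faq_docs, embeddings).as_retriever()` and the same for `technical_docs`). All helper names here are invented for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

Retriever = Callable[[str], List[str]]

# Hypothetical hint words; a stand-in for an LLM-based classifier.
TECHNICAL_HINTS = {"api", "endpoint", "token", "header", "request", "response"}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify_query(question: str) -> str:
    """Return "technical" or "faq" without touching either index."""
    return "technical" if TECHNICAL_HINTS & set(tokenize(question)) else "faq"


def make_retriever(docs: List[str], k: int = 1) -> Retriever:
    """Toy word-overlap retriever standing in for one vector store index."""
    def retrieve(question: str) -> List[str]:
        query_words = set(tokenize(question))
        ranked = sorted(
            docs,
            key=lambda d: len(query_words & set(tokenize(d))),
            reverse=True,
        )
        return ranked[:k]
    return retrieve


def route_and_retrieve(
    question: str, indexes: Dict[str, Retriever]
) -> Tuple[str, List[str]]:
    """Classify first, then search only the index chosen by the classifier."""
    label = classify_query(question)
    return label, indexes[label](question)


faq_texts = [
    "Q: How do I reset my password? A: Go to Settings > Security > Reset Password.",
    "Q: What payment methods do you accept? A: We accept Visa, Mastercard, and PayPal.",
    "Q: How do I cancel my subscription? A: Go to Billing > Manage Subscription > Cancel.",
]
technical_texts = [
    "POST /api/v2/auth/reset-password - Requires email field. Returns 200 with reset token.",
    "GET /api/v2/billing/subscription - Returns subscription object with status, plan, and next_billing_date.",
    "Authentication: All API requests require Bearer token in Authorization header.",
]

# One index per document set; nothing is shared between them.
indexes = {
    "faq": make_retriever(faq_texts),
    "technical": make_retriever(technical_texts),
}

for q in ["How do I reset my password?", "What is the API endpoint for password reset?"]:
    label, hits = route_and_retrieve(q, indexes)
    print(f"{label}: {hits[0]}")
```

Keeping the retrievers in a dictionary keyed by the classifier's label means the routing step reduces to a single lookup, and swapping the toy retrievers for real vector store retrievers should not change `route_and_retrieve`.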

Evaluation

Submissions are checked for the following:

Uses separate indexes: FAQ and technical documents are stored in separate vector store indexes.
Routes queries to correct index: The agent correctly routes FAQ questions to the FAQ index and technical questions to the technical index.
Returns accurate answers: Answers are sourced from the correct index and are accurate for the query type.

Constraints

FAQ and technical documentation must be stored in separate vector store indexes
The agent must route queries to the correct index based on query type
The routing decision must happen before retrieval, not after
Both indexes must remain searchable for their respective query types
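
One way to check the "route before retrieval" constraint is to count how often each index is searched. The sketch below is an assumption-laden illustration: `CountingIndex` is a hypothetical test double rather than a real vector store, and `classify` is a keyword stand-in for whatever classifier the pipeline actually uses. If routing happened after retrieval, both counters would increase on every query.

```python
from typing import Dict, List


class CountingIndex:
    """Hypothetical test double that records how often it is searched."""

    def __init__(self, docs: List[str]) -> None:
        self.docs = docs
        self.calls = 0

    def search(self, question: str) -> List[str]:
        self.calls += 1
        return self.docs


def classify(question: str) -> str:
    """Stand-in classifier (an assumption); a real one might call the LLM."""
    words = question.lower().replace("?", " ").split()
    return "technical" if "api" in words or "endpoint" in words else "faq"


def retrieve_routed(question: str, indexes: Dict[str, CountingIndex]) -> List[str]:
    label = classify(question)              # routing decision comes first
    return indexes[label].search(question)  # exactly one index is searched


indexes = {
    "faq": CountingIndex(
        ["Q: How do I reset my password? A: Go to Settings > Security > Reset Password."]
    ),
    "technical": CountingIndex(
        ["POST /api/v2/auth/reset-password - Requires email field. Returns 200 with reset token."]
    ),
}

retrieve_routed("How do I reset my password?", indexes)
faq_only = (indexes["faq"].calls, indexes["technical"].calls)

retrieve_routed("What is the API endpoint for password reset?", indexes)
after_both = (indexes["faq"].calls, indexes["technical"].calls)

print(faq_only, after_both)
```

Each query increments only the counter of the index it was routed to, and both indexes remain reachable for their own query type.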

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

llm = ChatOpenAI(model="gpt-4o-mini")
embeddings = OpenAIEmbeddings()

faq_docs = [
    Document(page_content="Q: How do I reset my password? A: Go to Settings > Security > Reset Password."),
    Document(page_content="Q: What payment methods do you accept? A: We accept Visa, Mastercard, and PayPal."),
    Document(page_content="Q: How do I cancel my subscription? A: Go to Billing > Manage Subscription > Cancel."),
]

technical_docs = [
    Document(page_content="POST /api/v2/auth/reset-password - Requires email field. Returns 200 with reset token."),
    Document(page_content="GET /api/v2/billing/subscription - Returns subscription object with status, plan, and next_billing_date."),
    Document(page_content="Authentication: All API requests require Bearer token in Authorization header."),
]

# BUG: Everything dumped into a single index, so FAQ and technical docs get mixed
all_docs = faq_docs + technical_docs
vectorstore = FAISS.from_documents(all_docs, embeddings)
retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based on context.\n\nContext: {context}"),
    ("human", "{question}"),
])


def ask(question: str) -> str:
    docs = retriever.invoke(question)
    context = "\n".join([doc.page_content for doc in docs])
    chain = prompt | llm
    result = chain.invoke({"context": context, "question": question})
    return result.content


# FAQ question gets technical API docs mixed in
print(ask("How do I reset my password?"))

# Technical question gets FAQ mixed in
print(ask("What is the API endpoint for password reset?"))