Agent Foundry

#23. Dynamic Tool Registry

Medium · Tool Calling

The Problem

Your universal assistant agent has 15 tools spanning web search, finance, weather, email, calendar, translation, and more. Every query, no matter how simple, sends all 15 tool schemas to the LLM. This wastes tokens, increases latency, and sometimes leads the model to pick the wrong tool because it has too many options to choose from. Your job is to select only the 3-4 most relevant tools for each query before passing them to the agent.

Examples

Example 1

User input: What's the weather in Paris?

Current (bad) output: The LLM receives schemas for all 15 tools. It works, but wastes tokens sending irrelevant tools like calculate_mortgage, translate_text, and send_email. Occasionally it picks the wrong tool.

Expected (good) output: The agent receives only get_weather, get_forecast, and search_web. It quickly calls get_weather("Paris") and responds accurately.

Example 2

User input: What's AAPL stock price and convert 100 USD to EUR?

Current (bad) output: All 15 tools are loaded. The LLM may get confused between calculate_tip and get_exchange_rate, or miss the stock lookup entirely.

Expected (good) output: The agent receives get_stock_price, get_exchange_rate, search_web, and convert_units — just the relevant tools. It handles both parts of the query correctly.

Your Task

  • Build a tool selection layer that picks the 3-4 most relevant tools from the registry based on the user's query.
  • The selection mechanism can use keyword matching, semantic similarity, or an LLM-based router.
  • Pass only the selected tools to the agent for each query.
  • Ensure the system still works for queries that span multiple categories.
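As one possible starting point, the keyword-matching option can be sketched in plain Python. The keyword sets, the fallback list, and the function name `select_tool_names` are all assumptions made for illustration; only the tool names come from the starter code. The function scores each tool by how many of its keywords appear in the query, keeps the best matches, and pads with general-purpose tools when fewer than three match.

```python
import re

# Hypothetical keyword sets, one per tool in the registry; tune for your data.
TOOL_KEYWORDS = {
    "search_web": {"search", "find", "lookup", "latest"},
    "search_news": {"news", "headlines", "article"},
    "get_stock_price": {"stock", "stocks", "share", "ticker", "price"},
    "get_exchange_rate": {"exchange", "currency", "usd", "eur", "convert"},
    "get_weather": {"weather", "temperature", "rain"},
    "get_forecast": {"forecast", "weather", "tomorrow"},
    "calculate_mortgage": {"mortgage", "loan", "principal"},
    "calculate_tip": {"tip", "gratuity", "bill"},
    "translate_text": {"translate", "translation"},
    "summarize_text": {"summarize", "summary", "tldr"},
    "send_email": {"email", "mail"},
    "create_calendar_event": {"calendar", "event", "meeting", "schedule"},
    "set_reminder": {"reminder", "remind"},
    "get_definition": {"definition", "define", "meaning"},
    "convert_units": {"convert", "units", "miles", "km", "inches", "cm"},
}

# Assumed general-purpose tools used to pad short selections.
FALLBACK_TOOLS = ["search_web", "search_news", "get_definition"]


def select_tool_names(query, min_tools=3, max_tools=4):
    """Return 3-4 tool names ranked by keyword overlap with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {name: len(words & kws) for name, kws in TOOL_KEYWORDS.items()}
    # Keep only tools with at least one hit, best score first (stable sort).
    ranked = sorted((n for n in scores if scores[n] > 0), key=lambda n: -scores[n])
    selected = ranked[:max_tools]
    # Pad with fallbacks so the agent always sees at least min_tools tools.
    for name in FALLBACK_TOOLS:
        if len(selected) >= min_tools:
            break
        if name not in selected:
            selected.append(name)
    return selected


print(select_tool_names("What's the weather in Paris?"))
print(select_tool_names("What's AAPL stock price and convert 100 USD to EUR?"))
```

For the query in Example 1 this sketch selects get_weather, get_forecast, and search_web. Exact word matching is brittle (it misses synonyms and inflections), which is where the semantic-similarity or LLM-router options may do better.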

Evaluation

Submissions are checked for the following:

  • Only relevant tools are exposed: The LLM receives 3-4 tools per query, not all 15.
  • Selection is query-driven: Tool selection is based on the user's query, not hardcoded.
  • Selected tools are appropriate: The exposed tools match the intent of the user's query.
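The actual grader is not shown here, but the three criteria can be approximated with a small self-check. The sketch below is hypothetical: `check_selection`, the test cases, and the deliberately hardcoded selector are illustrative names, not part of the challenge. It verifies the 3-4 tool limit, that every selected tool exists in the registry, that expected tools are present, and that different queries do not all receive the same tool set.

```python
def check_selection(selector, registry_names, cases):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    distinct = set()
    for query, must_include in cases.items():
        chosen = list(selector(query))
        distinct.add(frozenset(chosen))
        if not 3 <= len(chosen) <= 4:
            problems.append(f"{query!r}: exposed {len(chosen)} tools, expected 3-4")
        unknown = set(chosen) - set(registry_names)
        if unknown:
            problems.append(f"{query!r}: not in registry: {sorted(unknown)}")
        missing = set(must_include) - set(chosen)
        if missing:
            problems.append(f"{query!r}: missing expected tools: {sorted(missing)}")
    # Identical selections for every query suggest the choice is hardcoded.
    if len(cases) > 1 and len(distinct) == 1:
        problems.append("every query got the same tools; selection looks hardcoded")
    return problems


REGISTRY = [
    "search_web", "search_news", "get_stock_price", "get_exchange_rate",
    "get_weather", "get_forecast", "calculate_mortgage", "calculate_tip",
    "translate_text", "summarize_text", "send_email", "create_calendar_event",
    "set_reminder", "get_definition", "convert_units",
]

# Hypothetical cases: query -> tools that should be among those exposed.
CASES = {
    "What's the weather in Paris?": ["get_weather"],
    "What's AAPL stock price?": ["get_stock_price"],
}


def hardcoded_selector(query):
    """A deliberately bad selector that ignores the query."""
    return ["search_web", "get_weather", "get_forecast"]


problems = check_selection(hardcoded_selector, REGISTRY, CASES)
for problem in problems:
    print(problem)
```

Running it against the hardcoded selector reports two problems: the stock query is missing get_stock_price, and both queries received the same tools.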

Constraints

  • All 15 tools must remain in the registry, so the agent can reach any of them when a query calls for it
  • Only 3-4 relevant tools should be exposed to the LLM per query
  • Tool selection must be based on the user's query, not hardcoded
  • The agent must still work correctly if a query spans multiple categories
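The multi-category constraint is easy to violate with a plain top-k ranking, because one strongly matching category can fill every slot. One way to guard against that, sketched below under assumptions, is to interleave picks across categories. The category labels, the example scores, and the function name `pick_across_categories` are hypothetical; the scores are assumed to come from whatever selection mechanism you use.

```python
from itertools import zip_longest


def pick_across_categories(scores, category_of, max_tools=4):
    """Round-robin over categories so no single category fills every slot."""
    by_category = {}
    # Visit tools from highest to lowest score, grouping them by category.
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        if score > 0:
            by_category.setdefault(category_of[name], []).append(name)
    # zip_longest yields the best tool of each category, then the second best, ...
    interleaved = [
        name
        for tier in zip_longest(*by_category.values())
        for name in tier
        if name is not None
    ]
    return interleaved[:max_tools]


# Hypothetical relevance scores for a query that mixes weather, finance, and email.
scores = {
    "get_weather": 3,
    "get_forecast": 2,
    "get_stock_price": 1,
    "get_exchange_rate": 1,
    "send_email": 1,
}
category_of = {
    "get_weather": "weather",
    "get_forecast": "weather",
    "get_stock_price": "finance",
    "get_exchange_rate": "finance",
    "send_email": "communication",
}

print(pick_across_categories(scores, category_of))
```

With these scores the result covers all three categories before a second weather tool is added, whereas a plain top-4 cut on scores alone would give both weather tools priority and, depending on how ties are broken, could leave a category out.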

Starter Code

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

llm = ChatOpenAI(model="gpt-4o-mini")

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Web results for: {query}"

@tool
def search_news(query: str) -> str:
    """Search recent news articles."""
    return f"News results for: {query}"

@tool
def get_stock_price(ticker: str) -> str:
    """Get current stock price for a ticker symbol."""
    return f"{ticker}: $150.25"

@tool
def get_exchange_rate(from_currency: str, to_currency: str) -> str:
    """Get exchange rate between two currencies."""
    return f"1 {from_currency} = 0.85 {to_currency}"

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"{city}: 72°F, Sunny"

@tool
def get_forecast(city: str, days: int) -> str:
    """Get weather forecast for upcoming days."""
    return f"{days}-day forecast for {city}: Sunny, Cloudy, Rain"

@tool
def calculate_mortgage(principal: float, rate: float, years: int) -> str:
    """Calculate monthly mortgage payment."""
    # Stub: returns a fixed value and ignores its inputs.
    return "Monthly payment: $1,200"

@tool
def calculate_tip(amount: float, percent: float) -> str:
    """Calculate tip amount."""
    return f"Tip: ${amount * percent / 100:.2f}"

@tool
def translate_text(text: str, target_lang: str) -> str:
    """Translate text to target language."""
    return f"Translated to {target_lang}: [translated text]"

@tool
def summarize_text(text: str) -> str:
    """Summarize a long piece of text."""
    return "Summary: [condensed version]"

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email."""
    return f"Email sent to {to}"

@tool
def create_calendar_event(title: str, date: str) -> str:
    """Create a calendar event."""
    return f"Event '{title}' created for {date}"

@tool
def set_reminder(message: str, time: str) -> str:
    """Set a reminder."""
    return f"Reminder set: {message} at {time}"

@tool
def get_definition(word: str) -> str:
    """Look up the definition of a word."""
    return f"Definition of {word}: [definition]"

@tool
def convert_units(value: float, from_unit: str, to_unit: str) -> str:
    """Convert between units of measurement."""
    # Stub: the fixed 2.54 factor is only correct for inches to centimeters.
    return f"{value} {from_unit} = {value * 2.54} {to_unit}"

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with many tools available."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# BUG: All 15 tools are passed to every query — the LLM gets confused with too many options
# TODO: Dynamically select only the 3-4 most relevant tools per query
all_tools = [search_web, search_news, get_stock_price, get_exchange_rate, get_weather, get_forecast, calculate_mortgage, calculate_tip, translate_text, summarize_text, send_email, create_calendar_event, set_reminder, get_definition, convert_units]
agent = create_tool_calling_agent(llm, all_tools, prompt)
executor = AgentExecutor(agent=agent, tools=all_tools)

result = executor.invoke({"input": "What's the weather in Paris?"})
print(result["output"])
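One way to remove the bug flagged in the starter code is to build the executor per query from a filtered tool list instead of once from all_tools. The sketch below shows only that control flow, with stand-ins so it runs without LangChain or an API key: `run_query`, `FakeExecutor`, and the string "tools" are hypothetical. In the starter code, `registry` could be `{t.name: t for t in all_tools}`, and `make_executor` could wrap the two existing calls, `create_tool_calling_agent(llm, tools, prompt)` followed by `AgentExecutor(agent=agent, tools=tools)`.

```python
def run_query(query, registry, select_names, make_executor):
    """Build a fresh executor per query that only sees the selected tools."""
    names = select_names(query)
    # Look tools up by name; the full registry is kept but never passed whole.
    tools = [registry[name] for name in names if name in registry]
    executor = make_executor(tools)
    return executor.invoke({"input": query})


class FakeExecutor:
    """Stand-in for an agent executor; it only reports which tools it received."""

    def __init__(self, tools):
        self.tools = tools

    def invoke(self, inputs):
        return {"output": "tools exposed: " + ", ".join(self.tools)}


# Stand-in registry: strings take the place of real tool objects.
registry = {
    "get_weather": "weather-tool",
    "get_forecast": "forecast-tool",
    "send_email": "email-tool",
}

result = run_query(
    "What's the weather in Paris?",
    registry,
    lambda query: ["get_weather", "get_forecast"],  # placeholder selector
    FakeExecutor,
)
print(result["output"])
```

Rebuilding the agent for each query costs a little setup time, but it guarantees the LLM never sees schemas for tools that were not selected.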