Agent Foundry

#26. Tool Rate Limiter

Hard | Tool Calling | Cost Optimization

The Problem

Your financial analyst agent uses an analyze_company tool that calls an expensive third-party API ($0.05 per call). When a user asks to compare 10 companies, the agent happily makes 10+ API calls — sometimes even more with retries. In production, a single query has triggered 50+ calls, costing over $2. Your job is to add rate limiting so the tool is called at most 5 times per agent run. After hitting the limit, the agent should work with the data it already has.
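
To make the cost figures concrete, here is a minimal sketch of the arithmetic, assuming the $0.05 price applies to every call, retries included. The name `run_cost` is hypothetical and only illustrates the numbers quoted above.

```python
from decimal import Decimal
from typing import Optional

COST_PER_CALL = Decimal("0.05")  # per-call price stated in the problem


def run_cost(calls_requested: int, max_calls: Optional[int] = None) -> Decimal:
    """Cost of one agent run; max_calls=None means no cap (the current behavior)."""
    billed = calls_requested if max_calls is None else min(calls_requested, max_calls)
    return billed * COST_PER_CALL


print(run_cost(10))               # 0.50  (ten companies, no cap)
print(run_cost(50))               # 2.50  (the 50-call production incident)
print(run_cost(50, max_calls=5))  # 0.25  (same request with a cap of 5)
```

With a cap of 5, the worst case per run is $0.25 no matter how many companies the user asks about.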

Examples

Example 1

User input: Compare the top 10 tech companies: Apple, Google, Microsoft, Amazon, Meta, Tesla, Nvidia, Netflix, Adobe, and Salesforce

Current (bad) output: The agent calls analyze_company 10 times (once per company), costing $0.50. If some fail and it retries, it could hit 15+ calls.

Expected (good) output: The agent calls analyze_company for 5 companies, then informs the user: "I've analyzed 5 out of 10 companies due to API rate limits. Here's the comparison for Apple, Google, Microsoft, Amazon, and Meta. I can analyze the remaining 5 in a follow-up request." The cost is capped at $0.25.

Example 2

User input: Analyze every company in the S&P 500

Current (bad) output: The agent attempts hundreds of API calls, burning through the budget rapidly.

Expected (good) output: The agent makes 5 calls, presents those results, and explains: "I can only analyze 5 companies per request to manage API costs. Here are the first 5. Let me know which others you'd like me to look into next."

Your Task

  • Add rate limiting so the analyze_company tool is called at most 5 times per agent run.
  • When the limit is reached, the tool should return a clear rate-limit message instead of making the API call.
  • The agent should still provide useful output based on the data it gathered within the limit.
  • The rate limit counter must reset between separate agent invocations.
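
One possible shape for these requirements, shown in plain Python so it runs without LangChain or an API key: a factory that wraps the paid function in a closure with its own counter. This is a sketch under assumptions, not the reference solution. `make_limited_tool` and `fake_api` are hypothetical names, and the exact wording of the limit message is up to you.

```python
from typing import Callable

MAX_CALLS_PER_RUN = 5  # limit from the task description


def make_limited_tool(api_fn: Callable[[str], str],
                      max_calls: int = MAX_CALLS_PER_RUN) -> Callable[[str], str]:
    """Wrap api_fn with a private call counter.

    Each call to make_limited_tool creates a new counter, which is one way
    to reset the limit between agent runs.
    """
    calls_made = 0

    def limited(company_name: str) -> str:
        nonlocal calls_made
        if calls_made >= max_calls:
            # Limit reached: skip the paid API and tell the agent why.
            return (f"RATE LIMIT REACHED: {max_calls} of {max_calls} calls used. "
                    f"'{company_name}' was not analyzed. "
                    "Answer with the results you already have.")
        calls_made += 1  # counted before the call, so failed attempts use budget too
        return api_fn(company_name)

    return limited


# Hypothetical stand-in for the paid API, used only to demonstrate the wrapper.
api_log = []


def fake_api(company_name: str) -> str:
    api_log.append(company_name)
    return f"Analysis for {company_name}: Rating: Buy"


companies = ["Apple", "Google", "Microsoft", "Amazon", "Meta", "Tesla", "Nvidia"]

run_1 = make_limited_tool(fake_api)
results_1 = [run_1(name) for name in companies]
print(len(api_log))   # 5 real calls; Tesla and Nvidia received the limit message

run_2 = make_limited_tool(fake_api)  # new run, fresh counter
print(run_2("Tesla"))                # reaches the API again
```

In the starter code, a similar effect could probably be had by building the limited function, wrapping it as a tool, and creating the executor inside a per-request helper, so that every invocation starts from zero. Treat that wiring as an assumption to verify against your LangChain version.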

Evaluation

Submissions are checked for the following:

  • API calls capped at 5: The expensive API tool is never called more than 5 times per run.
  • Limit is handled gracefully: The agent informs the user when the limit is reached.
  • Partial results are still useful: The agent provides analysis for the companies it managed to look up.
  • Counter resets between runs: Each new agent invocation starts with a fresh call counter.

Constraints

  • The expensive API tool must not be called more than 5 times per agent run
  • When the limit is reached, the agent must use cached or already-fetched results
  • The agent must inform the user when it has hit the rate limit
  • The rate limit counter must reset between separate agent runs
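
The "cached or already-fetched results" constraint suggests pairing the counter with a small per-run cache, so a repeated request for the same company costs nothing. The sketch below is one way to do that in plain Python. `RunBudget` and `fake_api` are hypothetical names, and the case-insensitive cache key is an assumption about how company names should be matched.

```python
from typing import Callable, Dict, List


class RunBudget:
    """Per-run call budget plus a cache of results already fetched."""

    def __init__(self, api_fn: Callable[[str], str], max_calls: int = 5) -> None:
        self.api_fn = api_fn
        self.max_calls = max_calls
        self.calls_made = 0
        self.cache: Dict[str, str] = {}   # normalized name -> result
        self.analyzed: List[str] = []     # original names, in call order

    def analyze(self, company_name: str) -> str:
        key = company_name.strip().lower()
        if key in self.cache:
            # Already fetched in this run: reuse it without spending budget.
            return self.cache[key]
        if self.calls_made >= self.max_calls:
            done = ", ".join(self.analyzed) or "none"
            return (f"RATE LIMIT REACHED ({self.max_calls} calls). "
                    f"Already analyzed: {done}. "
                    "Use those results and tell the user about the limit.")
        self.calls_made += 1
        result = self.api_fn(company_name)
        self.cache[key] = result
        self.analyzed.append(company_name)
        return result


# Hypothetical stand-in for the paid API.
paid_calls = []


def fake_api(company_name: str) -> str:
    paid_calls.append(company_name)
    return f"Analysis for {company_name}: Rating: Buy"


budget = RunBudget(fake_api)  # create one per agent run so the counter resets
for name in ["Apple", "Google", "Apple", "Microsoft", "Amazon", "Meta", "Tesla"]:
    reply = budget.analyze(name)

print(len(paid_calls))  # 5: the repeated "Apple" was served from the cache
print(reply)            # limit message for Tesla, listing the analyzed companies
```

Listing the already-analyzed companies in the limit message gives the model what it needs to write the partial comparison and to tell the user what was skipped.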

Starter Code

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

llm = ChatOpenAI(model="gpt-4o-mini")

# BUG: No rate limiting — the agent calls this expensive API as many times as it wants
# In production this has resulted in 50+ calls per run, costing $2+ per query
# TODO: Add rate limiting to cap at 5 calls per run

@tool
def analyze_company(company_name: str) -> str:
    """Call expensive financial analysis API for a company. Costs $0.05 per call."""
    return f"Analysis for {company_name}: Revenue $10B, Growth 15%, P/E Ratio 25, Rating: Buy, Market Cap $150B, Sector: Technology"

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a financial analyst assistant. Use the analyze_company tool to research companies. Analyze each company the user asks about."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, [analyze_company], prompt)
executor = AgentExecutor(agent=agent, tools=[analyze_company])

result = executor.invoke({"input": "Compare the top 10 tech companies: Apple, Google, Microsoft, Amazon, Meta, Tesla, Nvidia, Netflix, Adobe, and Salesforce"})
print(result["output"])