# Custom Middleware
Middleware in LangChain agents intercepts the processing pipeline at defined hook points, letting you modify inputs, outputs, and behavior without changing agent or model code. You can log requests, enforce policies, short-circuit execution, and stack multiple middleware layers for defense in depth.
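Before diving into the LangChain hooks, the interception-and-stacking idea can be sketched framework-free. Everything in this sketch (`Middleware`, `run_pipeline`, `Redactor`, `Auditor`) is an illustrative stand-in, not LangChain API; it only shows the control flow a middleware stack implements.

```python
# Framework-agnostic sketch of the middleware pattern. All names here are
# illustrative assumptions, not part of the LangChain API.

class Middleware:
    def before_agent(self, state):   # may modify state before the agent runs
        return state

    def after_agent(self, state):    # may modify state after the agent runs
        return state

def run_pipeline(middlewares, agent_fn, state):
    # before_agent hooks run in list order; after_agent hooks unwind in
    # reverse, so the first middleware in the list sees the final output.
    for mw in middlewares:
        state = mw.before_agent(state)
    state = agent_fn(state)
    for mw in reversed(middlewares):
        state = mw.after_agent(state)
    return state

class Redactor(Middleware):
    def before_agent(self, state):
        # Policy enforcement: scrub sensitive input before the agent sees it.
        state["input"] = state["input"].replace("secret", "[redacted]")
        return state

class Auditor(Middleware):
    def __init__(self):
        self.seen = []

    def after_agent(self, state):
        # Auditing: record every final output.
        self.seen.append(state["output"])
        return state

auditor = Auditor()
result = run_pipeline(
    [Redactor(), auditor],
    lambda s: {**s, "output": f"echo: {s['input']}"},  # stand-in "agent"
    {"input": "my secret plan"},
)
print(result["output"])  # echo: my [redacted] plan
```

Stacking a redaction layer under an audit layer is the "defense in depth" idea: each concern lives in its own small class, and neither the agent nor the model code changes.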
## AgentMiddleware Class
All custom middleware extends `AgentMiddleware`. Override its hook methods to inject logic at specific points in the agent lifecycle:
```python
from langgraph.prebuilt.middleware import AgentMiddleware

class LoggingMiddleware(AgentMiddleware):
    def before_agent(self, state):
        print(f"[INPUT] {state['messages'][-1].content}")
        return state

    def after_agent(self, state):
        print(f"[OUTPUT] {state['messages'][-1].content}")
        return state
```

## Hook Points
AgentMiddleware provides four hook points that fire at different stages of processing:
| Hook | When It Fires | Common Uses |
|---|---|---|
| `before_agent` | Before the agent processes the message | Input validation, logging, content filtering |
| `after_agent` | After the agent produces a response | Output filtering, response formatting, auditing |
| `before_model` | Before the LLM call | Injecting prompt context, token counting, rate limiting |
| `after_model` | After the LLM returns | Response caching, cost tracking, quality checks |
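As a concrete illustration of the token-counting and rate-limiting row, here is a framework-free sketch of a `before_model`-style hook that enforces a token budget. `TokenBudgetMiddleware` and the whitespace-based token estimate are assumptions for demonstration, not LangChain APIs:

```python
# Illustrative sketch of a before_model hook enforcing a token budget.
# The class name and the whitespace word count are demonstration
# assumptions, not part of the LangChain API.

class TokenBudgetMiddleware:
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    def before_model(self, state):
        # Rough token estimate: whitespace-separated words in all messages.
        prompt = " ".join(m["content"] for m in state["messages"])
        cost = len(prompt.split())
        if self.used + cost > self.max_tokens:
            # Refuse the call before any tokens are spent.
            raise RuntimeError("token budget exceeded; skipping LLM call")
        self.used += cost
        return state

mw = TokenBudgetMiddleware(max_tokens=10)
state = {"messages": [{"content": "What is the capital of France?"}]}
mw.before_model(state)
print(mw.used)  # 6
```

A real implementation would use the model's tokenizer rather than a word count, but the control flow is the same: inspect the pending prompt, then either let the call through or stop it.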
```python
class FullLifecycleMiddleware(AgentMiddleware):
    def before_agent(self, state):
        print("1. Before agent processes input")
        return state

    def before_model(self, state):
        print("2. Before LLM call")
        return state

    def after_model(self, state):
        print("3. After LLM returns")
        return state

    def after_agent(self, state):
        print("4. After agent produces response")
        return state
```

## @hook_config Decorator
Use `@hook_config` to configure hook behavior — for example, specifying which message types a hook should process:
```python
from langgraph.prebuilt.middleware import AgentMiddleware, hook_config

class SelectiveMiddleware(AgentMiddleware):
    @hook_config(message_types=["human"])
    def before_agent(self, state):
        print("Only runs for human messages")
        return state

    @hook_config(message_types=["ai"])
    def after_agent(self, state):
        print("Only runs for AI responses")
        return state
```

## can_jump_to for Short-Circuiting
The `can_jump_to` method lets middleware short-circuit the agent pipeline. If a condition is met, the middleware can skip agent processing entirely and return a response directly:
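The control flow behind short-circuiting can be sketched without the framework. The `Jump` class and `dispatch` function below are illustrative stand-ins, not the actual `can_jump_to` API; they just show how a dispatcher can honor a middleware's signal to skip the agent:

```python
# Framework-free sketch of short-circuiting. Jump, dispatch, and
# SimpleCacheMiddleware are demonstration assumptions, not LangChain APIs.

class Jump:
    """Signal: return `response` without running the agent."""
    def __init__(self, response):
        self.response = response

class SimpleCacheMiddleware:
    def __init__(self):
        self.cache = {}

    def before_agent(self, state):
        msg = state["message"]
        if msg in self.cache:
            return Jump(self.cache[msg])  # cache hit: short-circuit
        return state

    def after_agent(self, state, response):
        self.cache[state["message"]] = response  # remember for next time

def dispatch(mw, agent_fn, state):
    out = mw.before_agent(state)
    if isinstance(out, Jump):
        return out.response          # agent never runs
    response = agent_fn(out)
    mw.after_agent(out, response)
    return response

mw = SimpleCacheMiddleware()
calls = []
def agent_fn(state):
    calls.append(state["message"])   # track how often the agent actually runs
    return state["message"].upper()

print(dispatch(mw, agent_fn, {"message": "hi"}))  # HI  (agent ran)
print(dispatch(mw, agent_fn, {"message": "hi"}))  # HI  (served from cache)
print(len(calls))  # 1
```

The second call never reaches `agent_fn`, which is exactly the behavior the cache middleware below relies on.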
```python
class CacheMiddleware(AgentMiddleware):
    def __init__(self):
        self.cache = {}

    def before_agent(self, state):
        user_msg = state["messages"][-1].content
        if user_msg in self.cache:
            return self.can_jump_to(
                state,
                response=self.cache[user_msg],
            )
        return state

    def after_agent(self, state):
        user_msg = state["messages"][-2].content
        response = state["messages"][-1].content
        self.cache[user_msg] = response
        return state
```

## Logging Middleware Example
A complete logging middleware that records request/response pairs with timestamps:
```python
import time

from langgraph.prebuilt.middleware import AgentMiddleware

class TimedLoggingMiddleware(AgentMiddleware):
    def __init__(self):
        self.logs = []

    def before_agent(self, state):
        state["_start_time"] = time.time()
        self.logs.append({
            "type": "input",
            "content": state["messages"][-1].content,
            "timestamp": time.time(),
        })
        return state

    def after_agent(self, state):
        elapsed = time.time() - state.get("_start_time", time.time())
        self.logs.append({
            "type": "output",
            "content": state["messages"][-1].content,
            "timestamp": time.time(),
            "elapsed_seconds": round(elapsed, 2),
        })
        return state
```

## Stacking Multiple Middleware
Pass a list of middleware to `create_react_agent`. They execute in order — each one processes the output of the previous:
```python
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent

model = init_chat_model("gpt-4o-mini", model_provider="openai")
logger = TimedLoggingMiddleware()

class UppercaseMiddleware(AgentMiddleware):
    def after_agent(self, state):
        state["messages"][-1].content = state["messages"][-1].content.upper()
        return state

agent = create_react_agent(
    model=model,
    tools=[],
    prompt="You are a helpful assistant.",
    middleware=[logger, UppercaseMiddleware()],
)
```

The logging middleware records the request, the agent processes it, then the uppercase middleware transforms the output, and finally the logger records the transformed response.
## Using Middleware with the Agent
```python
from langchain_core.messages import HumanMessage

result = agent.invoke({
    "messages": [HumanMessage(content="What is the capital of France?")]
})
print(result["messages"][-1].content)

print("\nLogs:")
for log in logger.logs:
    print(f"  [{log['type']}] {log['content'][:50]}...")
```

## Key Takeaways
- `AgentMiddleware` provides four hook points: `before_agent`, `after_agent`, `before_model`, and `after_model`
- `@hook_config` configures which message types a hook processes
- `can_jump_to` short-circuits the pipeline to return cached or pre-computed responses
- Middleware stacks execute in order; each middleware processes the previous one's output
- Use `before_agent` for input validation and filtering, and `after_agent` for output transformation
- `before_model`/`after_model` hooks target the LLM call specifically, for injecting prompt context or tracking cost