A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
TL;DR Highlight
Presented by its authors as the first survey to systematically review and classify how LLM agents are transforming recommender systems and search engines.
Who Should Read
ML engineers and backend developers looking to integrate LLMs into recommendation systems or search engines, and anyone who wants to design or study AI agent-based IR system architectures.
Core Mechanics
- LLM Agent roles in recommendation/search are broadly categorized into 4 types: User Interaction (conversational interface), Representation Optimization (improving user/item representations), System Integration (acting as the brain of the recommendation system), and Environment Simulation (user simulator)
- In search systems, 5 roles are identified: Task Decomposer (breaking down complex search tasks), Query Rewriter (improving queries), Action Executor (tool calling), Results Synthesizer (summarizing results), and User Simulator (simulating user behavior)
- LLM Agents can function as user simulators to test recommendation/search algorithms without real users — enabling evaluation without A/B testing costs or actual UX degradation
- Chain-of-Thought (CoT) reasoning and long context windows let agents decompose complex multi-step search tasks, such as travel planning, into subtasks that can be handled one at a time
- Embodied Agents (agents that interact in real time with physical or cyber environments) are emerging as the next-generation paradigm for recommendation/search — enabling GUI manipulation, app navigation, and on-device personalization
- The survey outlines key unresolved challenges (hallucination, bias, deployment costs, personalization, and multimodal processing) along with future research directions
Evidence
- In CompWoB benchmark experiments, agents built on GPT-3.5-turbo and GPT-4 achieved an average success rate of 94.0% on simple web tasks, but dropped sharply to 24.9% on compositional tasks
- Agent4Rec generates 1,000 LLM agents initialized with the MovieLens-1M dataset to evaluate recommendation algorithms, using agent feedback as iterative training data
- USimAgent (a search user behavior simulator) outperforms existing methods on query generation and achieves comparable performance on click/stop behavior prediction
- The iEvaLM framework achieves performance improvements over existing evaluation methodologies on 2 public CRS (Conversational Recommendation System) datasets and adds an interpretability evaluation dimension for recommendations
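To put the CompWoB figures from the first bullet in relative terms, the gap can be worked out directly from the two reported success rates (plain arithmetic, not an additional result from the paper):

```latex
\Delta_{\text{abs}} = 94.0\% - 24.9\% = 69.1 \text{ percentage points},
\qquad
\Delta_{\text{rel}} = \frac{94.0 - 24.9}{94.0} \approx 73.5\%
```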
How to Apply
- When adding an LLM Agent as a 'System Integration' layer to an existing recommendation system, follow patterns like RecMind or InteRecAgent — keep the existing ID-based recommendation model as a tool and let the LLM handle only natural language understanding and planning, enabling adoption without a full system redesign
- To quickly evaluate recommendation/search algorithms without A/B testing, build an LLM-based user simulator with user profile, memory, and behavior modules (like Agent4Rec) and plug it into your algorithm validation pipeline
- When handling complex search queries (e.g., planning a 3-night, 4-day trip to Europe), apply the Task Decomposer pattern: design an agent that automatically splits the task into subtasks (destination selection → itinerary → flights/hotels → budget calculation) and calls the appropriate external API for each
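The Task Decomposer pattern from the last bullet can be sketched roughly as follows. This is a minimal illustration, not the survey's implementation: `decompose` returns a fixed plan where a real agent would call an LLM, and the `TOOLS` registry, its lambda bodies, and the example destination are hypothetical stand-ins for external APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    name: str   # which tool should handle this step
    query: str  # natural-language instruction for that tool


def decompose(request: str) -> List[Subtask]:
    """Stand-in for the LLM planner: returns a fixed four-step travel plan.

    A real agent would ask an LLM to emit this list for the given request.
    """
    steps = ["destination", "itinerary", "booking", "budget"]
    return [Subtask(name=s, query=f"{s} for: {request}") for s in steps]


# Hypothetical tool registry; each entry would wrap an external API.
# Every tool receives its query plus the results of the earlier steps.
TOOLS: Dict[str, Callable[[str, List[str]], str]] = {
    "destination": lambda q, prior: "Lisbon",
    "itinerary": lambda q, prior: f"4-day plan in {prior[-1]}",
    "booking": lambda q, prior: "flight + hotel options",
    "budget": lambda q, prior: f"estimate covering {len(prior)} earlier steps",
}


def run_plan(request: str) -> List[str]:
    """Execute subtasks in order, passing earlier results to later steps."""
    results: List[str] = []
    for task in decompose(request):
        tool = TOOLS[task.name]
        results.append(tool(task.query, results))
    return results
```

Passing the accumulated results to each tool is one simple way to let later subtasks (itinerary, budget) depend on earlier ones (destination); a production agent would more likely pass structured state than a list of strings.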
Code Example
# InteRecAgent style: example pattern using an LLM as the interface to a recommendation system
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Register the traditional recommendation model as a tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_recommendations",
            "description": "Extract candidate items using an ID-based collaborative filtering model",
            "parameters": {
                "type": "object",
                "properties": {
                    "user_id": {"type": "string"},
                    "top_k": {"type": "integer", "default": 10}
                },
                "required": ["user_id"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "rerank_by_context",
            "description": "Re-rank candidate items reflecting the current conversation context",
            "parameters": {
                "type": "object",
                "properties": {
                    # Array schemas need an element type; string item IDs are assumed here
                    "items": {"type": "array", "items": {"type": "string"}},
                    "user_context": {"type": "string"}
                },
                "required": ["items", "user_context"]
            }
        }
    }
]

def recommendation_agent(user_id: str, user_message: str, history: list):
    """LLM Agent calls recommendation tools to generate personalized recommendations"""
    # Include the user ID in the prompt so the model can pass it to get_recommendations
    system_prompt = f"""You are a personalized recommendation assistant.
Analyze the user's request and call the appropriate recommendation tools to provide optimal recommendations.
If needed, call multiple tools sequentially to refine the results.
The current user's ID is {user_id}."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages,
        tools=tools,
        tool_choice="auto"
    )
    # The response may contain tool calls that the caller still has to execute
    return response

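The function above stops at the first model response, which may only contain tool-call requests. A minimal sketch of how those requests might be executed is shown here. The two tool bodies are placeholders (assumptions, not a real recommender), and `REGISTRY` and `run_tool_calls` are hypothetical names; the sketch relies only on the `id`, `function.name`, and `function.arguments` fields of each tool call in a Chat Completions response.

```python
import json
from typing import Any, Callable, Dict, List


def get_recommendations(user_id: str, top_k: int = 10) -> List[str]:
    """Placeholder for the ID-based model; returns dummy item IDs."""
    return [f"item_{i}" for i in range(top_k)]


def rerank_by_context(items: List[str], user_context: str) -> List[str]:
    """Placeholder re-ranker; reverses the list so the change is visible."""
    return list(reversed(items))


# Maps the tool names declared in the schema to local Python callables.
REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_recommendations": get_recommendations,
    "rerank_by_context": rerank_by_context,
}


def run_tool_calls(tool_calls) -> List[dict]:
    """Execute each requested tool and build the matching 'tool' role messages."""
    tool_messages = []
    for call in tool_calls or []:
        fn = REGISTRY.get(call.function.name)
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = fn(**args) if fn else {"error": f"unknown tool {call.function.name}"}
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return tool_messages
```

To continue the conversation, one would append `response.choices[0].message` and then these tool messages to `messages` and call `client.chat.completions.create` again, repeating until the model answers without requesting further tools.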
# User simulator pattern (Agent4Rec style)
user_simulator_prompt = """
You are simulating a user with the following profile:
- Age: 28
- Preferred genres: Sci-Fi, Thriller
- Recently watched: [Inception, Interstellar, Tenet]
- Tendencies: Prefers complex storylines, dislikes Romance
The recommendation system has recommended the following movies: {recommended_items}
Respond as a real user would:
1. Which items you would click and why
2. Star rating (1-5)
3. Additional feedback
"""
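The prompt above covers a single simulated response. A rough sketch of wrapping it in profile, memory, and behavior modules (the three parts mentioned under How to Apply) might look like this. It is an illustration under stated assumptions, not Agent4Rec's actual code: `SimulatedUser`, `click_rate`, and `fake_llm` are hypothetical names, and `fake_llm` is a deterministic stand-in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SimulatedUser:
    profile: dict                                    # profile module
    memory: List[str] = field(default_factory=list)  # memory module

    def build_prompt(self, recommended: List[str]) -> str:
        return (
            f"Profile: {self.profile}\n"
            f"Previously seen: {self.memory}\n"
            f"Recommended now: {recommended}\n"
            "Reply with the titles you would click, comma-separated."
        )

    def react(self, recommended: List[str], llm: Callable[[str], str]) -> List[str]:
        """Behavior module: ask the LLM, keep only valid titles, update memory."""
        reply = llm(self.build_prompt(recommended))
        clicked = [t.strip() for t in reply.split(",") if t.strip() in recommended]
        self.memory.extend(recommended)
        return clicked


def click_rate(user: SimulatedUser, slates: List[List[str]],
               llm: Callable[[str], str]) -> float:
    """Fraction of recommended items the simulated user clicked."""
    shown = sum(len(s) for s in slates)
    clicked = sum(len(user.react(s, llm)) for s in slates)
    return clicked / shown if shown else 0.0


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model call (assumption for the demo)."""
    return "Arrival, Dune"
```

Filtering the reply against the recommended slate guards against the model naming items that were never shown, which matters when simulator output is fed back into an evaluation pipeline.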
Original Abstract
Information technology has profoundly altered the way humans interact with information. The vast amount of content created, shared, and disseminated online has made it increasingly difficult to access relevant information. Over the past two decades, recommender systems and search (collectively referred to as information retrieval systems) have evolved significantly to address these challenges. Recent advances in large language models (LLMs) have demonstrated capabilities that surpass human performance in various language-related tasks and exhibit general understanding, reasoning, and decision-making abilities. This paper explores the transformative potential of LLM agents in enhancing recommender and search systems. We discuss the motivations and roles of LLM agents, and establish a classification framework to elaborate on the existing research. We highlight the immense potential of LLM agents in addressing current challenges in recommendation and search, providing insights into future research directions. This paper is the first to systematically review and classify the research on LLM agents in these domains, offering a novel perspective on leveraging this advanced AI technology for information retrieval. To help understand the existing works, we list the existing papers on LLM agent based recommendation and search at this link: https://github.com/tsinghua-fib-lab/LLM-Agent-for-Recommendation-and-Search.