PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time
TL;DR Highlight
A framework that optimizes per-user 'Personas' (system prompts) in real-time to personalize AI agent responses.
Who Should Read
Backend/AI developers adding per-user customized response capabilities to LLM chatbots or agents. Especially useful for those building recommendation, search, or Q&A services that leverage user history.
Core Mechanics
- Auto-generates a unique system prompt (Persona) per user, then updates it in real-time at test time using the gap between expected and actual responses as a 'text gradient'
- Combines Episodic Memory (raw interaction logs) with Semantic Memory (summarized user profiles) for user context management
- Persona serves as a bridge between memory and action selection, enabling coherent personalization across different tasks
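The episodic-to-semantic distillation in the second point can be sketched as below. This is a minimal illustration, not the paper's implementation: `summarize_to_semantic_memory` is a hypothetical name, and `llm` is assumed to be any callable that maps a prompt string to a completion string (e.g. a wrapped chat model).

```python
def summarize_to_semantic_memory(llm, episodic_logs, max_logs=20):
    """Condense raw (query, response) logs into a short user profile.

    episodic_logs: list of (query, response) tuples, oldest first.
    llm: any prompt -> completion callable (assumption, not a fixed API).
    """
    # Only the most recent interactions feed the summary
    recent = episodic_logs[-max_logs:]
    joined = "\n".join(f"- Q: {q} | A: {r}" for q, r in recent)
    prompt = (
        "Summarize this user's stable preferences from the interactions below.\n"
        f"{joined}\n"
        "Profile summary:"
    )
    return llm(prompt)
```

The resulting summary string can then seed `init_persona` in the code example below, closing the loop between memory and the persona prompt.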
Evidence
- LaMP-1 (paper citation prediction) Accuracy: PersonaAgent 0.919 vs 2nd place MemBank 0.862 (~5.7pp improvement)
- LaMP-3 (product rating prediction) MAE: PersonaAgent 0.241 vs 2nd place ICL 0.277 (~13% improvement); RMSE: 0.509 vs 0.543
- Ablation shows removing the Action module causes the biggest performance drop
How to Apply
- For existing RAG chatbots: maintain a separate system prompt file per user, and add a loop that feeds the expected-vs-actual response gap from the last N interactions (paper uses 3) to an LLM to update the system prompt.
- If you have a user profile DB: use existing profiles as Semantic Memory initialization, and add Episodic Memory from session interactions for finer-grained personalization.
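The per-user system prompt file pattern from the first point might look like the sketch below. The file layout, `refresh_user_persona` name, and JSON schema are illustrative assumptions; `update_fn` stands in for an LLM-backed updater such as the `update_persona` function in the code example.

```python
import json
from pathlib import Path

def refresh_user_persona(persona_dir, user_id, recent_interactions,
                         update_fn, n_last=3):
    """Load a user's persona file, update it from the last N
    (query, expected, actual) gaps, and write it back."""
    path = Path(persona_dir) / f"{user_id}.json"
    if path.exists():
        persona = json.loads(path.read_text())["persona"]
    else:
        persona = "You are a helpful personalized assistant."
    # Describe the expected-vs-actual gap for the last N interactions
    gaps = [
        f"Q: {q}\nExpected: {expected}\nActual: {actual}"
        for q, expected, actual in recent_interactions[-n_last:]
    ]
    persona = update_fn(persona, gaps)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"persona": persona}))
    return persona
```

Running this after each session keeps the on-disk persona aligned with the most recent interactions while remaining stateless between requests.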
Code Example
# PersonaAgent core loop implementation sketch (LangChain-based)
from langchain_community.chat_models import ChatAnthropic
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import BedrockEmbeddings
# 1. Initialize Persona (system prompt) per user
def init_persona(user_profile_summary: str) -> str:
    return f"""You are a helpful personalized assistant.
User summary: {user_profile_summary}
STRICT RULES:
1. Think step-by-step about what information you need.
2. MUST use at least TWO tools to answer the question.
3. Provide clear, concise responses."""
# 2. Episodic Memory storage and retrieval
class EpisodicMemory:
    def __init__(self, embeddings):
        self.store = {}  # user_id -> FAISS index
        self.embeddings = embeddings

    def add(self, user_id, query, response, metadata=None):
        # Store each interaction in a per-user FAISS index
        text = f"Query: {query}\nResponse: {response}"
        metadatas = [metadata or {}]
        if user_id in self.store:
            self.store[user_id].add_texts([text], metadatas=metadatas)
        else:
            self.store[user_id] = FAISS.from_texts(
                [text], self.embeddings, metadatas=metadatas)

    def retrieve(self, user_id, query, top_k=4):
        # Return the Top-K most similar past interactions
        if user_id not in self.store:
            return []
        return self.store[user_id].similarity_search(query, k=top_k)
# 3. Core logic for Test-Time Persona optimization
def compute_textual_gradient(llm, question, agent_response, ground_truth):
    """Have the LLM analyze response differences to generate feedback"""
    prompt = f"""You are a meticulous evaluator of personalized AI agent responses.
Analyze the following and give feedback on how to improve the system prompt.
Question: {question}
Expected Answer: {ground_truth}
Agent Response: {agent_response}
Feedback should focus on:
1. How to improve search keywords for this user.
2. User's prior interactions and preferences.
3. Explicit user profile descriptions not specific to this task.
Feedback:"""
    return llm.predict(prompt)
def update_persona(llm, current_persona, feedbacks):
    """Aggregate feedback and update Persona"""
    aggregated = "\n".join(feedbacks)
    prompt = f"""You are a prompt engineering assistant.
Current system prompt: {current_persona}
Provided Feedback: {aggregated}
Generate an updated system prompt that highlights the user's unique preferences.
New system prompt:"""
    return llm.predict(prompt)
# 4. Test-time alignment loop
def test_time_alignment(llm, agent, persona, recent_interactions, n_iter=1):
    for _ in range(n_iter):
        feedbacks = []
        for q, gt in recent_interactions:
            agent_resp = agent.run(q, system_prompt=persona)
            feedback = compute_textual_gradient(llm, q, agent_resp, gt)
            feedbacks.append(feedback)
        persona = update_persona(llm, persona, feedbacks)
    return persona
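The alignment loop can be dry-run end to end without API keys using test doubles. `StubLLM` and `StubAgent` below are stand-ins invented for this sketch (not paper components or LangChain classes), and `align_once` inlines one iteration of the loop so the example is self-contained.

```python
class StubLLM:
    """Stand-in for a chat model: always returns a fixed completion."""
    def predict(self, prompt: str) -> str:
        return "Emphasize the user's preference for terse, citation-backed answers."

class StubAgent:
    """Stand-in for a tool-using agent."""
    def run(self, question, system_prompt):
        return f"[answer to: {question}]"

def align_once(llm, agent, persona, interactions):
    # One iteration: a textual gradient per interaction, then a single
    # persona update aggregated over all feedback.
    feedbacks = []
    for q, gt in interactions:
        resp = agent.run(q, system_prompt=persona)
        feedbacks.append(llm.predict(
            f"Q: {q}\nExpected: {gt}\nGot: {resp}\nFeedback:"))
    return llm.predict(
        f"Current prompt: {persona}\nFeedback:\n" + "\n".join(feedbacks)
        + "\nNew prompt:")

persona = align_once(
    StubLLM(), StubAgent(),
    "You are a helpful personalized assistant.",
    [("Rate this review", "4"), ("Predict the venue", "NeurIPS")])
```

Swapping in a real `ChatAnthropic` instance and a tool-using agent recovers the full loop from the sketch above.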
Original Abstract
Large Language Model (LLM) empowered agents have recently emerged as advanced paradigms that exhibit impressive capabilities in a wide range of domains and tasks. Despite their potential, current LLM agents often adopt a one-size-fits-all approach, lacking the flexibility to respond to users' varying needs and preferences. This limitation motivates us to develop PersonaAgent, the first personalized LLM agent framework designed to address versatile personalization tasks. Specifically, PersonaAgent integrates two complementary components - a personalized memory module that includes episodic and semantic memory mechanisms; a personalized action module that enables the agent to perform tool actions tailored to the user. At the core, the persona (defined as unique system prompt for each user) functions as an intermediary: it leverages insights from personalized memory to control agent actions, while the outcomes of these actions in turn refine the memory. Based on the framework, we propose a test-time user-preference alignment strategy that simulate the latest n interactions to optimize the persona prompt, ensuring real-time user preference alignment through textual loss feedback between simulated and ground-truth responses. Experimental evaluations demonstrate that PersonaAgent significantly outperforms other baseline methods by not only personalizing the action space effectively but also scaling during test-time real-world applications. These results underscore the feasibility and potential of our approach in delivering tailored, dynamic user experiences.