LLMPC: Large Language Model Predictive Control
TL;DR Highlight
A framework that interprets LLMs through MPC (Model Predictive Control) from control theory to dramatically improve planning performance.
Who Should Read
Developers looking to boost LLM agent planning accuracy, especially AI engineers solving constraint-heavy tasks like travel itinerary planning or meeting scheduling.
Core Mechanics
- Shows that, under the MPC formulation, LLMs given planning prompts act as approximate optimizers of an implicit cost function
- Applied MPC (Model Predictive Control) to have LLMs generate multiple candidate plans simultaneously, evaluated by an objective function to pick the best
- Instead of one-shot few-shot prompting, introduced an iterative refinement loop that feeds back why the previous plan failed
- With GPT-4o-mini on travel planning, LLMPC raised the success rate to 44.6%, versus 14.5% for single-round GPT-4o
- For meeting scheduling, raised success from 52.5% to 67% with T=9, K=3 (9 iterations, 3 candidates per iteration)
- In a physics simulation (spring-mass system), increasing the candidate count K from 1 to 15 shrank the cost ratio versus classical MPC from 8.21x to 1.30x
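The selection step described above reduces to picking the minimum-cost plan among K sampled candidates. A minimal sketch of that step (the `toy_cost` function below is illustrative, not from the paper):

```python
def select_best(candidates: list[str], cost) -> str:
    """MPC-style selection: score every sampled plan, keep the cheapest."""
    return min(candidates, key=cost)

# Toy cost for illustration: penalize plans in which city "C" appears late
def toy_cost(plan: str) -> int:
    return plan.index("C")

plans = ["A-B-C", "A-C-B", "C-A-B"]
best = select_best(plans, toy_cost)  # → "C-A-B"
```

In the full framework, the candidates come from one LLM call per iteration and the cost function checks domain constraints, but the selection logic is exactly this minimization.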
Evidence
- Travel planning success: GPT-4o single round 14.5% vs LLMPC T=7 44.6% (~3x improvement)
- Meeting scheduling: GPT-4o single round 52.5% vs LLMPC T=9, K=3 67% (+14.5pp)
- Spring-mass system: cost ratio versus classical MPC of 8.21 at K=1 dropped to 1.30 at K=15 (an 84% reduction in the ratio)
- LLMPC's advantage grows with city count: more complex problems benefit more from higher T and K
How to Apply
- For constraint-heavy planning (scheduling, routing), ask the LLM for K candidates simultaneously instead of one answer, then pick the best with a separate evaluation function.
- Build an evaluation function that automatically extracts which constraints failed in the previous plan, and feed that as feedback_string into the next prompt in an iterative loop.
- For scheduling bots or travel planner apps: add few-shot examples to the system prompt, put current plan + failed constraint list in the instruction prompt, and iterate T times.
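As a concrete instance of the evaluation function described above, a meeting-scheduling checker could return the list of violated constraints. The `working_hours` field and the no-overlap rule below are illustrative assumptions, not constraints from the paper:

```python
def check_schedule(meetings: list[tuple[int, int]], constraints: dict) -> list[str]:
    """Return human-readable descriptions of every violated constraint."""
    violations = []
    lo, hi = constraints["working_hours"]  # e.g. (9, 17)
    # Constraint 1: every meeting stays within working hours
    for start, end in meetings:
        if start < lo or end > hi:
            violations.append(f"meeting {start}-{end} is outside working hours")
    # Constraint 2: no two meetings overlap
    ordered = sorted(meetings)
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if s2 < e1:
            violations.append(f"meetings {s1}-{e1} and {s2}-{e2} overlap")
    return violations

ok = check_schedule([(9, 10), (10, 11)], {"working_hours": (9, 17)})   # → []
bad = check_schedule([(9, 11), (10, 12)], {"working_hours": (9, 17)})  # one overlap
```

The number of violations then serves as the candidate's cost, and the list itself is joined into the feedback string for the next iteration's prompt.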
Code Example
import openai

def evaluate_plan(plan: str, constraints: dict) -> tuple[float, list[str]]:
    """Evaluate how well a plan satisfies the constraints.

    Returns (cost, violations); cost is the number of violated constraints.
    """
    violations = []
    # Implement domain-specific constraint checking logic here
    cost = len(violations)  # Use number of violations as cost
    return cost, violations

def llmpc_plan(
    task: str,
    constraints: dict,
    system_prompt: str,
    max_iterations: int = 7,
    plans_per_iter: int = 3,
    model: str = "gpt-4o",
) -> str:
    client = openai.OpenAI()
    best_plan = ""
    best_cost = float("inf")
    feedback_string = ""
    for step in range(1, max_iterations + 1):
        # Request K candidates using the current state plus feedback
        instruction = f"""
STEP {step}/{max_iterations}
TASK: {task}
Your current best plan is:
{best_plan if best_plan else 'No plan yet.'}
{feedback_string}
Propose {plans_per_iter} different plans separated by '---'.
"""
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": instruction},
            ],
        )
        raw = response.choices[0].message.content
        candidates = raw.split("---")
        # Evaluate each candidate and keep the best (MPC cost-function selection step)
        for candidate in candidates:
            candidate = candidate.strip()
            if not candidate:
                continue
            cost, violations = evaluate_plan(candidate, constraints)
            if cost < best_cost:
                best_cost = cost
                best_plan = candidate
                # Feed the best candidate's unmet constraints into the next prompt
                feedback_string = (
                    "Unmet constraints:\n" + "\n".join(f"- {v}" for v in violations)
                    if violations
                    else ""
                )
        if best_cost == 0:
            print(f"Perfect plan found at iteration {step}!")
            break
    return best_plan

# Usage example
task = "Visit 5 cities in 10 days with given flight constraints..."
constraints = {"days": 10, "cities": ["Paris", "London", "Rome", "Berlin", "Madrid"]}
system_prompt = "You are an expert travel planner. Propose valid trip plans."
result = llmpc_plan(task, constraints, system_prompt, max_iterations=7, plans_per_iter=3)
Original Abstract
Recent advancements in planning prompting techniques for Large Language Models have improved their reasoning, planning, and action abilities. This paper develops a planning framework for Large Language Models using model predictive control that enables them to iteratively solve complex problems with long horizons. We show that in the model predictive control formulation, LLM planners act as approximate cost function optimizers and solve complex problems by breaking them down into smaller iterative steps. With our proposed planning framework, we demonstrate improved performance over few-shot prompting and improved efficiency over Monte Carlo Tree Search on several planning benchmarks.