LLMPC: Large Language Model Predictive Control
TL;DR Highlight
A framework that interprets LLMs through MPC (Model Predictive Control) from control theory to dramatically improve planning performance.
Who Should Read
Developers looking to boost LLM agent planning accuracy, especially AI engineers solving constraint-heavy tasks like travel itinerary planning or meeting scheduling.
Core Mechanics
- Shows that, under the MPC formulation, LLMs given planning prompts act as approximate optimizers of an implicit cost function
- Applied MPC (Model Predictive Control) to have LLMs generate multiple candidate plans simultaneously, evaluated by an objective function to pick the best
- Instead of one-shot few-shot prompting, introduced an iterative refinement loop that feeds back why the previous plan failed
- With GPT-4o-mini on travel planning, LLMPC raised the success rate to 44.6%, versus 14.5% for single-round GPT-4o
- For meeting scheduling, raised success from 52.5% to 67% with T=9, K=3 (9 iterations, 3 candidates per iteration)
- In a physics simulation (spring-mass system), increasing the candidate count K from 1 to 15 shrank the cost ratio versus classical MPC from 8.21x to 1.30x
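The selection step described above reduces to picking the minimum-cost plan among K sampled candidates. A minimal sketch of that step (the `toy_cost` function below is illustrative, not from the paper):

```python
def select_best(candidates: list[str], cost) -> str:
    """MPC-style selection: score every sampled plan, keep the cheapest."""
    return min(candidates, key=cost)

# Toy cost for illustration: penalize plans in which city "C" appears late
def toy_cost(plan: str) -> int:
    return plan.index("C")

plans = ["A-B-C", "A-C-B", "C-A-B"]
best = select_best(plans, toy_cost)  # → "C-A-B"
```

In the full framework, the candidates come from one LLM call per iteration and the cost function checks domain constraints, but the selection logic is exactly this minimization.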
Evidence
- Travel planning success: GPT-4o single round 14.5% vs LLMPC T=7 44.6% (~3x improvement)
- Meeting scheduling: GPT-4o single round 52.5% vs LLMPC T=9, K=3 67% (+14.5pp)
- Spring-mass system: cost ratio versus classical MPC of 8.21 at K=1 dropped to 1.30 at K=15 (an 84% reduction in the ratio)
- LLMPC's advantage grows with city count: more complex problems benefit more from higher T and K
How to Apply
- For constraint-heavy planning (scheduling, routing), ask the LLM for K candidates simultaneously instead of one answer, then pick the best with a separate evaluation function.
- Build an evaluation function that automatically extracts which constraints failed in the previous plan, and feed that as feedback_string into the next prompt in an iterative loop.
- For scheduling bots or travel planner apps: add few-shot examples to the system prompt, put current plan + failed constraint list in the instruction prompt, and iterate T times.
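As a concrete instance of the evaluation function described above, a meeting-scheduling checker could return the list of violated constraints. The `working_hours` field and the no-overlap rule below are illustrative assumptions, not constraints from the paper:

```python
def check_schedule(meetings: list[tuple[int, int]], constraints: dict) -> list[str]:
    """Return human-readable descriptions of every violated constraint."""
    violations = []
    lo, hi = constraints["working_hours"]  # e.g. (9, 17)
    # Constraint 1: every meeting stays within working hours
    for start, end in meetings:
        if start < lo or end > hi:
            violations.append(f"meeting {start}-{end} is outside working hours")
    # Constraint 2: no two meetings overlap
    ordered = sorted(meetings)
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if s2 < e1:
            violations.append(f"meetings {s1}-{e1} and {s2}-{e2} overlap")
    return violations

ok = check_schedule([(9, 10), (10, 11)], {"working_hours": (9, 17)})   # → []
bad = check_schedule([(9, 11), (10, 12)], {"working_hours": (9, 17)})  # one overlap
```

The number of violations then serves as the candidate's cost, and the list itself is joined into the feedback string for the next iteration's prompt.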
Code Example
import openai

def evaluate_plan(plan: str, constraints: dict) -> tuple[float, list[str]]:
    """Evaluate how well a plan satisfies the constraints.

    Returns (cost, violations); cost is the number of violated constraints.
    """
    violations = []
    # Implement domain-specific constraint checking logic here
    cost = len(violations)  # Use number of violations as cost
    return cost, violations

def llmpc_plan(
    task: str,
    constraints: dict,
    system_prompt: str,
    max_iterations: int = 7,
    plans_per_iter: int = 3,
    model: str = "gpt-4o",
) -> str:
    client = openai.OpenAI()
    best_plan = ""
    best_cost = float("inf")
    feedback_string = ""
    for step in range(1, max_iterations + 1):
        # Request K candidates using the current state plus feedback
        instruction = f"""
STEP {step}/{max_iterations}
TASK: {task}
Your current best plan is:
{best_plan if best_plan else 'No plan yet.'}
{feedback_string}
Propose {plans_per_iter} different plans separated by '---'.
"""
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": instruction},
            ],
        )
        raw = response.choices[0].message.content
        candidates = raw.split("---")
        # Evaluate each candidate and keep the best (MPC cost-function selection step)
        for candidate in candidates:
            candidate = candidate.strip()
            if not candidate:
                continue
            cost, violations = evaluate_plan(candidate, constraints)
            if cost < best_cost:
                best_cost = cost
                best_plan = candidate
                # Feed the best candidate's unmet constraints into the next prompt
                feedback_string = (
                    "Unmet constraints:\n" + "\n".join(f"- {v}" for v in violations)
                    if violations
                    else ""
                )
        if best_cost == 0:
            print(f"Perfect plan found at iteration {step}!")
            break
    return best_plan

# Usage example
task = "Visit 5 cities in 10 days with given flight constraints..."
constraints = {"days": 10, "cities": ["Paris", "London", "Rome", "Berlin", "Madrid"]}
system_prompt = "You are an expert travel planner. Propose valid trip plans."
result = llmpc_plan(task, constraints, system_prompt, max_iterations=7, plans_per_iter=3)
Original Abstract
Recent advancements in planning prompting techniques for Large Language Models have improved their reasoning, planning, and action abilities. This paper develops a planning framework for Large Language Models using model predictive control that enables them to iteratively solve complex problems with long horizons. We show that in the model predictive control formulation, LLM planners act as approximate cost function optimizers and solve complex problems by breaking them down into smaller iterative steps. With our proposed planning framework, we demonstrate improved performance over few-shot prompting and improved efficiency over Monte Carlo Tree Search on several planning benchmarks.