CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
TL;DR Highlight
A multi-agent framework where two AI agents role-play and converse to autonomously complete complex tasks without human intervention.
Who Should Read
Backend/AI developers designing LLM-based autonomous agent systems or building multi-agent pipelines. Also useful for ML engineers wanting to auto-generate instruction-following datasets for fine-tuning.
Core Mechanics
- Proposes a role-playing framework in which an AI User (task director) and an AI Assistant (task executor) are each assigned a role; given only an initial idea from a human, the two agents complete the task through conversation alone
- Inception Prompting: only the system prompts are designed before the conversation starts; thereafter the agents prompt each other automatically — role flipping, infinite message loops, and flake (non-committal) replies are mitigated purely through prompt engineering
- The framework automatically generates large-scale instruction-following datasets: AI Society (25,000 conversations), Code, Math (50K), Science (60K) — released on Hugging Face
- CAMEL multi-agent solution wins 76.3% of human evaluations against single gpt-3.5-turbo calls on AI Society tasks
- Fine-tuning LLaMA-7B sequentially on the generated datasets shows knowledge emerging progressively as each domain is added
- Critic-in-the-Loop: optional extension to add an AI or human critic agent to the loop for tree-search style decision-making
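The inception-prompting mechanic above can be sketched as two templated system prompts fixed before the conversation starts. The wording below is an illustrative paraphrase of the paper's template, not the exact Figure 2 text, and `make_inception_prompts` is a hypothetical helper:

```python
# Sketch of inception prompting: both system prompts are designed before the
# conversation begins; the agents then drive each other with no human input.
# The prompt wording paraphrases the paper's template and is not verbatim.

def make_inception_prompts(assistant_role: str, user_role: str, task: str):
    assistant_sys = (
        f"Never forget you are a {assistant_role} and I am a {user_role}. "
        "Never flip roles! I will instruct you to complete the task: "
        f"{task}. Give exactly one specific, actionable solution per "
        "instruction, starting with 'Solution:'."
    )
    user_sys = (
        f"Never forget you are a {user_role} and I am a {assistant_role}. "
        "Never flip roles! You will instruct me step by step to complete "
        f"the task: {task}. Give exactly one instruction at a time. "
        "When the task is done, reply only with <CAMEL_TASK_DONE>."
    )
    return assistant_sys, user_sys

# Example role pairing from the paper's AI Society setting
a_sys, u_sys = make_inception_prompts(
    "Python Programmer", "Stock Trader", "develop a trading bot")
```

Note how the anti-role-flipping instruction and the termination token are baked into the prompts themselves — this is the entirety of the "prompt engineering" fix for the failure modes listed above.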
Evidence
- CAMEL agent solutions win 76.3% of human evaluations (453 evaluators) and 73.0% of GPT-4 evaluations vs a single gpt-3.5-turbo call on AI Society tasks
- Code task GPT-4 evaluation: CAMEL 76.0% win rate vs gpt-3.5-turbo 24.0%
- CAMEL-7B (LLaMA-7B fine-tuned): HumanEval pass@1 14.0%, pass@100 57.9% vs LLaMA-7B (10.5%, 36.5%) and Vicuna-7B (11.0%, 42.9%) — significant improvement
- Cumulative fine-tuning LLaMA-7B on AI Society → Code → Math → Science: final model wins all 20/20 tasks across domains vs individual models
How to Apply
- Define two roles for your project (e.g., 'domain expert' + 'developer'), copy the inception-prompt template from Figure 2 of the paper, and apply it as the system prompts to get an autonomous collaborative agent pair running immediately
- When instruction-following fine-tuning data is scarce, set up role combinations for the desired domain with the CAMEL framework and auto-generate conversations as training data — the paper auto-generated 25,000 conversations using two gpt-3.5-turbo instances
- If agent loops run into infinite conversations or role flipping, add explicit guardrails: a 'Never flip roles!' instruction in the system prompt, a '<CAMEL_TASK_DONE>' termination token, and a hard message limit (the paper uses 40 messages)
Code Example
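A minimal, self-contained sketch of the role-playing loop with the two guardrails mentioned above (the '<CAMEL_TASK_DONE>' termination token and a 40-message cap). The `user_llm`/`assistant_llm` stubs are hypothetical placeholders for real chat-model calls (e.g., two gpt-3.5-turbo instances); this is not the camel-ai library API.

```python
TASK_DONE = "<CAMEL_TASK_DONE>"

def user_llm(history: list[str]) -> str:
    """Stub AI User: issues instructions, then signals completion.
    Replace with a real chat-model call carrying the user system prompt."""
    if len(history) >= 3:
        return TASK_DONE
    return f"Instruction: do step {len(history) + 1}"

def assistant_llm(transcript: list) -> str:
    """Stub AI Assistant: answers each instruction with a solution."""
    return f"Solution: completed ({len(transcript)} messages so far)"

def role_play(max_messages: int = 40) -> list[tuple[str, str]]:
    """Alternate user instructions and assistant solutions until the
    termination token appears or the message cap is hit."""
    transcript: list[tuple[str, str]] = []
    user_hist: list[str] = []
    instruction = user_llm(user_hist)
    while len(transcript) < max_messages:
        transcript.append(("user", instruction))
        if TASK_DONE in instruction:   # termination guardrail
            break
        solution = assistant_llm(transcript)
        transcript.append(("assistant", solution))
        user_hist.append(solution)
        instruction = user_llm(user_hist)
    return transcript

log = role_play()
```

Collecting the returned transcripts across many role/task combinations is exactly how the paper builds its instruction-following datasets; the cap and token check are the loop-level guardrails from "How to Apply".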
Terminology
Related Resources
Original Abstract (Expand)
The rapid advancement of chat-based language models has led to remarkable progress in complex task-solving. However, their success heavily relies on human input to guide the conversation, which can be challenging and time-consuming. This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents, and provides insight into their "cognitive" processes. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing. Our approach involves using inception prompting to guide chat agents toward task completion while maintaining consistency with human intentions. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of a society of agents, providing a valuable resource for investigating conversational language models. In particular, we conduct comprehensive studies on instruction-following cooperation in multi-agent settings. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond: https://github.com/camel-ai/camel.