Multi-Agent Collaboration Mechanisms: A Survey of LLM-Based Multi-Agent Systems

Multi-Agent Collaboration Mechanisms: A Survey of LLMs

Jan 10, 2025 • Khanh-Tung Tran, Dung Dao, Minh-Duong Nguyen +3

TL;DR Highlight

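A comprehensive survey that systematically organizes the methods for making multiple LLMs work together: collaboration types, structures, strategies, and coordination.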

Who Should Read

Backend and AI engineers who are adopting or designing multi-agent frameworks such as AutoGen, CrewAI, or LangGraph. Developers who want to extend an agent from a single LLM call into a collaborative system.

Core Mechanics

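Presents a unified framework that classifies inter-agent collaboration channels by type (cooperation, competition, coopetition), structure (centralized, distributed, hierarchical), and strategy (rule-based, role-based, model-based)
Role-based strategies (MetaGPT, AgentVerse) are strong on specialized subtasks, rule-based strategies fit where predictability is required, and model-based strategies (which use Theory of Mind) suit uncertain, dynamic environments
Competitive setups (debate, the Critic-Explainer pattern) can raise reasoning quality above plain cooperation, but a poorly designed one can lose to a single agent with a strong prompt
MoE (Mixture of Experts) is a representative case of coopetition: the experts compete with one another and the best output is selected
The core risk of multi-agent systems is cascading: one agent's hallucination propagates to other agents and is amplified along the way
Open-source frameworks such as AutoGen, CAMEL, CrewAI, OpenAI Swarm, and Microsoft Magentic-One are accelerating real-world adoption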
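
The three classification axes above can be written down as a small data structure. The following is a minimal sketch, not code from the survey: the class and field names (`CollabType`, `Structure`, `Strategy`, `CollaborationChannel`, `by_strategy`) are assumptions made for illustration, and only the labels that the bullets above actually state are filled in.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class CollabType(Enum):
    COOPERATION = "cooperation"
    COMPETITION = "competition"
    COOPETITION = "coopetition"


class Structure(Enum):
    CENTRALIZED = "centralized"
    DISTRIBUTED = "distributed"
    HIERARCHICAL = "hierarchical"


class Strategy(Enum):
    RULE_BASED = "rule-based"
    ROLE_BASED = "role-based"
    MODEL_BASED = "model-based"


@dataclass(frozen=True)
class CollaborationChannel:
    """One collaboration channel, described along the three axes."""
    name: str
    kind: Optional[CollabType] = None
    structure: Optional[Structure] = None
    strategy: Optional[Strategy] = None


# Axes that the text does not state are left as None rather than guessed.
CHANNELS = [
    CollaborationChannel("MetaGPT", strategy=Strategy.ROLE_BASED),
    CollaborationChannel("AgentVerse", strategy=Strategy.ROLE_BASED),
    CollaborationChannel("debate", kind=CollabType.COMPETITION),
    CollaborationChannel("MoE", kind=CollabType.COOPETITION),
]


def by_strategy(strategy: Strategy) -> List[str]:
    """Return the names of the channels tagged with the given strategy."""
    return [c.name for c in CHANNELS if c.strategy is strategy]


print(by_strategy(Strategy.ROLE_BASED))  # ['MetaGPT', 'AgentVerse']
```

Keeping the axes as separate optional fields reflects that a system can be described on one axis without committing to the others.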

Evidence

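Fine-tuning Mistral-7B on multi-agent synthetic data from Orca-AgentInstruct improved performance by up to 54% across several benchmarks
On the DevAI benchmark (55 AI development tasks), the Agent-as-a-Judge framework agreed with human expert evaluation more closely than the existing LLM-as-a-Judge approach
DyLAN (Dynamic LLM-Agent Network) improves final answer quality by dynamically deactivating low-contribution agents
Multi-agent debate (several rounds of discussion) improves factuality and reasoning over a single model (based on the experiments of Du et al., 2023)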
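
The multi-round debate idea in the last item can be sketched without any LLM at all. The code below is an illustrative toy and not the protocol of Du et al.: the agents are hypothetical stub functions (`stubborn`, `follower`), and the final majority vote is an assumption chosen to keep the example short.

```python
from collections import Counter
from typing import Callable, List

# An "agent" is any callable: (question, peers' latest answers) -> answer.
Agent = Callable[[str, List[str]], str]


def debate(question: str, agents: List[Agent], rounds: int = 2) -> str:
    """Run a fixed number of debate rounds, then return the majority answer."""
    # Round 0: every agent answers independently (no peer answers yet).
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        # Each agent sees the other agents' latest answers and may revise.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    # Majority vote; ties go to the answer that appears first.
    return Counter(answers).most_common(1)[0][0]


def stubborn(answer: str) -> Agent:
    """Hypothetical stub: always returns the same answer."""
    return lambda question, peers: answer


def follower(initial: str) -> Agent:
    """Hypothetical stub: adopts the most common peer answer once it sees one."""
    def agent(question: str, peers: List[str]) -> str:
        return Counter(peers).most_common(1)[0][0] if peers else initial
    return agent


result = debate("What is 6 * 7?", [stubborn("42"), stubborn("42"), follower("41")])
print(result)  # 42
```

In a real system each stub would be replaced by an LLM call whose prompt includes the peers' answers.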

How to Apply

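When building a code generation pipeline, a role-based static architecture like MapCoder's (recall → planning → coding → debugging agents connected in sequence) produces fewer errors than a single LLM call
To raise LLM response quality, apply the Explainer + Critic competitive channel pattern: add a loop in which the first agent generates an answer and the second agent rebuts and verifies it
For systems with many dynamic tasks, consider a structure like Magentic-One's, in which an Orchestrator agent builds a DAG (directed acyclic graph) at runtime and dynamically delegates work to sub-agents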
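
The orchestration idea above, running sub-agents in the order a task DAG allows, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions and not the implementation of Magentic-One or MapCoder: `run_dag`, `stub_worker`, and the `plan` dictionary are hypothetical names, and in a real system the plan would come from an orchestrator LLM rather than a hard-coded dict.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

# A worker takes the task name plus the outputs of its dependencies.
Worker = Callable[[str, Dict[str, str]], str]


def run_dag(
    dependencies: Dict[str, List[str]],
    workers: Dict[str, Worker],
) -> Dict[str, str]:
    """Run each task after all of its dependencies, passing results along."""
    results: Dict[str, str] = {}
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    for task in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(task, [])}
        results[task] = workers[task](task, inputs)
    return results


def stub_worker(task: str, inputs: Dict[str, str]) -> str:
    """Hypothetical stand-in for a sub-agent (an LLM call in a real system)."""
    upstream = ", ".join(sorted(inputs)) or "nothing"
    return f"{task} done (used: {upstream})"


# Task -> tasks it depends on. This chain mirrors the
# recall -> planning -> coding -> debugging pipeline described above.
plan = {
    "recall": [],
    "planning": ["recall"],
    "coding": ["planning"],
    "debugging": ["coding"],
}
outputs = run_dag(plan, {name: stub_worker for name in plan})
print(outputs["debugging"])  # debugging done (used: coding)
```

A static sequential pipeline is the special case where the DAG is a chain; a dynamic orchestrator would differ only in how `plan` is produced.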

Code Example

snippet

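# Example: an Actor-Critic competitive channel pattern with AutoGen.
# Assumes the AutoGen 0.2-style API (the `pyautogen` package).
import os

import autogen

# Read the API key from the environment instead of hard-coding it.
config_list = [{"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}]

actor = autogen.AssistantAgent(
    name="Actor",
    system_message="You generate an answer to the given problem and revise it when you receive a review.",
    llm_config={"config_list": config_list},
)

critic = autogen.AssistantAgent(
    name="Critic",
    system_message="You find logical errors, hallucinations, and missing edge cases in the Actor's answer and point them out specifically. Acknowledge what is good as well.",
    llm_config={"config_list": config_list},
    max_consecutive_auto_reply=1,  # one review, then the chat ends
)

# The Critic hands the task to the Actor so that the two agents talk to each
# other directly: Actor answers -> Critic reviews -> Actor revises -> end.
critic.initiate_chat(
    actor,
    message="Write a binary search function in Python. I will review your answer.",
)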

Terminology

MAS: Multi-Agent System. A system in which multiple AI agents communicate and cooperate to solve complex problems, much like team members each taking a role in a team project.

Coopetition: Cooperation and competition at the same time. The agents are on the same team but compete with one another in some areas to draw out the best result.

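Federated Learning: A distributed training method in which each agent (or device) keeps its own data private and only model weights are gathered on a central server for training. It allows a model to be improved while protecting personal data.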

Theory of Mind: The ability to infer another agent's goals, intentions, and beliefs, that is, to work out what the other party is thinking. Using it lets agents cooperate more naturally.

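DAG: Directed Acyclic Graph. A structure that represents dependencies between tasks as arrows; because it has no cycles, the tasks can be executed in dependency order. Often used to express workflows in multi-agent orchestration.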

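Cascading Hallucination: A phenomenon in which one agent generates incorrect information and the next agent accepts it as fact, amplifying the error. The longer the agent chain, the greater the risk.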

MoE: Mixture of Experts. An architecture in which several specialized submodels (experts) compete depending on the input, and a gating network selects the best expert to produce the response.
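
The gating step in the MoE definition can be illustrated with a toy top-1 router. This is a sketch under simplifying assumptions, not how any particular MoE model is built: the inputs are scalars, the gate is a fixed linear score per expert, and the experts and weights (`experts`, `gate_weights`, `moe_forward`) are hypothetical values picked for illustration. Real MoE layers work on vectors, learn the gate, and often combine the top-k experts.

```python
import math
from typing import Callable, List, Sequence

Expert = Callable[[float], float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: float, experts: List[Expert], gate_weights: List[float]) -> float:
    """Top-1 routing: the expert with the highest gate probability answers."""
    # Toy gate: one linear score per expert (score_i = w_i * x).
    probs = softmax([w * x for w in gate_weights])
    best = max(range(len(experts)), key=probs.__getitem__)
    return experts[best](x)


# Hypothetical experts and gate weights, chosen only for illustration.
experts = [lambda x: x * 2.0, lambda x: x + 100.0]
gate_weights = [1.0, -1.0]

print(moe_forward(3.0, experts, gate_weights))   # 6.0  (expert 0 wins for x > 0)
print(moe_forward(-3.0, experts, gate_weights))  # 97.0 (expert 1 wins for x < 0)
```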

Original Abstract

With recent advances in Large Language Models (LLMs), Agentic AI has become phenomenal in real-world applications, moving toward multiple LLM-based agents to perceive, learn, reason, and act collaboratively. These LLM-based Multi-Agent Systems (MASs) enable groups of intelligent agents to coordinate and solve complex tasks collectively at scale, transitioning from isolated models to collaboration-centric approaches. This work provides an extensive survey of the collaborative aspect of MASs and introduces an extensible framework to guide future research. Our framework characterizes collaboration mechanisms based on key dimensions: actors (agents involved), types (e.g., cooperation, competition, or coopetition), structures (e.g., peer-to-peer, centralized, or distributed), strategies (e.g., role-based or model-based), and coordination protocols. Through a review of existing methodologies, our findings serve as a foundation for demystifying and advancing LLM-based MASs toward more intelligent and collaborative solutions for complex, real-world use cases. In addition, various applications of MASs across diverse domains, including 5G/6G networks, Industry 5.0, question answering, and social and cultural settings, are also investigated, demonstrating their wider adoption and broader impacts. Finally, we identify key lessons learned, open challenges, and potential research directions of MASs towards artificial collective intelligence.