Show HN: Baton – A desktop app for developing with AI agents
TL;DR Highlight
A desktop app that lets you run multiple AI coding agents (Claude Code, Gemini CLI, etc.) simultaneously in separate git worktrees and monitor them all in one place — ideal for developers who want to split work by feature and develop in parallel.
Who Should Read
Developers who want to run multiple AI coding agents like Claude Code or Codex CLI simultaneously and manage the progress of each task from a single interface. Especially suited for those who want to develop multiple features in parallel without branch conflicts.
Core Mechanics
- Baton is a desktop app for running and managing multiple AI coding agents simultaneously (supports all CLI-based agents including Claude Code, Codex CLI, OpenCode, and Gemini CLI), available as a free download for Mac, Windows, and Linux.
- Each task (workspace) is fully isolated via git worktree (a git feature that maintains multiple independent working directories within a single repository), so agents never interfere with or conflict with each other — each works on its own branch without needing to switch branches or use stash.
- The dashboard displays each agent's status with badges: a blue 'Input' badge when waiting for input, a green 'Done' badge when the task is complete, and a red 'Error' badge when an error occurs — no need to check each tab individually. Best support is provided for Claude Code.
- When starting a task, you describe what you want to build and the AI automatically generates a branch name, workspace title, and description. Enabling 'Accept Edits' mode lets the agent start working immediately without permission prompts.
- A built-in diff viewer based on Monaco editor (the code editor component used in VS Code) lets you review agent-made changes file by file before opening a PR, with the ability to roll back individual files. A 'Live follow mode' is also supported for tracking changes in real time while the agent is working.
- A built-in MCP (Model Context Protocol, the standard protocol for AI agents to call external tools) server allows agents to directly create new workspaces or launch parallel tasks during a conversation.
- Additional code review utilities are built in, including fuzzy file search and full-text content search powered by fzf and ripgrep, git blame, and per-file commit history. Frequently used shell commands or agent prompts can be saved as 'Actions' for reuse.
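The worktree isolation described above can be tried with plain git. This is a minimal sketch of the underlying git feature, not Baton's actual implementation: the temporary repository, the directory names (`wt-login`, `wt-search`), and the branch names are all hypothetical.

```shell
#!/bin/sh
# Sketch: one repository, two isolated working directories, one branch each.
# All paths and branch names below are hypothetical.
set -eu

demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

# A worktree needs at least one commit to branch from.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Per task: create a new branch and check it out in its own directory.
git worktree add -b feature/login  "$demo/wt-login"
git worktree add -b feature/search "$demo/wt-search"

# Each directory reports its own branch; the main checkout is untouched.
git -C "$demo/wt-login"  rev-parse --abbrev-ref HEAD
git -C "$demo/wt-search" rev-parse --abbrev-ref HEAD

# List every worktree attached to the repository.
git worktree list
```

Because each branch lives in its own directory, an agent started in `wt-login` can edit and commit without touching the files that an agent in `wt-search` sees, so no branch switching or stashing is needed.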
Evidence
- Commenters criticized Baton's differentiators as unclear given the large number of similar open-source agent managers emerging. Conductor, superset.sh, t3.codes, and cmux were mentioned as alternatives, and one commenter noted that Claude Desktop itself has supported worktree-based parallel agents for over a month.
- Others argued that these agent managers are essentially rebuilding IDEs, and that improving VS Code would be more practical since it already runs as a web app in containers, supports workspaces, and has an extension ecosystem (visualJJ, a worktree/workspace manager, was also mentioned).
- Practical questions arose about the cost of running multiple Claude Code agents simultaneously, with comments asking whether users were expensing it to their company, which suggests cost is a significant real-world barrier.
- More fundamental questions were raised about what people are actually building with agents, worktrees, and harnesses. Commenters shared that most use cases stay at the level of generating boilerplate components for frameworks like React or Laravel, or small personal apps; one person described using agents to remove dead code from large codebases as a time-saving task.
- There was UX feedback about the site's design: one commenter said they gave up reading within 30 seconds due to a TV-noise background effect and flickering thin blue lines.
- Separately, someone shared a similar terminal-based tool they had built and published on GitHub (agent-storm).
How to Apply
- If you need to develop multiple features simultaneously with Claude Code, install Baton and create a workspace per feature. Each agent works on its own independent git branch, enabling parallel development without conflicts, and you can review changes with the diff viewer and open a PR when done.
- If you find yourself constantly switching terminal tabs to check whether an agent has finished, use Baton's status badges and dock notifications. You are alerted the moment an agent reaches a completed, error, or input-waiting state, so you can check back while doing other work.
- If you have frequently used agent run options (e.g., flags like --dangerously-skip-permissions) or project initialization commands, save them with Custom Agent Presets and Workspace Setup so you don't have to re-enter them every time you create a new workspace.
Code Example
# Installing via AppImage on Linux (run only the install line for your distribution)
sudo apt install fuse libfuse2 # Debian/Ubuntu
sudo dnf install fuse fuse-libs # Fedora
chmod +x baton-*.AppImage
./baton-*.AppImage
# Verifying download integrity
# macOS
shasum -a 256 [file]
# Linux
sha256sum [file]
# Windows (PowerShell)
Get-FileHash [file] -Algorithm SHA256
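The commands above only print a digest; it still has to be compared with the published value. The sketch below shows one way to automate that comparison on macOS or Linux. `verify_sha256` is a hypothetical helper, not part of Baton, and the demo uses a throwaway file rather than a real download.

```shell
#!/bin/sh
# Sketch: compare a file's SHA-256 digest with an expected value.
set -eu

# verify_sha256 FILE EXPECTED_HEX  (hypothetical helper)
verify_sha256() {
    file=$1
    # Normalize to lowercase so uppercase digests (as PowerShell prints) also match.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | awk '{print $1}')      # Linux
    else
        actual=$(shasum -a 256 "$file" | awk '{print $1}')  # macOS
    fi
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Demo against a throwaway file so the script runs anywhere.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
verify_sha256 "$tmp" 5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
rm -f "$tmp"
```

For a real download, pass the downloaded file and the digest published alongside it; a non-zero exit status means the file should not be run.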
Related Papers
Show HN: ctx – Search the coding agent history already on your machine
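An open-source CLI tool that indexes past sessions from coding agents such as Claude Code, Cursor, and Codex into SQLite, so that earlier discussions, decisions, and failed attempts are not forgotten and can be reused.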
Micro-Agent: Beat Frontier Models with Collaboration Inside Model API
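The vLLM team has introduced the 'Micro-Agent' concept, in which multiple models collaborate behind a single model API call. The idea is that running model combinations at the router layer, without separate agent code, can deliver GPT-4-class results at lower cost.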
Ornith-1.0: self-improving open-source models for agentic coding
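A coding-focused open-source model fine-tuned from Gemma 4 and Qwen 3.5 that claims to optimize the scaffold (the agent execution structure) together with the model through RL (reinforcement learning), though the community suspects it amounts to little more than benchmark overfitting.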
Entity Binding Failures in Tool-Augmented Agents
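A paper that defines the 'Entity Binding failure' problem, in which an AI agent selects the correct tool but runs it against the wrong target, and evaluates execution policies that prevent it.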
Herdr: Agent multiplexer that lives in your terminal
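A Rust-based open-source tool for running and managing multiple AI coding agents (Claude, Codex, etc.) simultaneously in a single terminal. Sessions persist as in tmux and remote access over SSH is possible, which greatly simplifies multi-agent workflows.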
Ornith-1.0: Self-scaffolding LLMs for agentic coding
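An open-source coding-focused LLM trained with a self-reinforcing learning framework in which the model generates and improves its own problem-solving strategy (scaffold). The lineup ranges from a small 9B model to a large 397B model, and it outperformed Claude Opus 4.7 on major benchmarks such as SWE-Bench.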