Launch HN: Freestyle – Sandboxes for Coding Agents
TL;DR Highlight
Sandbox infrastructure designed to allow AI coding agents to run tens of thousands of VMs concurrently, with core features including VM startup within 700ms, forking (cloning) of running VMs, and Pause/Resume functionality.
Who Should Read
Engineers developing services where AI directly generates and executes code, such as Devin, Cursor Agent, Lovable, and Bolt, or backend developers operating AI code review bots or test automation in CI/CD.
Core Mechanics
- Freestyle is a VM Sandbox service dedicated to AI coding agents, providing an immediate startup time of under 700ms from API request to VM readiness.
- The most differentiating feature is 'Live Forking,' which allows you to clone a running VM entirely without stopping it. For example, you can fork a single VM into three and assign 'API endpoint implementation,' 'frontend UI implementation,' and 'test suite writing' to the AI in parallel.
- Pause & Resume lets you pause a VM so that it incurs no cost while paused, then resume it exactly where it left off when the next execution request arrives. With the `idleTimeoutSeconds: 60` setting shown in the code example, the VM pauses automatically after 60 seconds of inactivity.
- Unlike competing services that only fork the filesystem, Freestyle explicitly states that it forks the entire VM memory (RAM state). This enables agents to explore multiple directions from an intermediate execution state (branch-and-explore).
- It also includes built-in Git repository management, allowing agents to store generated code in Freestyle's own Git repo, synchronize bidirectionally with GitHub, and configure Webhooks with fine-grained control by branch, path, and event type.
- In a code review bot use case, `bun run lint` and `bun test` are executed in the VM, after which the AI reviews the diff and automatically posts 'REQUEST_CHANGES' or 'APPROVE' to the GitHub PR depending on test failures.
- The infrastructure is built on top of its own bare-metal servers to reduce cloud virtualization overhead and supports low-level network features like eBPF and XDP. It was stated that the Sandbox is isolated outside the main VPC for security.
- The JS Sandbox API was already available, and this launch adds a full VM-based Sandbox. It natively supports Node.js/Bun runtimes and automatic execution of development servers (`bun run dev`).
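The idle-pause behavior described in the list above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Freestyle's implementation: the class `IdlePausingVm`, its methods, and the billing rule (running time is billed, paused time is not) are hypothetical, and only the name `idleTimeoutSeconds` comes from the source. Time is passed in explicitly so the model stays deterministic.

```typescript
// Toy model of idle-timeout pausing. Hypothetical; not the Freestyle implementation.
type VmState = "running" | "paused";

class IdlePausingVm {
  private state: VmState = "running";
  private readonly idleTimeoutSeconds: number;
  private lastActivityMs: number;
  private lastTickMs: number;
  private billedMs = 0;

  constructor(idleTimeoutSeconds: number, startMs: number) {
    this.idleTimeoutSeconds = idleTimeoutSeconds;
    this.lastActivityMs = startMs;
    this.lastTickMs = startMs;
  }

  /** Advance the clock: bill running time and pause once the idle timeout is reached. */
  tick(nowMs: number): void {
    if (this.state === "running") {
      const pauseAtMs = this.lastActivityMs + this.idleTimeoutSeconds * 1000;
      const billedUntilMs = Math.min(nowMs, pauseAtMs);
      this.billedMs += Math.max(0, billedUntilMs - this.lastTickMs);
      if (nowMs >= pauseAtMs) this.state = "paused";
    }
    this.lastTickMs = nowMs;
  }

  /** A request arrives: settle the clock, resume if paused, record the activity. */
  exec(nowMs: number): void {
    this.tick(nowMs);
    this.state = "running";
    this.lastActivityMs = nowMs;
  }

  getState(): VmState {
    return this.state;
  }

  getBilledSeconds(): number {
    return this.billedMs / 1000;
  }
}

// Usage: one request at t=10s, then silence until t=500s.
const demoVm = new IdlePausingVm(60, 0);
demoVm.exec(10_000);   // activity at 10s
demoVm.tick(100_000);  // 90s later: the VM paused at 70s
demoVm.exec(500_000);  // next message resumes it
demoVm.tick(520_000);
console.log(demoVm.getState(), demoVm.getBilledSeconds());
```

In this model the 430 seconds between the automatic pause and the next request add nothing to the billed time, which is the property the Pause & Resume bullet describes.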
Evidence
- The memory forking feature drew the most interest. One commenter noted that forking the entire VM memory at runtime differs from competitors that copy only the filesystem, and hoped that, if it is implemented with Copy-on-Write, forking would be O(1) and its cost would not grow with machine size.
- There was a comment from a team operating thousands of Sandboxes on Azure, GCP, and AWS using standard VMs, who were unclear what Freestyle offers compared to standard VMs. A key question was whether the forking feature requires the agent code to be modified to recognize forking, or if it operates transparently.
- Several comments requested comparisons with competitors. E2B, Daytona, Modal, Blaxel, Vercel, Cloudflare, and Fly Sprites were mentioned, and there were many requests for a price and performance comparison matrix.
- There was criticism that the 50 concurrent VM limit is low. A team that built a similar service in-house shared that maintaining a warm pool of Firecracker VMs allows for immediate Sandbox provisioning without boot time.
- The lack of Windows support was mentioned. Currently, all Sandbox platforms, including Freestyle, are Linux-only, creating a gap for automating enterprise software (ERP, etc.) workflows that require Windows.
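The Copy-on-Write hope raised in the comments can be made concrete with a toy model. The sketch below is an illustration only and says nothing about how Freestyle implements forking: `CowMemory`, `Layer`, and `PAGE_SIZE` are hypothetical names. It models memory as a chain of layers, so a fork freezes the current layer and hands each side a fresh empty one, which is constant-time regardless of how many pages exist.

```typescript
// Toy copy-on-write memory. Hypothetical model; not how Freestyle forks VMs.
const PAGE_SIZE = 4096;

class Layer {
  readonly pages = new Map<number, Uint8Array>();
  readonly parent: Layer | null;

  constructor(parent: Layer | null) {
    this.parent = parent;
  }
}

class CowMemory {
  private top: Layer;

  constructor(base: Layer | null = null) {
    this.top = new Layer(base);
  }

  /** Find a page starting from the given layer and walking toward older layers. */
  private static find(start: Layer | null, pageNo: number): Uint8Array | undefined {
    for (let layer: Layer | null = start; layer !== null; layer = layer.parent) {
      const page = layer.pages.get(pageNo);
      if (page !== undefined) return page;
    }
    return undefined;
  }

  /** Untouched pages read as zero. */
  read(pageNo: number, offset: number): number {
    const page = CowMemory.find(this.top, pageNo);
    return page === undefined ? 0 : page[offset] ?? 0;
  }

  /** A write copies at most one shared page into this memory's private layer. */
  write(pageNo: number, offset: number, value: number): void {
    let page = this.top.pages.get(pageNo);
    if (page === undefined) {
      page = new Uint8Array(PAGE_SIZE);
      const shared = CowMemory.find(this.top.parent, pageNo);
      if (shared !== undefined) page.set(shared);
      this.top.pages.set(pageNo, page);
    }
    page[offset] = value;
  }

  /** Fork in O(1): freeze the current layer; both sides continue on new empty layers. */
  fork(): CowMemory {
    const frozen = this.top;
    this.top = new Layer(frozen);
    return new CowMemory(frozen);
  }

  /** Pages privately copied or created since the last fork. */
  privatePageCount(): number {
    return this.top.pages.size;
  }
}

// Usage: the fork sees the parent's state, and later writes stay private to each side.
const parentMem = new CowMemory();
parentMem.write(0, 5, 42);
const childMem = parentMem.fork();
childMem.write(0, 5, 99);
console.log(parentMem.read(0, 5), childMem.read(0, 5), childMem.privatePageCount());
```

The trade-off in this toy design is that reads walk the layer chain, so deep fork trees make reads slower; real systems typically rely on hardware page tables instead.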
How to Apply
- When creating services that automatically generate apps with AI, like Lovable, Bolt, and V0, you can create a template repo with `freestyle.git.repos.create()` and set up `VmDevServer` to configure an environment where the development server automatically starts as soon as the AI generates code, all through API calls.
- When you want to process a single task in parallel, like with Devin and Cursor Agent, you can clone the running VM with `vm.fork({ count: 3 })` and assign different tasks to each fork simultaneously using `Promise.all` to significantly reduce the overall task time.
- When adding an AI code review bot to a GitHub PR, you can generate an AI review based on the results of running `vm.exec('bun run lint')` and `vm.exec('bun test')` in the VM, and then conditionally post 'REQUEST_CHANGES' if the tests fail, creating a CI-integrable automated review pipeline.
- When operating an AI coding assistant that interacts with users, setting `persistence: { type: 'persistent' }` and `idleTimeoutSeconds: 60` will eliminate costs during idle periods between conversations and automatically resume the VM in its previous state when the next message arrives, minimizing costs while maintaining session state.
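For the parallel-fork pattern in the list above, one practical concern is that `Promise.all` rejects as soon as any single task rejects, which hides the results of the other forks. The sketch below shows one way to collect a per-fork outcome instead. It is an illustration under assumptions: `ForkableVm`, `Agent`, `TaskOutcome`, `fanOut`, and `makeStubVm` are hypothetical names, the interface only mirrors the `vm.fork({ count })` call shown in the source, and the stub stands in for a real VM so the block runs without the SDK.

```typescript
// Hypothetical shapes mirroring `vm.fork({ count })` from the source example.
// These are NOT the real freestyle-sandboxes SDK types.
interface ForkableVm {
  id: string;
  fork(opts: { count: number }): Promise<{ forks: ForkableVm[] }>;
}

// Placeholder for whatever drives the AI agent inside a VM.
type Agent = (vm: ForkableVm, task: string) => Promise<string>;

interface TaskOutcome {
  task: string;
  vmId: string;
  ok: boolean;
  detail: string;
}

/** Fork once per task and run all tasks in parallel, capturing failures per fork. */
async function fanOut(vm: ForkableVm, tasks: string[], agent: Agent): Promise<TaskOutcome[]> {
  const { forks } = await vm.fork({ count: tasks.length });
  const jobs = tasks.map(async (task, i): Promise<TaskOutcome> => {
    const fork = forks[i];
    if (fork === undefined) {
      return { task, vmId: "", ok: false, detail: "no fork available" };
    }
    try {
      return { task, vmId: fork.id, ok: true, detail: await agent(fork, task) };
    } catch (err) {
      // One failing task does not discard the results of the others.
      return { task, vmId: fork.id, ok: false, detail: String(err) };
    }
  });
  return Promise.all(jobs);
}

// In-memory stub so the sketch runs without any sandbox service.
function makeStubVm(id: string): ForkableVm {
  return {
    id,
    async fork({ count }) {
      return { forks: Array.from({ length: count }, (_, i) => makeStubVm(`${id}-fork${i}`)) };
    },
  };
}

// Usage with a fake agent that fails on one task.
const demoAgent: Agent = async (vm, task) => {
  if (task === "Write the test suite") throw new Error("agent gave up");
  return `${task} done on ${vm.id}`;
};

fanOut(
  makeStubVm("vm"),
  ["Build the API endpoints", "Build the frontend UI", "Write the test suite"],
  demoAgent,
).then((outcomes) => console.log(outcomes));
```

Because every job resolves to a `TaskOutcome`, the caller can keep the successful forks and retry or discard only the failed ones.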
Code Example
// Parallel agent forking example (Devin, Cursor Agent style)
// `ai(vm, prompt)` is a placeholder for your own agent runner; it is not defined here.
import { freestyle } from "freestyle-sandboxes";
import { VmBun } from "@freestyle-sh/with-bun";

const { vm } = await freestyle.vms.create({
  git: {
    repos: [
      { repo: "https://github.com/user/repo.git" },
    ],
  },
});

// Clone the running VM into 3 copies
const { forks } = await vm.fork({ count: 3 });

// Assign different tasks to each fork in parallel
await Promise.all([
  ai(forks[0], "Build the API endpoints"),
  ai(forks[1], "Build the frontend UI"),
  ai(forks[2], "Write the test suite"),
]);

// AI code review bot example
// `github` is assumed to be an authenticated GitHub API client; the pull request
// identifiers (owner, repo, pull number) are omitted in the source.
const { stdout: lint } = await vm.exec("bun run lint");
const { stdout: test } = await vm.exec("bun test");
const review = await ai(vm, "Review the diff for bugs");
await github.pulls.createReview({
  body: review,
  event: test.includes("FAIL") ? "REQUEST_CHANGES" : "APPROVE",
});

// Persistent VM + automatic Pause example
const { vm: persistentVm } = await freestyle.vms.create({
  persistence: { type: "persistent" },
  idleTimeoutSeconds: 60, // Pauses automatically after 60 seconds of inactivity; no cost while paused
});

while (true) {
  const userMessage = await getNextMessage();
  const result = await ai(persistentVm, userMessage);
  await respond(result);
}
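The review-bot example above chooses the review event with `test.includes("FAIL")`, which depends on the wording of the test runner's output and ignores the lint result. A more robust variant is to decide from exit codes. The sketch below is a minimal illustration under assumptions: the source only shows `stdout` on the exec result, so the `exitCode` field, the `CheckResult` type, and the `decideReview` helper are all hypothetical.

```typescript
// Hypothetical result shape: the source only shows `stdout`; `exitCode` is an assumption.
interface CheckResult {
  name: string; // e.g. "lint" or "test"
  exitCode: number; // 0 means the command succeeded
  stdout: string;
}

// Review events accepted when creating a GitHub pull request review.
type ReviewEvent = "APPROVE" | "REQUEST_CHANGES";

interface ReviewDecision {
  event: ReviewEvent;
  failed: string[];
}

/** Request changes if any check exited non-zero; approve otherwise. */
function decideReview(checks: CheckResult[]): ReviewDecision {
  const failed = checks.filter((c) => c.exitCode !== 0).map((c) => c.name);
  const event: ReviewEvent = failed.length === 0 ? "APPROVE" : "REQUEST_CHANGES";
  return { event, failed };
}

// Usage: output that merely mentions "FAIL" no longer flips the decision.
const decision = decideReview([
  { name: "lint", exitCode: 0, stdout: "" },
  { name: "test", exitCode: 0, stdout: "ok: renders the FAIL banner" },
]);
console.log(decision.event, decision.failed);
```

The `failed` list can be appended to the review body so the pull request author sees which check triggered the request for changes.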
Related Papers
Show HN: adamsreview – better multi-agent PR reviews for Claude Code
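An open-source plugin for Claude Code in which up to seven parallel sub-agents each review a PR from a different perspective and can also apply fixes automatically. It claims to catch more real bugs than the built-in /review or CodeRabbit, but the community also voiced skepticism about its complexity and practical effectiveness.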
How Fast Does Claude, Acting as a User Space IP Stack, Respond to Pings?
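An experiment in which Claude Code was made to parse IP packets directly and construct ICMP echo replies so that it actually responds to pings. It is a fun case that pushes the idea that 'Markdown is the code and the LLM is the processor' all the way down to the network stack.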
Show HN: Git for AI Agents
A version control tool that automatically tracks every tool call made by AI coding agents (such as Claude Code) and even supports blame, showing which prompt wrote which line of code.
Principles for agent-native CLIs
An article that lays out principles for designing CLI tools so that AI agents can use them more effectively. The approach is becoming practically important as agents rely on CLIs as tools more often.
Agent-harness-kit scaffolding for multi-agent workflows (MCP, provider-agnostic)
A scaffolding tool that coordinates multiple AI agents so they can divide roles and collaborate. Like Vite, it lets you set up a multi-agent pipeline quickly with no configuration.
Show HN: Tilde.run – Agent sandbox with a transactional, versioned filesystem
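A tool that provides an isolated sandbox environment in which changes can be rolled back even when AI agents touch real production data. It combines GitHub, S3, and Google Drive into a single version-controlled filesystem.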