

Sarthak Varshney is a Docker Captain, 5x C# Corner MVP, and 2x Alibaba Cloud MVP, with over six years of hands-on experience in the IT industry, specializing in cloud computing, DevOps, and modern application infrastructure. He is an Author and Associate Consultant, known for working extensively with cloud platforms and container-based technologies in real-world environments.
Let me be direct with you: the moment I first ran claude --dangerously-skip-permissions on my actual development machine, my stomach dropped a little. Not because I didn't trust the model — but because "dangerously" is right there in the flag name. One wrong prompt, one misunderstood context, and your project files could look very different very fast.
That's the exact problem Docker Sandboxes solves. And after spending time with the sbx CLI, I think this is genuinely one of the more important tools Docker has shipped for developers working with AI coding agents in 2025.
Docker Sandboxes is a product from Docker that lets you run AI coding agents — like Claude Code, Gemini CLI, GitHub Copilot CLI, Codex, Kiro, and OpenCode — inside isolated microVM environments. Each sandbox gets its own Docker daemon, its own filesystem, and its own network stack. The agent can do whatever it wants in there: install packages, build images, modify configs, run containers. Your host machine stays completely untouched.
Think of it as a disposable VM that spins up in seconds, does the AI's work, and gets thrown away when you're done.
The tool you interact with is called sbx. And importantly — it doesn't require Docker Desktop. This is its own standalone binary.
+-----------------------------+
|          Your Host          |
| +-----------------------+   |
| |    microVM Sandbox    |   |
| | - Own Docker daemon   |   |
| | - Own filesystem      |   |
| | - Own network         |   |
| | - AI Agent running    |   |
| +-----------------------+   |
|   Your files: untouched     |
+-----------------------------+
This is the first question I hear from developers. We already have Docker containers for isolation — why add another layer?
The answer is the isolation model. Containers share the host kernel. If an agent does something destructive at the kernel level, it can still impact your system. A microVM, on the other hand, runs a completely separate OS kernel with a hardware boundary between it and your host.
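The container side of that tradeoff is easy to see for yourself. Inside any container, the kernel version matches the host exactly, because there is no second kernel; a microVM boots its own. The `docker run` line below is illustrative and assumes Docker is installed, while `uname -r` alone works on any Linux box:

```shell
# Containers share the host kernel: the version reported inside
# a container matches the host exactly. A microVM, by contrast,
# runs its own separate kernel.
uname -r
# Illustrative comparison, if Docker is installed:
# docker run --rm alpine uname -r   # same version as the host
```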
This matters when you're running agents in --dangerously-skip-permissions mode (called YOLO mode), which is actually the default in Docker Sandboxes. In this mode, the agent never pauses to ask for permission before executing a tool. No approval prompts, no interruptions. The agent installs what it needs, modifies what it wants, commits its work — all without stopping to check with you.
Without a hard isolation boundary, YOLO mode on your real machine is exactly as dangerous as it sounds. Inside a sandbox, it's just fast.
The Docker team actually built their own VMM for this rather than reusing Firecracker (a question that came up in community discussions): Firecracker targets Linux server environments, and Docker wanted something that works equally well on macOS, Windows, and Linux.
Prerequisites are minimal but platform-specific.
macOS requires Apple Silicon (M1/M2/M3/M4) and macOS Tahoe (26) or later.
Windows requires 64-bit x86_64, Windows 11, and the Hypervisor Platform feature enabled:
Enable-WindowsOptionalFeature -Online -FeatureName HypervisorPlatform -All
Linux (Ubuntu) requires Ubuntu 24.04+, 64-bit x86_64, and KVM hardware virtualization. You can verify KVM availability like this:
$ lsmod | grep kvm
If you see kvm_intel or kvm_amd in the output, you're good. Then add your user to the kvm group:
$ sudo usermod -aG kvm $USER
$ newgrp kvm
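If `lsmod` shows nothing, it's worth checking whether the CPU advertises hardware virtualization at all. The `vmx` (Intel VT-x) and `svm` (AMD-V) flags in /proc/cpuinfo are the standard signal; this check is plain Linux, nothing sbx-specific:

```shell
# Count CPU threads that advertise hardware virtualization.
# vmx = Intel VT-x, svm = AMD-V. Zero usually means the feature
# is missing or disabled in firmware, so KVM can't work.
grep -Ec 'vmx|svm' /proc/cpuinfo || true
```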
# macOS
brew install docker/tap/sbx
# Windows
winget install -h Docker.sbx
# Linux (Ubuntu)
curl -fsSL https://get.docker.com | sudo REPO_ONLY=1 sh
sudo apt-get install docker-sbx
Then log in:
$ sbx login
This opens a browser for Docker OAuth. On first login, you'll be asked to choose a default network policy:
Choose a default network policy:
1. Open — All network traffic allowed, no restrictions.
2. Balanced — Default deny, with common dev sites allowed.
3. Locked Down — All network traffic blocked unless you allow it.
Balanced is what I recommend starting with. It allows traffic to npm, PyPI, GitHub, and common dev services while blocking everything else. You can always adjust individual rules later.
For Claude Code with a Claude Max, Team, or Enterprise subscription, no API key setup is required upfront. Just use /login inside the sandbox to authenticate with OAuth. The session token lives on your host and is injected by a proxy — it's never stored inside the sandbox itself.
For API key-based authentication (any agent, or if you prefer API keys for Claude Code):
$ sbx secret set -g anthropic
This stores the key in your OS keychain. A proxy on your host injects it into outbound requests, so the API key is never exposed inside the sandbox. Pretty clean security model.
To also give the agent GitHub access for creating pull requests:
$ sbx secret set -g github -t "$(gh auth token)"
This is the part I genuinely enjoy. Navigate to your project directory and run:
$ cd ~/my-project
$ sbx run claude
Replace claude with whatever agent you want — gemini, codex, copilot, kiro, opencode. The first run pulls the agent image (takes a minute). Subsequent runs reuse the cache and start in seconds.
You can check what's running:
$ sbx ls
SANDBOX             AGENT    STATUS    PORTS   WORKSPACE
claude-my-project   claude   running           ~/my-project
Running sbx with no arguments opens an interactive terminal dashboard showing all your sandboxes with live CPU and memory usage. From there you can attach to agents, open shells, and manage network rules without touching the CLI.
There are two ways the agent can interact with your repository.
The agent edits your working tree directly. If you're working on something where you want instant visibility into what the agent is doing, this works fine. Just stage, commit, and push as you normally would.
This is what I use for anything non-trivial. Pass --branch to give the agent its own Git worktree:
$ sbx run claude --branch my-feature
This creates a Git worktree under .sbx/ in your repository root. The agent works on its own branch and directory — completely separate from your main working tree. You're free to keep coding on main while the agent works on its branch.
my-project/
├── src/                    ← your main working tree
├── .sbx/
│   └── claude-my-project-worktrees/
│       └── my-feature/     ← agent works here
└── .gitignore
When the agent finishes, review the work:
$ cd .sbx/claude-my-project-worktrees/my-feature
$ git log
$ git diff main
If you're happy with it, push and open a PR:
$ git push -u origin my-feature
$ gh pr create
You can also let sbx generate the branch name automatically:
$ sbx run claude --branch auto
And if you want multiple agents working on the same repo in parallel, branch mode keeps them isolated — each gets its own worktree:
$ sbx run claude --branch feature-a
$ sbx run claude --branch feature-b
Add .sbx/ to your .gitignore to keep your git status clean:
$ echo '.sbx/' >> .gitignore
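Running that `echo` twice leaves a duplicate entry, so a guarded version is safer. This is plain shell, nothing sbx-specific:

```shell
# Add .sbx/ to .gitignore only if it isn't already listed
# (-x: match the whole line, -F: literal string, -q: quiet).
touch .gitignore
grep -qxF '.sbx/' .gitignore || echo '.sbx/' >> .gitignore
```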
One of the more practical things about sbx is the network governance. If an agent tries to reach something blocked by your policy, the request fails outright, which surfaces the issue immediately instead of letting it slip by silently.
Check current rules:
$ sbx policy ls
Allow a specific host:
$ sbx policy allow network registry.npmjs.org
If you're running a local service on your host (say, an Ollama server at port 11434), reach it from inside the sandbox using host.docker.internal:
$ sbx policy allow network localhost:11434
$ curl http://host.docker.internal:11434
Sandboxes are network-isolated by default, so if your agent starts a dev server, you can't just open localhost:3000. Use sbx ports to forward traffic:
$ sbx ports my-sandbox --publish 8080:3000
$ open http://localhost:8080
One thing worth knowing: services inside the sandbox must bind to 0.0.0.0, not 127.0.0.1. Most dev servers default to loopback, so you'll need to pass something like --host 0.0.0.0 when starting them. Port mappings also don't persist across sandbox restarts — you'll need to re-publish.
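The loopback-versus-all-interfaces distinction is easy to demonstrate with plain sockets, independent of sbx. This sketch assumes python3 is available and lets the OS pick the ports:

```shell
# Show the difference between loopback-only and all-interfaces binding.
python3 - <<'EOF'
import socket

lo = socket.socket()
lo.bind(("127.0.0.1", 0))    # loopback only: invisible to port forwarding
print("loopback bind:", lo.getsockname()[0])

all_if = socket.socket()
all_if.bind(("0.0.0.0", 0))  # all interfaces: what forwarded traffic needs
print("all-interfaces bind:", all_if.getsockname()[0])

lo.close(); all_if.close()
EOF
```

A dev server bound to 127.0.0.1 inside the sandbox behaves like the first socket: traffic forwarded by `sbx ports` never reaches it.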
This is where Docker Sandboxes gets really powerful for teams and repeated workflows.
A template is a Docker image baked ahead of time. If your project needs .NET, Rust, or some non-default toolchain, build a Dockerfile that extends the default sandbox template, push it to a registry, and use it with --template:
$ sbx run claude --template ghcr.io/myorg/my-dotnet-sandbox:1.0
Templates are ideal for heavy dependencies — things you don't want to reinstall every single sandbox session.
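As a sketch of what such a template might look like: the Dockerfile extends a base sandbox image and pre-installs the toolchain. The base image name and the package below are placeholders, not real references; check the Sandboxes docs for the actual base template to extend.

```dockerfile
# Placeholder base image: substitute the default sandbox template
# referenced in the Docker Sandboxes documentation.
FROM docker/sandbox-base:latest

# Bake the heavy toolchain into the image once, so every sandbox
# session starts warm instead of reinstalling it.
RUN apt-get update && apt-get install -y --no-install-recommends \
        dotnet-sdk-8.0 \
    && rm -rf /var/lib/apt/lists/*
```

Build and push it like any other image (`docker build -t ghcr.io/myorg/my-dotnet-sandbox:1.0 .`, then `docker push`), and point `--template` at the pushed tag.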
A kit is a YAML artifact applied at runtime, handling the flexible, per-run side of customization that you don't want baked into an image:
# Use a kit from a local directory
sbx run claude --kit ./my-kit/
# Use a kit from a git repo with a specific tag
sbx run claude --kit "git+https://github.com/docker/sbx-kits-contrib.git#ref=v0.1.0&dir=code-server"
# Use a kit from an OCI registry
sbx run claude --kit ghcr.io/myorg/my-kit:1.0
The key distinction: templates bake things into the image (installed once, reused). Kits apply things at startup (flexible, per-run customization). They're designed to work together — a heavy template as the base, thin kits layered on top for per-project or per-team config.
One thing the community has flagged (and Docker has acknowledged): kits currently don't support automatic port publishing or user-configurable parameters. Those are on the roadmap.
Sandboxes persist after the agent exits. Stop one without deleting it:
$ sbx stop my-sandbox
When you're done:
$ sbx rm my-sandbox
This deletes everything inside the sandbox — packages, Docker images, branch mode worktrees — but your actual project files on the host are untouched.
Want to show someone why microVM isolation matters? Skip the architecture diagrams. Open a sandbox and run this:
$ sudo rm -rf *
Watch every file disappear. Then type exit. Open your project folder on the host.
Everything is exactly where you left it.
That single moment does more explaining than any slide deck. The sandbox absorbed a catastrophic command and your machine never flinched. That's the whole point of Docker Sandboxes — the agent gets full freedom inside, and you get a hard guarantee that none of it reaches your host.
Here's my honest take as someone who works with containers daily:
Use Docker Sandboxes when:
- You run agents in YOLO mode and want a hard isolation boundary while they work autonomously.
- The agent needs to install packages, modify configs, build images, or run containers.
- You want several agents working on the same repo in parallel, each in its own branch-mode worktree.
You probably don't need it when:
- You review and approve every agent action anyway, with permission prompts left on.
- The agent only reads code and answers questions, never executing or modifying anything.
This comes up a lot: Docker Sandboxes is not tied to Docker Desktop licensing. sbx is a separate binary. Docker's position here is that of a neutral platform — sbx works with Claude, Gemini, Copilot, Codex, Kiro, and OpenCode. No vendor lock-in on the agent side.
The official documentation lives at https://docs.docker.com/ai/sandboxes/ and covers everything I've touched on here, plus a full CLI reference (sbx commands and flags). The sbx-releases GitHub repository is also where you can file bug reports and follow the project.
The developer workflow is shifting fast. AI coding agents are no longer just autocomplete — they're writing functions, refactoring modules, running tests, and committing code. Giving them the freedom to work efficiently and keeping your environment safe used to be a tradeoff. Docker Sandboxes takes that tradeoff off the table.
The isolation is real, the workflow is clean, and the Git branch mode is genuinely useful for teams. If you're already using Claude Code or any other AI coding agent, this is worth setting up this week.