

Sarthak Varshney is a Docker Captain, 5x C# Corner MVP, and 2x Alibaba Cloud MVP, with over six years of hands-on experience in the IT industry, specializing in cloud computing, DevOps, and modern application infrastructure. He is an Author and Associate Consultant, known for working extensively with cloud platforms and container-based technologies in real-world environments.
Let me set the scene.
You're experimenting with Claude Code or some other AI coding agent, and you want it to actually do things — write files, install packages, spin up containers, run tests. But there's a problem: if you just let it loose on your machine, it has access to... everything. Your SSH keys, your production configs, your entire Docker daemon with all its containers.
That's a little terrifying, right?
Docker Sandboxes is Docker's answer to this. It's a way to give AI agents a completely isolated environment to work in — their own mini Linux machine with their own Docker daemon — without those agents ever touching your actual system. They can go wild inside the sandbox: build images, install packages, run containers. When you're done, you delete the sandbox and everything inside it vanishes.
The key thing that makes this different from just running an agent in a container: each sandbox is a full microVM (a lightweight virtual machine), not just a container. That means proper hardware-level isolation, not just Linux namespace tricks.
Quick note so you're not confused later:
This article mostly covers the new microVM approach, which is where things are heading.
Think of it like this. On your Mac, you've got Docker Desktop running. Inside your Docker Desktop environment, instead of just having containers, you can spin up small virtual machines — microVMs. Each of these microVMs has its own Linux kernel, its own Docker daemon, and its own isolated filesystem.
When you ask Docker Sandboxes to run an AI agent, it:
- boots a fresh microVM with its own Linux kernel
- starts a private Docker daemon inside that VM
- syncs your workspace folder into the VM
- launches the agent inside the VM, pointed at that workspace
Here's what the isolation looks like inside:
Your Mac (host)
├── Your Docker containers and images (agent CAN'T see these)
│
├── Sandbox VM 1 (for project-a)
│   ├── Private Docker daemon
│   ├── The AI agent (Claude Code, Gemini, etc.)
│   └── Containers the agent created
│
└── Sandbox VM 2 (for project-b)
    ├── Private Docker daemon
    └── The AI agent
The two VMs can't talk to each other either. Each one is its own walled garden.
The main command you'll use is docker sandbox run. Here's the simplest possible invocation:
docker sandbox run claude ~/my-project
That's it. One command. Docker will:
- create a fresh sandbox VM
- start the claude agent inside it
- sync ~/my-project into it at the exact same path

Want to see what's running?
docker sandbox ls
Output looks something like:
SANDBOX AGENT STATUS PORTS WORKSPACE
claude-my-project claude running /Users/you/my-project
When you're done:
docker sandbox stop claude-my-project # pause it (keeps everything intact)
docker sandbox rm claude-my-project # delete it completely
If you just run sbx with no arguments, you get a full terminal dashboard — think of it like htop but for your sandboxes. It shows all your sandboxes as cards with live status, CPU, and memory.
sbx
From the dashboard you can:
- c to create a new sandbox
- s to start or stop one
- Enter to attach to an agent session
- x to open a shell inside the sandbox
- r to remove a sandbox
- Tab to switch to the network panel (where you can see what URLs the agent is calling and allow/block them)

This is something worth understanding properly because it affects your Git workflow.
By default, the agent edits your actual working tree. Whatever it changes, you'll see it in git status on your host. This is simple and intuitive — same as if you did the edits yourself.
docker sandbox run claude ~/my-project
# Agent edits files directly in ~/my-project
# You can git commit, push, etc. from your terminal as usual
The catch: if you run two agents on the same repo at the same time, they'll both be editing the same files and will conflict.
Pass --branch to give the agent its own Git worktree and branch. This is the safer approach when you want to run multiple agents in parallel, or when you want to review the agent's work before merging it.
docker sandbox run claude --branch feature-ai-refactor ~/my-project
What this does behind the scenes: Docker creates a Git worktree under .sbx/ in your repo root, creates the branch feature-ai-refactor, and the agent works entirely in that worktree. Your main working directory is untouched.
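If you're curious what that looks like in plain git, here's a sketch on a throwaway repo. The .sbx/ layout and branch name below are illustrative, not Docker's exact naming:

```shell
# Create a throwaway repo, then add a worktree + branch the way the sandbox does.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "init"

# One command creates both the new branch and a separate working tree for it:
git worktree add .sbx/worktrees/feature-ai-refactor -b feature-ai-refactor

git worktree list   # shows the main tree plus the new worktree
```

The point of the worktree is that the branch gets its own directory, so nothing the agent checks out or edits there ever disturbs the files in your main working tree.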
You can even use --branch auto to let Docker pick a name:
docker sandbox run claude --branch auto ~/my-project
To review what the agent did and push it:
git worktree list # find the worktree
cd .sbx/claude-my-project-worktrees/feature-ai-refactor
git log # see commits
git push -u origin feature-ai-refactor
gh pr create # open a PR if you want
Clean up: when you docker sandbox rm, it deletes the worktrees and branches too.
Pro tip — add .sbx/ to your .gitignore so it doesn't show up in git status:
echo '.sbx/' >> .gitignore
Running separate projects? No problem. Each gets its own isolated sandbox:
docker sandbox run claude ~/project-a
docker sandbox run claude ~/project-b
You can also mount extra directories into a single sandbox — useful for shared libraries or reference code the agent should be able to read:
docker sandbox run claude ~/project-a ~/shared-libs:ro ~/docs:ro
The :ro at the end means read-only. The agent can see those directories but can't modify them.
Here's something that trips people up: the sandbox is network-isolated, so if the agent starts a development server inside it, your browser can't reach it by default.
You need to explicitly forward ports from your host to the sandbox:
docker sandbox ports my-sandbox --publish 8080:3000
# Now localhost:8080 on your host → port 3000 inside the sandbox
open http://localhost:8080
Or let the OS pick a free port:
docker sandbox ports my-sandbox --publish 3000
docker sandbox ports my-sandbox # check what host port got assigned
The docker sandbox ls output shows active port mappings:
SANDBOX AGENT STATUS PORTS WORKSPACE
my-sandbox claude running 127.0.0.1:8080->3000/tcp /home/you/proj
To stop forwarding:
docker sandbox ports my-sandbox --unpublish 8080:3000
One important gotcha: most dev servers default to listening on 127.0.0.1 inside the sandbox, not 0.0.0.0. If a port forward isn't working, that's usually why. Start the server with something like --host 0.0.0.0, or bind it to 0.0.0.0 in code, to make it reachable.
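You can see the difference for yourself with Python's built-in web server (the port number is just an example; the same rule applies to any dev server):

```shell
# Listens on all interfaces, so a sandbox port forward CAN reach it:
python3 -m http.server 3000 --bind 0.0.0.0 &
SERVER_PID=$!

# The loopback-only variant, which a port forward can NOT reach from the host:
#   python3 -m http.server 3000 --bind 127.0.0.1

# ... later, stop it with:
#   kill $SERVER_PID
```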
Also note: port forwards don't survive a sandbox restart. You'll need to re-publish them.
You might wonder why Docker went with full VMs instead of just containers. The reason is Docker-in-Docker.
AI coding agents need to do Docker things — build images, run docker compose up, pull images. If you put the agent in a regular container, it can't safely run Docker inside that container without either sharing your host Docker daemon (too risky) or using complex nested Docker setups that require privileged mode.
MicroVMs solve this cleanly. Each sandbox has a full Linux kernel and a completely private Docker daemon. The agent can docker build and docker compose up to its heart's content, using resources that are 100% inside the VM.
The tradeoff is that each sandbox uses more resources than a plain container — you're spinning up a small VM, not just a process. But for the isolation benefits when running autonomous agents, Docker considers this worth it.
Your workspace folder is synchronized between your Mac and the VM using bidirectional file sync (not volume mounting). This means:
- changes the agent makes appear on your host, and your edits appear inside the VM
- the path is identical on both sides (e.g. /Users/alice/projects/myapp in both places)

That last point is actually really helpful when debugging — error messages and stack traces showing file paths will match what you see on your host.
While a sandbox exists, these things survive across stop/start:
- packages you installed with apt, npm, etc.
- images the agent built and containers it created
- anything else written to the VM's filesystem

When you docker sandbox rm, all of that is gone except your workspace files (which were synced to your host the whole time).
Docker Sandboxes works with several AI coding agents. Claude Code has the most complete support, with others in various stages:
| Agent | Support Status |
|---|---|
| Claude Code (Anthropic) | Full support |
| Codex (OpenAI) | Partial — in development |
| GitHub Copilot | Partial — in development |
| Gemini (Google) | Partial — in development |
| Kiro (AWS) | Partial — in development |
| Docker Agent (cagent) | Partial — in development |
The basic invocation is the same regardless of agent — just swap claude for the agent name:
docker sandbox run claude ~/my-project
docker sandbox run gemini ~/my-project
docker sandbox run codex ~/my-project
Let's be specific about what the VM boundary does and doesn't protect.
What the agent CAN'T access:
- files on your host outside the synced workspace
- your host Docker daemon, with its containers and images
- your SSH keys, cloud credentials, and other host secrets
- your host's localhost services (VM boundary prevents this)

What the agent CAN do:
- act as root inside its own VM
- build images and run containers against its private Docker daemon
- install any packages it wants
- make outbound web requests (subject to the filtering proxy)
Network filtering: There's an HTTP/HTTPS filtering proxy that runs on your host. Sandboxes route their web traffic through this proxy, which means you can control what external URLs the agent is allowed to hit. You manage this from the network panel in the sbx dashboard (press Tab to get there).
Credentials: The sandbox does not automatically inherit your host SSH keys, cloud credentials, or API keys. You need to explicitly configure what the agent should have access to.
Important note for teams: If your Docker Desktop is managed by an administrator using Settings Management, they need to specifically allow experimental features before you can use sandboxes at all.
If docker sandbox is reported as an unknown command, the CLI plugin isn't installed or isn't being detected. Check:
ls -la ~/.docker/cli-plugins/docker-sandbox
If the file is missing, you need to install the plugin. If it exists but the command isn't working, try restarting Docker Desktop.
If sandbox commands fail because experimental features are unavailable, your Docker Desktop is managed by an IT/admin policy that locks them. Ask your admin to enable them in the Settings Management config:
{
  "configurationFileVersion": 2,
  "allowBetaFeatures": {
    "locked": false,
    "value": true
  }
}
If Claude Code can't authenticate or throws API key errors, the key is usually invalid, expired, or not configured. Double-check that your Anthropic API key is correct and has credits.
You have a .claude.json file in your workspace with a primaryApiKey field, which conflicts with how sandboxes handle credentials. Remove the primaryApiKey field and let the sandbox manage authentication:
{
  "apiKeyHelper": "/path/to/script",
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.anthropic.com"
  }
}
If Docker can't sync your project into the sandbox, the workspace path probably isn't accessible to Docker's file sharing. On Docker Desktop, open Settings, go to Resources → File sharing, and make sure your project directory (or a parent of it) is on the list.
Then verify the permissions on your files:
ls -la ~/my-project
chmod -R u+r ~/my-project # if needed
Windows has a limitation where launching too many sandboxes simultaneously can crash things. To recover:
- open Task Manager (Ctrl+Shift+Esc)
- find any docker.openvmm.exe processes and end them

Going forward, create sandboxes one at a time instead of all at once.
Nuclear option — reset everything:
docker sandbox reset
This stops all running VMs and wipes all sandbox data. The Docker daemon keeps running. Your workspace files on your host are untouched. After this you'll create sandboxes fresh.
Mistake 1: Looking for the sandbox in docker ps
The sandbox is a VM, not a container. It won't show up in docker ps. Use docker sandbox ls instead.
Mistake 2: Expecting port forwards to survive a restart
They don't. Published ports are lost when the sandbox stops. You need to re-run docker sandbox ports ... --publish after starting the sandbox again.
Mistake 3: Assuming the agent has your SSH keys or AWS credentials
By default it doesn't. The sandbox doesn't inherit your host credentials. You need to set up credential access explicitly.
Mistake 4: Editing files in both the host and the sandbox at the same time without branch mode
If you and the agent are both editing the same files simultaneously in direct mode, you'll get conflicts. Use --branch mode if you want to work in parallel.
Mistake 5: Forgetting to add .sbx/ to .gitignore
If you use branch mode, Docker creates a .sbx/ directory in your repo. It'll show up in git status and confuse you. Just add it to .gitignore early.
Here's something to try if you want to get hands-on:
1. Create a folder called sandbox-test on your machine
2. Put one file in it — hello.py — that just prints "Hello from sandbox!"
3. Run docker sandbox run claude ~/sandbox-test
4. Ask the agent to turn it into a small Flask app and run it in a container on port 3000
5. Run docker sandbox ports <sandbox-name> --publish 8080:3000 to forward the port
6. Open http://localhost:8080 in your browser
7. Confirm the files the agent created synced back to ~/sandbox-test
8. Run docker sandbox rm <sandbox-name> and verify the Flask container inside the sandbox is gone

If you pull that off, you'll have experienced the full sandbox lifecycle — creation, agent autonomy, port forwarding, file sync, and cleanup.
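To save you some typing, the first two steps look like this (the file contents are the obvious one-liner):

```shell
# Create the challenge workspace with a single starter file.
mkdir -p ~/sandbox-test
cat > ~/sandbox-test/hello.py <<'EOF'
print("Hello from sandbox!")
EOF

# Sanity-check it before handing the folder to the agent:
python3 ~/sandbox-test/hello.py
```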
Docker Sandboxes is one of those tools that solves a genuinely annoying problem. AI coding agents are useful, but letting them run directly on your machine is uncomfortable. MicroVM isolation gives you the best of both worlds: the agent gets full Docker capabilities to work with, and your system stays protected.
It's still experimental, so expect some rough edges. But the core workflow is solid — one command to start, one command to clean up, and your machine stays safe in between.
The docs are at docs.docker.com/ai/sandboxes if you want to dig deeper into network policies, custom templates, or any of the other features we didn't cover here.