How I made --dangerously-skip-permissions safe in Claude Code
There's a flag in Claude Code called --dangerously-skip-permissions. It does what it sounds like. The agent runs as you, with your SSH keys, cloud credentials, Keychain, every project on your machine. Full autonomy. No approval prompts. This is where the actual productivity lives.
I never turned it on.
Instead I sat in manual approval mode for months, clicking through "approve this file write?" dialogs like a human CAPTCHA. I knew it was pointless: any prompt injection or malicious config file bypasses the approval anyway. It's a software guardrail, not a security boundary. And every interruption breaks the agent's multi-step reasoning, so I was paying for fake security with real productivity loss.
Eventually I got tired of it and built what I actually wanted: OS-level containment that makes full autonomy safe. Turned out solving security also solved productivity, which I didn't expect. Once there's nothing dangerous for the agent to reach, you stop needing permission prompts at all.
The incidents that made this urgent
I was already building Hazmat when the CVEs started confirming everything I was worried about.
Check Point published CVE-2025-59536 (CVSS 8.7): opening a cloned repo could execute arbitrary shell commands through malicious project config files. Just cd-ing into a directory and running claude was enough. Manual approval mode doesn't help — the code runs before you see a prompt.
CVE-2026-21852 was worse in some ways — a malicious repo could redirect your API key to an attacker's server before the trust prompt even appeared. Your key was gone before you could say no.
Ona's research on Claude Code escaping its own sandbox was the one that really got me. The agent found a /proc/self/root path traversal around the denylist. When bubblewrap blocked that route, it tried to turn off the sandbox entirely. No one prompted it to do that. It just decided that disabling its own containment was the logical next step.
And then last week's axios npm supply chain attack delivered a RAT through a postinstall hook in 2 seconds. If your agent runs npm install as your user — regardless of what permission mode you're in — that RAT gets everything you have.
None of this is Claude-specific. Claude Code just happens to have 16 documented CVEs because Anthropic actually publishes advisories. OpenCode, Codex, Cursor have the same exposure with less visibility.
Why existing sandboxes don't solve the real problem
Claude Code ships with a sandbox based on sandbox-exec on macOS and bubblewrap on Linux. It restricts filesystem writes and filters some network traffic. It's a real improvement.
But it still forces a choice: security or productivity. The sandbox runs with restrictions, or you skip it and run with full access. There's no mode where the agent works autonomously AND can't reach your credentials.
That's the actual problem. A Seatbelt profile can deny file reads, but as long as outbound HTTPS is allowed it can't stop the agent from curl-ing your project code to any server. A firewall can block exfiltration protocols, but if the agent runs as your user, it can read ~/.ssh/id_rsa before the firewall even matters. And none of it stops a supply chain attack: npm install still runs as your user with full network access.
Other tools in this space (Agent Safehouse, nono, SandVault) wrap the same primitives. Seatbelt profiles, maybe a dedicated user. One or two layers. They improve security, but they don't resolve the tradeoff.
The thing that kept bugging me: defense-in-depth means independent layers, each covering a different class of threat. Wrapping sandbox-exec twice doesn't count.
What Hazmat does differently
Hazmat doesn't ask you to choose between security and autonomy. It stacks independent enforcement layers so the agent can run with full permissions inside a context where full permissions can't do much damage.
The most impactful single thing is user isolation. The agent runs as a dedicated macOS user named agent. Your home directory isn't blocked; it's absent. The agent literally can't see it.
On top of that, each session gets a Seatbelt policy generated at runtime. The project directory gets read-write. Everything else is denied at the kernel level. SSH keys, AWS credentials, GPG keys, Keychain, GitHub tokens are all explicitly denied even within the agent's own home, so a misconfigured broad allow can't accidentally expose them.
For network containment, pf firewall rules scoped to the agent user block SMTP, IRC, FTP, Tor, VPN, SOCKS, and other exfiltration protocols. The agent can still make HTTPS requests (it needs to), but it can't email your code or tunnel it out. A DNS blocklist sends known tunnel and paste services (ngrok, pastebin, transfer.sh, webhook.site) to localhost.
And then supply chain hardening: npm runs with ignore-scripts=true by default in the agent's environment. The axios-style postinstall attack? Doesn't execute. The agent can still install packages, but install hooks are dead on arrival.
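One way a launcher could enforce that default is through the process environment, since npm treats environment variables prefixed with `npm_config_` as configuration. The Go sketch below is an assumption about mechanism, not a description of Hazmat's code (which might just as well write an `.npmrc`); the helper name `withNpmHardening` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// withNpmHardening returns a copy of env in which npm's ignore-scripts
// setting is forced to true. Any existing value, in any letter case, is
// dropped first so an inherited setting cannot re-enable install hooks.
func withNpmHardening(env []string) []string {
	const key = "npm_config_ignore_scripts"
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		name, _, _ := strings.Cut(kv, "=")
		if strings.EqualFold(name, key) {
			continue // discard the inherited value
		}
		out = append(out, kv)
	}
	return append(out, key+"=true")
}

func main() {
	// Hypothetical inherited environment that tries to turn scripts back on.
	env := withNpmHardening([]string{
		"PATH=/usr/bin:/bin",
		"NPM_CONFIG_IGNORE_SCRIPTS=false",
	})
	fmt.Println(strings.Join(env, "\n"))
}
```

The resulting slice can be assigned to the `Env` field of an `exec.Cmd` before the agent process is started.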
Plus Kopia-based snapshot rollback. If the agent breaks something, you restore.
```sh
brew install dredozubov/tap/hazmat
hazmat init                       # one-time setup, ~10 min, interactive
hazmat claude                     # Claude Code in containment
hazmat opencode                   # OpenCode in containment
hazmat exec ./my-agent-loop.sh    # any agent, any script
```
That's it. hazmat init creates the agent user, configures the firewall and DNS blocklist, and sets up snapshots. During setup you choose which agents to bootstrap — Claude Code, OpenCode, Codex, or all of them. Every step is explained and confirmed. hazmat init --dry-run previews without changing anything.
The part I'm most proud of: formal verification
The setup and rollback ordering is formally verified with TLA+.
Why? Because I kept finding ordering bugs. If setup installs the sudoers file (which lets the agent launch) before the firewall is active, there's a window where the agent can run without network containment. If rollback removes the firewall before revoking launch privileges, same problem in reverse.
These are the kinds of bugs that don't show up in testing because they only matter when something interrupts setup halfway through. They absolutely matter in production.
The TLA+ specs check every reachable state (26,905 of them for the setup/rollback state machine), and model checking found three real bugs:
- Setup: sudoers was installed before the firewall. If setup was interrupted between those steps, the agent was launchable without containment.
- Seatbelt policy: credential deny rules only blocked reads, not writes. A malicious agent could overwrite credential files.
- Cloud restore: the workspace was overwritten without taking a snapshot first. If the cloud backup was stale, your current work was permanently lost.
All three are fixed. The principle: grant privilege last, revoke privilege first.
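The ordering principle can be illustrated without TLA+. The Go sketch below is a hand-rolled stand-in, not the actual spec: it models only two facts (firewall active, agent launchable), treats the point after every step as a possible interruption, and checks the invariant "launchable implies firewall active" at each one. All type and function names are hypothetical.

```go
package main

import "fmt"

// state tracks the two facts the invariant cares about.
type state struct {
	firewallActive bool
	launchable     bool
}

// step is one atomic setup or rollback action.
type step struct {
	name  string
	apply func(*state)
}

var (
	enableFirewall  = step{"enable firewall", func(s *state) { s.firewallActive = true }}
	installSudoers  = step{"install sudoers", func(s *state) { s.launchable = true }}
	removeSudoers   = step{"remove sudoers", func(s *state) { s.launchable = false }}
	disableFirewall = step{"disable firewall", func(s *state) { s.firewallActive = false }}
)

// firstViolation applies steps in order and checks the invariant after each
// one, as if the process could be interrupted there. It returns the name of
// the step that leaves the agent launchable without a firewall, or "" if no
// interruption point violates the invariant.
func firstViolation(start state, steps []step) string {
	s := start
	for _, st := range steps {
		st.apply(&s)
		if s.launchable && !s.firewallActive {
			return st.name
		}
	}
	return ""
}

func main() {
	fresh := state{}
	installed := state{firewallActive: true, launchable: true}

	fmt.Printf("setup, privilege last:     %q\n",
		firstViolation(fresh, []step{enableFirewall, installSudoers}))
	fmt.Printf("setup, privilege first:    %q\n",
		firstViolation(fresh, []step{installSudoers, enableFirewall}))
	fmt.Printf("rollback, revoke first:    %q\n",
		firstViolation(installed, []step{removeSudoers, disableFirewall}))
	fmt.Printf("rollback, firewall first:  %q\n",
		firstViolation(installed, []step{disableFirewall, removeSudoers}))
}
```

With only two booleans the check is trivial by hand; a model checker earns its keep once the state machine has enough steps and failure points that the interleavings can no longer be enumerated mentally.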
How it compares
| | Built-in sandbox | Agent Safehouse | SandVault | nono | Hazmat |
|---|---|---|---|---|---|
| Separate user account | — | — | ✓ | — | ✓ |
| Seatbelt / kernel sandbox | ✓ | ✓ | ✓ | ✓ | ✓ |
| Credential path deny | — | partial | — | — | ✓ |
| Network firewall (pf) | — | — | — | — | ✓ |
| DNS blocklist | — | — | — | — | ✓ |
| Supply chain hardening | — | — | — | — | ✓ |
| Backup / rollback | — | — | — | ✓ | ✓ |
| Agent-agnostic | — | ✓ | ✓ | ✓ | ✓ |
| TLA+ verified | — | — | — | — | ✓ |
What this doesn't solve
I don't want to oversell this. Hazmat is OS-level containment, not a VM.
Seatbelt has known escape paths. Apple's SBPL is undocumented and there are mach service vectors. It prevents accidents and blocks credential access. It won't stop a determined adversary with a kernel exploit.
HTTPS exfiltration to novel domains is not blocked. The agent can curl anything on port 443. User isolation is the real defense here: don't give it the data in the first place.
macOS only. The containment primitives (sandbox-exec, dscl, pfctl) are macOS-specific. A Linux port would use different primitives (namespaces, seccomp, nftables) but the same architecture.
The full threat model is in threat-matrix.md. The design assumptions and every non-obvious tradeoff are in design-assumptions.md.
If you need stronger isolation, the repo documents the full VM path.
Try it
```sh
brew install dredozubov/tap/hazmat
hazmat init
hazmat claude    # or: hazmat opencode, hazmat exec <anything>
```
The repo is at github.com/dredozubov/hazmat. MIT licensed, written in Go, works with any terminal-based AI agent.
I spent months choosing between security and productivity. Turns out the answer was making them the same thing.
I'm Denis Redozubov. I spent 8 years as CTO of a Haskell consultancy, and now I'm building on my own. This is one of several projects I'm documenting at codeofchange.io.