Should AI agents run in a sandbox?
Quick answer
Often yes, but not always for the same reason.
AI agents should run in a sandbox when they can:
- execute code,
- browse untrusted pages,
- read or write files,
- call tools with side effects,
- or touch secrets, networks, and system resources.
The sandbox is there to contain execution risk. It does not decide what the agent should be allowed to do in the first place.
The wrong sandbox question
The weak question is:
“Do we trust the model?”
That is not the right control question.
The better question is:
“If this run goes wrong, what boundary stops it from becoming a larger systems incident?”
That is where sandboxing becomes useful.
When sandboxing is clearly required
Sandboxing is usually mandatory when the agent can:
- run generated code,
- inspect or transform local files,
- browse arbitrary websites,
- use developer tools,
- or operate inside engineering environments where accidental writes, secret exposure, or network access create real damage.
Without isolation, a single bad run can escalate into a much larger control failure.
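As a minimal sketch of this boundary, generated code can at least be pushed into a separate process with resource limits and a throwaway working directory. The function name `run_untrusted` and the specific limits below are illustrative, not a prescribed API, and this stdlib-only version bounds only CPU, memory, and file writes; real deployments add network and filesystem isolation at the container or kernel layer.

```python
import resource
import subprocess
import sys
import tempfile

def run_untrusted(code: str, timeout_s: int = 5) -> subprocess.CompletedProcess:
    """Run agent-generated Python in a child process with resource limits.

    POSIX-only sketch: containers, seccomp, or similar are still needed
    for real network and filesystem isolation.
    """
    def limit_resources() -> None:
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))     # CPU seconds
        resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20, 512 * 2**20))  # 512 MiB memory
        resource.setrlimit(resource.RLIMIT_FSIZE, (2**20, 2**20))           # 1 MiB file writes

    with tempfile.TemporaryDirectory() as scratch:      # throwaway working dir
        return subprocess.run(
            [sys.executable, "-I", "-c", code],         # -I: isolated interpreter mode
            cwd=scratch,
            env={},                                     # no inherited secrets
            capture_output=True,
            text=True,
            timeout=timeout_s,
            preexec_fn=limit_resources,                 # applied in the child, POSIX only
        )
```

Even this thin wrapper changes the failure mode: a runaway loop hits a CPU limit instead of the host, and nothing in the parent's environment leaks into the child.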
When lighter containment may be enough
Not every agent needs heavy execution isolation.
If the agent only:
- drafts text,
- summarizes evidence,
- routes requests,
- or proposes actions that still require separate human approval,
then narrower permission design and application-owned controls may matter more than a full sandbox runtime.
The right level of isolation depends on what the agent can actually touch.
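A sketch of that lighter, application-owned control: the agent only proposes actions, and nothing executes without separate human sign-off. The `ProposedAction` and `ApprovalQueue` names are hypothetical, just one way to shape a propose-then-approve flow.

```python
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    """An action the agent wants, held for separate human approval."""
    tool: str
    args: dict
    approved: bool = False

@dataclass
class ApprovalQueue:
    """Application-owned control: the agent proposes, a human disposes."""
    pending: list = field(default_factory=list)

    def propose(self, tool: str, args: dict) -> ProposedAction:
        action = ProposedAction(tool, args)
        self.pending.append(action)
        return action

    def execute_approved(self, handlers: dict) -> list:
        """Run only approved actions; unapproved ones stay queued."""
        results = [
            handlers[a.tool](**a.args) for a in self.pending if a.approved
        ]
        self.pending = [a for a in self.pending if not a.approved]
        return results
```

In this shape the permission design, not a runtime boundary, is what contains the agent: the model never holds the authority to act, only to ask.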
What sandboxing protects well
Sandboxing is strong when it limits:
- filesystem scope,
- network reach,
- process execution,
- credential exposure,
- and the blast radius of bad tool behavior.
It is especially valuable when the model can encounter untrusted input and then choose actions.
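Taking filesystem scope as one example, the essential check is that every path the agent asks for resolves inside an allowed root. The `confine` helper below is an illustrative application-level version; a real sandbox enforces the same boundary at the kernel or container layer rather than trusting application code alone.

```python
from pathlib import Path

def confine(requested: str, root: str) -> Path:
    """Resolve a path the agent asked for; refuse anything outside `root`."""
    root_path = Path(root).resolve()
    target = (root_path / requested).resolve()   # collapses ../ and symlink tricks
    if target != root_path and root_path not in target.parents:
        raise PermissionError(f"{requested!r} escapes the sandbox root")
    return target
```

The same pattern generalizes: enumerate what is reachable, resolve every request against that allowlist, and fail closed on anything that lands outside it.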
What sandboxing does not solve
Sandboxing does not solve:
- bad approval policy,
- broad business permissions,
- weak audit logs,
- unsafe user-scoped authority,
- or a workflow that should never have been autonomous.
Teams often overestimate sandboxing because it feels concrete. But a sandbox around an overpowered workflow is still an overpowered workflow.
The healthy operating pattern
The healthy production pattern is usually:
- narrow permissions first,
- sandbox execution second,
- approval and escalation for irreversible actions,
- logging and evaluation after every important run.
This is why sandboxing belongs inside a broader control plane, not as a standalone safety story.
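The layering above can be sketched as one dispatch function, with the sandbox as just one check among four. The tool names, `ALLOWED_TOOLS`/`IRREVERSIBLE` sets, and the `sandbox` callable are all illustrative placeholders for whatever the real control plane provides.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-runs")

ALLOWED_TOOLS = {"read_file", "search"}          # 1. narrow permissions first
IRREVERSIBLE = {"delete_record", "send_email"}   # 3. these need explicit approval

def run_tool(tool: str, args: dict, sandbox, approved: bool = False):
    """Dispatch a tool call through layered controls, failing closed."""
    if tool not in ALLOWED_TOOLS and tool not in IRREVERSIBLE:
        raise PermissionError(f"tool {tool!r} is not permitted")    # layer 1: permissions
    if tool in IRREVERSIBLE and not approved:
        raise PermissionError(f"{tool!r} requires human approval")  # layer 3: approval
    result = sandbox(tool, args)                                    # layer 2: sandboxed execution
    log.info("ran %s", json.dumps({"tool": tool, "args": args}))    # layer 4: audit log
    return result
```

Note that the sandbox call sits in the middle of the chain: permission and approval checks run before execution, and logging runs after, which is exactly the "not a standalone safety story" point.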
The practical rule
Use strong sandboxing when the agent can execute, browse, or mutate technical systems directly.
Use lighter isolation when the agent remains in read-only or draft-only lanes and business controls already contain the output.
If the team cannot explain the blast radius of a failed run, the system probably needs stronger isolation than it has.
Implementation checklist
Your sandboxing decision is probably healthy when:
- the team can describe exactly which resources the agent can touch;
- execution, filesystem, network, and secret boundaries are explicit;
- sandboxing is paired with approval and permission design;
- the logs can show what happened inside the boundary;
- and the team knows which actions should still be impossible even inside the sandbox.