Trust boundaries
Primary boundary: AI agent to shell
The core trust boundary sits between the AI coding agent and the host shell. CC Safety Net is the gatekeeper.

- Untrusted side: command strings generated by AI agents. These are treated as potentially hostile because agents can be manipulated via prompt injection, confused context, or adversarial instructions into producing destructive commands.
- Trusted side: the host shell where commands would execute.
Secondary boundaries
Four secondary boundaries cross into CC Safety Net from an external source. Each one is validated before it can influence analysis.

| Boundary | Source | How it is validated |
|---|---|---|
| User configuration | Custom rules in JSON files on disk | Parsed and schema-validated; malformed rules fail closed |
| Rulebook sources | Rulebooks fetched from GitHub or local directories | Remote rulebooks integrity-checked via SHA-256 digests in the lockfile |
| Hook input JSON | Each agent’s JSON payload on stdin | Parsed defensively; malformed JSON triggers a deny |
| Environment variables | Mode flags and path overrides (CC_SAFETY_NET_*, TMPDIR, etc.) | Read explicitly; security-critical values treated as untrusted |
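
The rulebook integrity check in the table can be sketched as follows. This is a minimal illustration under stated assumptions, not CC Safety Net's actual implementation: the `LockEntry` shape and the `sha256Hex` and `verifyRulebook` names are hypothetical, and the real lockfile format may differ. Only Node's built-in `node:crypto` module is used.

```typescript
import { createHash } from "node:crypto";

// Hypothetical lockfile entry; the real lockfile format may differ.
interface LockEntry {
  source: string; // where the rulebook was fetched from
  sha256: string; // expected hex digest of the rulebook content
}

type VerifyResult = { ok: true } | { ok: false; reason: string };

// Hex-encoded SHA-256 digest of the rulebook text.
function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Fail closed: a missing lock entry or a digest mismatch rejects the rulebook.
function verifyRulebook(content: string, entry: LockEntry | undefined): VerifyResult {
  if (entry === undefined) {
    return { ok: false, reason: "no lockfile entry for rulebook" };
  }
  const actual = sha256Hex(content);
  if (actual !== entry.sha256.toLowerCase()) {
    return { ok: false, reason: `digest mismatch for ${entry.source}` };
  }
  return { ok: true };
}
```

In this sketch a rulebook is loaded only when `verifyRulebook` returns `{ ok: true }`; any other outcome is treated as a rejected source.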
Fail-closed enforcement
Fail-closed is the foundational safety property: when analysis fails, config is invalid, or input cannot be parsed, the command is blocked rather than allowed. It is enforced at every entry point.
Hook entry points
The hook adapter wraps the analysis call in a try/catch. If analysis throws, the hook emits a deny decision with a “failed closed” reason instead of letting the command proceed. This applies to every stdin-based hook agent (Claude Code, Gemini CLI, Copilot CLI, Kimi Code).
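
The try/catch pattern described above can be sketched as follows. This is a simplified illustration, not the project's adapter code: the `Decision` type, the `runHook` and `extractCommand` names, and the `command` payload field are all hypothetical, and each agent's real hook protocol has its own input and output shape. Treating a payload without a command as an error is also an assumption of this sketch.

```typescript
// Hypothetical decision shape; real hook protocols differ per agent.
type Decision =
  | { decision: "allow" }
  | { decision: "deny"; reason: string };

type Analyzer = (command: string) => Decision;

// Pull the command string out of the parsed payload.
// The "command" field name is an assumption made for this sketch.
function extractCommand(payload: unknown): string {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("hook input is not a JSON object");
  }
  const value = (payload as Record<string, unknown>)["command"];
  if (typeof value !== "string") {
    throw new Error("hook input has no command string");
  }
  return value;
}

// Any failure (bad JSON, unexpected shape, analyzer crash) becomes a deny.
function runHook(rawStdin: string, analyze: Analyzer): Decision {
  try {
    const payload: unknown = JSON.parse(rawStdin);
    return analyze(extractCommand(payload));
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { decision: "deny", reason: `failed closed: ${detail}` };
  }
}
```

The point of the design is that there is no code path where an exception leaves the function without a decision, so an analysis bug cannot silently turn into an allow.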
Plugin and extension entry points
The OpenCode plugin and Pi extension apply the same pattern — analysis errors are caught and re-surfaced as block messages so the platform treats them as denied commands.
Strict mode
Strict mode extends fail-closed to commands the shell parser cannot safely tokenize, so unparseable input is blocked rather than passed through.
Secret redaction
Before any command or segment text is written to the audit log or returned to the agent, it passes through automatic secret redaction. The redactor scrubs PEM private keys, database URL environment variables, generic secret-bearing env assignments, common secret HTTP headers, URL credentials, and known provider token prefixes (GitHub, Slack, npm, Stripe, PyPI), plus JWTs and AWS access key IDs. Each matched value is replaced with `<redacted>`.
Redaction is conservative and pattern-based — it reduces the risk of leaking secrets that happen to appear in a command’s arguments, but it is not exhaustive. New secret formats emerge regularly, so avoid piping real credentials through commands an agent runs. See the Audit Log reference for the full redaction scope.
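
To make the pattern-based approach concrete, here is a small sketch covering three of the categories listed above. The regular expressions, the `redactSketch` name, and the rule table are illustrative assumptions, not the project's actual pattern list, and they are deliberately narrower than a production redactor would be.

```typescript
const REDACTED = "<redacted>";

interface RedactionRule {
  name: string;
  regex: RegExp;
  replacement: string;
}

// Illustrative subset only; these patterns are assumptions for this sketch.
const RULES: RedactionRule[] = [
  {
    // scheme://user:password@host keeps the scheme and host, drops the userinfo
    name: "url-credentials",
    regex: /\b([a-z][a-z0-9+.-]*:\/\/)[^\s\/@:]+:[^\s\/@]+@/gi,
    replacement: `$1${REDACTED}@`,
  },
  {
    // AWS access key IDs: a fixed prefix followed by 16 uppercase alphanumerics
    name: "aws-access-key-id",
    regex: /\b(?:AKIA|ASIA)[0-9A-Z]{16}\b/g,
    replacement: REDACTED,
  },
  {
    // NAME=value where NAME looks secret-bearing; keeps the name, drops the value
    name: "secret-env-assignment",
    regex: /\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY)[A-Z0-9_]*=)\S+/g,
    replacement: `$1${REDACTED}`,
  },
];

// Apply every rule in order and return the scrubbed text.
function redactSketch(text: string): string {
  return RULES.reduce((acc, rule) => acc.replace(rule.regex, rule.replacement), text);
}
```

A command with no matching pattern passes through unchanged, which is also why this style of redaction cannot catch secret formats it has no rule for.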
Attack surface
The threat model enumerates the main attack surfaces and their mitigations.

| Attack surface | What an attacker tries | Mitigation |
|---|---|---|
| Shell command parser | Craft a command string that exploits a parser edge case (unusual quoting, nested substitution, operator ambiguity) to hide a destructive payload | Unclosed-quote guard returns the raw string as one segment; variable references are preserved (not expanded) so dynamic substitutions can be detected; strict mode blocks unparseable commands; parser errors trigger fail-closed |
| Wrapper and interpreter stripping | Hide a destructive command behind sudo, env, bash -c, or an interpreter one-liner | Wrappers are stripped iteratively (with an iteration cap); shell wrappers and interpreter code are recursively re-analyzed up to 10 levels |
| Path traversal in rm analysis | Slip a dangerous rm -rf target past classification using symlinks or path tricks | Targets are resolved to canonical paths; $TMPDIR overrides pointing outside known temp dirs are detected; a residual TOCTOU window remains (see Known Limitations) |
| Rulebook supply chain | Serve a malicious rulebook from a GitHub source | Remote rulebooks are SHA-256-verified against the lockfile and schema-validated; a malicious rulebook can add rules but cannot remove built-in blocking |
| Secret leakage in audit logs | Get a secret written to the on-disk audit log | redactSecrets runs before any log write; the pattern list is maintained incrementally |
| Hook input parsing | Crash the hook with malformed JSON | JSON.parse failures trigger a deny rather than a crash; platform adapters perform additional validation |
| Audit log path traversal | Craft a session ID that writes outside the logs directory | The session ID is sanitized to a filesystem-safe form, length-capped, and rejects . and .. |
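
As one example of the mitigations in the table, the session ID handling in the last row could look roughly like the sketch below. The allowed character set, the 128-character cap, and the `sanitizeSessionId` name are assumptions chosen for illustration; the source states only that the ID is made filesystem-safe, length-capped, and that `.` and `..` are rejected.

```typescript
// Assumed cap for this sketch; the real limit is not stated in the source.
const MAX_SESSION_ID_LENGTH = 128;

// Returns a single safe filename component, or null if the ID must be rejected.
function sanitizeSessionId(raw: string): string | null {
  // Replace anything outside a conservative allow-list, including path separators.
  const safe = raw.replace(/[^A-Za-z0-9_.-]/g, "_").slice(0, MAX_SESSION_ID_LENGTH);
  // Reject empty IDs and the two names that refer to directories.
  if (safe === "" || safe === "." || safe === "..") {
    return null;
  }
  return safe;
}
```

Because separators are replaced before the result is used, an input such as `../../etc/passwd` collapses to one harmless filename component instead of walking out of the logs directory.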
Severity calibration
Findings are classified to keep the response proportional to impact:

- Critical: a vulnerability that lets a destructive command bypass all analysis and execute (for example, a shell-parser bypass for rm -rf /).
- High: a vulnerability that weakens the boundary or allows partial bypass under specific conditions (for example, a git-analysis bypass for unusual option ordering).
- Medium: limited impact on the core function or secondary features (for example, audit-log path traversal or a redaction bypass for a specific format).
- Low: minimal security impact (for example, information disclosure in diagnostic output).
Related pages
- Security policy — how to report a vulnerability responsibly.
- Design Principles — the rationale behind fail-closed and semantic analysis.
- Known Limitations — residual risks like the symlink TOCTOU window.
- Audit Log — where redacted command records are written.