CC Safety Net Security Model

CC Safety Net sits between an untrusted command source (an AI coding agent) and a trusted execution environment (the host shell). This page documents the trust boundaries, how fail-closed behavior is enforced, how secrets are protected, and the attack surface. To report a vulnerability, see the security policy instead.

Trust boundaries

Primary boundary: AI agent to shell

The core trust boundary sits between the AI coding agent and the host shell. CC Safety Net is the gatekeeper.

Untrusted side — command strings generated by AI agents. These are treated as potentially hostile because agents can be manipulated via prompt injection, confused context, or adversarial instructions into producing destructive commands.
Trusted side — the host shell where commands would execute.

Every command that reaches the shell tool on a supported platform flows through the analysis engine before it is allowed to run. If analysis returns a block reason, the command is denied.

Secondary boundaries

Four secondary boundaries cover inputs that enter CC Safety Net from external sources. Each input is validated before it can influence analysis.

Boundary	Source	How it is validated
User configuration	Custom rules in JSON files on disk	Parsed and schema-validated; malformed rules fail closed
Rulebook sources	Rulebooks fetched from GitHub or local directories	Remote rulebooks integrity-checked via SHA-256 digests in the lockfile
Hook input JSON	Each agent’s JSON payload on stdin	Parsed defensively; malformed JSON triggers a deny
Environment variables	Mode flags and path overrides (`CC_SAFETY_NET_*`, `TMPDIR`, etc.)	Read explicitly; security-critical values treated as untrusted

Fail-closed enforcement

Fail-closed is the foundational safety property: when analysis fails, config is invalid, or input cannot be parsed, the command is blocked rather than allowed. It is enforced at every entry point.

Hook entry points

The hook adapter wraps the analysis call in a try/catch. If analysis throws, the hook emits a deny decision with a “failed closed” reason instead of letting the command proceed. This applies to every stdin-based hook agent (Claude Code, Gemini CLI, Copilot CLI, Kimi Code).
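The wrapping pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's adapter: the `Decision` shape, the `Analyzer` signature, the `handleHookInput` and `extractCommand` names, and the `{ command: string }` payload layout are all assumptions made for the example, and real hook payloads differ per agent.

```typescript
// Hypothetical decision shape; the real hook output format differs per agent.
type Decision =
  | { decision: "allow" }
  | { decision: "deny"; reason: string };

// Hypothetical analyzer signature: returns a block reason, or null to allow.
type Analyzer = (command: string) => string | null;

// Assumed payload layout ({ command: string }); real payloads vary by agent.
function extractCommand(payload: unknown): string {
  if (typeof payload === "object" && payload !== null && "command" in payload) {
    const command = (payload as { command: unknown }).command;
    if (typeof command === "string") return command;
  }
  throw new Error("hook payload has no string command field");
}

function handleHookInput(rawStdin: string, analyze: Analyzer): Decision {
  try {
    // Malformed JSON throws here and is handled by the catch below.
    const payload: unknown = JSON.parse(rawStdin);
    const command = extractCommand(payload);
    const blockReason = analyze(command);
    return blockReason === null
      ? { decision: "allow" }
      : { decision: "deny", reason: blockReason };
  } catch (err) {
    // Any failure (parse error, bad payload, analyzer crash) becomes a deny.
    const detail = err instanceof Error ? err.message : String(err);
    return { decision: "deny", reason: `failed closed: ${detail}` };
  }
}
```

The point of the sketch is that the only path to an allow decision runs through a completed analysis that returned no block reason; every exceptional path ends in a deny.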

Plugin and extension entry points

The OpenCode plugin and Pi extension apply the same pattern — analysis errors are caught and re-surfaced as block messages so the platform treats them as denied commands.

Strict mode

Strict mode extends fail-closed to commands the shell parser cannot safely tokenize, so unparseable input is blocked rather than passed through.
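As a rough illustration of the difference strict mode makes, the sketch below uses a deliberately simplified tokenizer that only understands whitespace, quotes, and backslash escapes. The `tokenize` and `parseForAnalysis` names, the `ParseOutcome` type, and the reason string are hypothetical. The non-strict fallback (keeping the raw string as a single segment) follows the unclosed-quote guard mentioned in the attack surface table, but how the real parser wires this together is an assumption.

```typescript
// Simplified tokenizer: returns tokens, or null when a quote is left unclosed.
// It is a stand-in for a real shell parser, not a faithful one.
function tokenize(command: string): string[] | null {
  const tokens: string[] = [];
  let current = "";
  let inToken = false;
  let quote: string | null = null;

  for (let i = 0; i < command.length; i++) {
    const ch = command[i];
    if (quote !== null) {
      if (ch === quote) {
        quote = null; // closing quote
      } else if (ch === "\\" && quote === '"' && i + 1 < command.length) {
        current += command[++i]; // escaped character inside double quotes
      } else {
        current += ch;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      inToken = true; // an empty quoted string is still a token
    } else if (ch === "\\" && i + 1 < command.length) {
      current += command[++i];
      inToken = true;
    } else if (/\s/.test(ch)) {
      if (inToken) tokens.push(current);
      current = "";
      inToken = false;
    } else {
      current += ch;
      inToken = true;
    }
  }
  if (quote !== null) return null; // unparseable: unclosed quote
  if (inToken) tokens.push(current);
  return tokens;
}

type ParseOutcome =
  | { blocked: true; reason: string }
  | { blocked: false; tokens: string[] };

function parseForAnalysis(command: string, strict: boolean): ParseOutcome {
  const tokens = tokenize(command);
  if (tokens !== null) return { blocked: false, tokens };
  if (strict) {
    return { blocked: true, reason: "strict mode: command could not be tokenized" };
  }
  // Assumed non-strict fallback: keep the raw string as one segment so that
  // later checks still see the full text.
  return { blocked: false, tokens: [command] };
}
```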

Config validation

When rulebook loading or validation produces errors, a fail-closed reason is attached to the config. Downstream analysis consults that field, so a broken rulebook state results in blocking rather than silent rule skipping.
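One way to picture the fail-closed field is the sketch below. The `LoadedConfig` shape, the `failClosedReason` field name, and the substring-based rules are illustrative assumptions, not the project's actual schema. The example only shows how a validation error can be carried on the config object and checked before any rule is evaluated.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface LoadedConfig {
  rules: string[]; // simplified: each rule is a substring to block
  failClosedReason?: string; // set when loading or validation failed
}

function isRuleList(value: unknown): value is { rules: string[] } {
  if (typeof value !== "object" || value === null) return false;
  const rules = (value as { rules?: unknown }).rules;
  return Array.isArray(rules) && rules.every((r) => typeof r === "string");
}

function loadConfig(rawJson: string): LoadedConfig {
  try {
    const parsed: unknown = JSON.parse(rawJson);
    if (!isRuleList(parsed)) {
      return { rules: [], failClosedReason: "config failed schema validation" };
    }
    return { rules: parsed.rules };
  } catch {
    return { rules: [], failClosedReason: "config is not valid JSON" };
  }
}

function analyzeWithConfig(command: string, config: LoadedConfig): string | null {
  // Consult the fail-closed field first, so a broken config is never
  // silently treated as "no rules".
  if (config.failClosedReason !== undefined) return config.failClosedReason;
  const hit = config.rules.find((rule) => command.includes(rule));
  return hit === undefined ? null : `blocked by custom rule: ${hit}`;
}
```

In this sketch, even a harmless command such as `ls` is blocked while the config is broken, which is the trade-off fail-closed makes: a visible failure instead of silently skipped rules.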

See Design Principles for the rationale.

Secret redaction

Before any command or segment text is written to the audit log or returned to the agent, it passes through automatic secret redaction. The redactor scrubs PEM private keys, database URL environment variables, generic secret-bearing environment assignments, common secret-bearing HTTP headers, URL credentials, JWTs, AWS access key IDs, and tokens with known provider prefixes (GitHub, Slack, npm, Stripe, PyPI). Each matched value is replaced with `<redacted>`. Redaction is conservative and pattern-based: it reduces the risk of leaking secrets that happen to appear in a command's arguments, but it is not exhaustive. New secret formats emerge regularly, so avoid passing real credentials through commands an agent runs. See the Audit Log reference for the full redaction scope.
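To make the pattern-based approach concrete, here is a small sketch covering a handful of the categories listed above. The regular expressions are illustrative approximations written for this example, not the project's actual pattern list, and `redactSketch` is a hypothetical name chosen so it is not confused with the real `redactSecrets` function.

```typescript
const REDACTED = "<redacted>";

interface RedactionRule {
  pattern: RegExp;
  replacement: string;
}

// Illustrative subset of patterns; deliberately incomplete.
const RULES: RedactionRule[] = [
  // PEM private key blocks
  {
    pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
    replacement: REDACTED,
  },
  // Secret-bearing env assignments such as API_TOKEN=value (name is kept)
  {
    pattern: /\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY)[A-Z0-9_]*)=\S+/g,
    replacement: `$1=${REDACTED}`,
  },
  // URL credentials: scheme://user:password@host (scheme is kept)
  {
    pattern: /\b([a-z][a-z0-9+.-]*:\/\/)[^\s/@:]+:[^\s/@]+@/gi,
    replacement: `$1${REDACTED}@`,
  },
  // AWS access key IDs
  { pattern: /\b(?:AKIA|ASIA)[0-9A-Z]{16}\b/g, replacement: REDACTED },
  // GitHub token prefixes
  { pattern: /\bgh[pousr]_[A-Za-z0-9]{20,}\b/g, replacement: REDACTED },
];

// Applies every rule in order and returns the scrubbed text.
function redactSketch(text: string): string {
  return RULES.reduce(
    (acc, rule) => acc.replace(rule.pattern, rule.replacement),
    text,
  );
}
```

Because each rule is an independent pattern, anything that matches none of them passes through unchanged, which is why the text above describes redaction as risk reduction and not a guarantee.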

Attack surface

The threat model enumerates the main attack surfaces and their mitigations.

Attack surface	What an attacker tries	Mitigation
Shell command parser	Craft a command string that exploits a parser edge case (unusual quoting, nested substitution, operator ambiguity) to hide a destructive payload	Unclosed-quote guard returns the raw string as one segment; variable references are preserved (not expanded) so dynamic substitutions can be detected; strict mode blocks unparseable commands; parser errors trigger fail-closed
Wrapper and interpreter stripping	Hide a destructive command behind `sudo`, `env`, `bash -c`, or an interpreter one-liner	Wrappers are stripped iteratively (with an iteration cap); shell wrappers and interpreter code are recursively re-analyzed up to 10 levels
Path traversal in rm analysis	Slip a dangerous `rm -rf` target past classification using symlinks or path tricks	Targets are resolved to canonical paths; `$TMPDIR` overrides pointing outside known temp dirs are detected; a residual TOCTOU window remains (see Known Limitations)
Rulebook supply chain	Serve a malicious rulebook from a GitHub source	Remote rulebooks are SHA-256-verified against the lockfile and schema-validated; a malicious rulebook can add rules but cannot remove built-in blocking
Secret leakage in audit logs	Get a secret written to the on-disk audit log	`redactSecrets` runs before any log write; the pattern list is maintained incrementally
Hook input parsing	Crash the hook with malformed JSON	`JSON.parse` failures trigger a deny rather than a crash; platform adapters perform additional validation
Audit log path traversal	Craft a session ID that writes outside the logs directory	The session ID is sanitized to a filesystem-safe form and length-capped; `.` and `..` are rejected
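The wrapper-stripping row in the table above combines two bounds: an iteration cap on stripping and a depth limit on recursive re-analysis. The sketch below shows one way those bounds could interact. Only the depth limit of 10 comes from the table; the iteration cap value, the function names, the simplified `sudo` and `env` handling (no option parsing), the whitespace split of the `-c` argument, and the choice to fail closed when a bound is hit are all assumptions.

```typescript
const MAX_STRIP_ITERATIONS = 20; // hypothetical cap; the real value is not stated here
const MAX_RECURSION_DEPTH = 10; // from the table: re-analysis up to 10 levels

// Strips leading `sudo` and `env VAR=value ...` wrappers.
// Returns null when the iteration cap is reached before a plain command appears.
function stripWrappers(tokens: string[]): string[] | null {
  let rest = tokens;
  for (let i = 0; i < MAX_STRIP_ITERATIONS; i++) {
    if (rest[0] === "sudo") {
      rest = rest.slice(1);
    } else if (rest[0] === "env") {
      rest = rest.slice(1);
      // Drop the VAR=value assignments that follow `env`.
      while (rest.length > 0 && /^[A-Za-z_][A-Za-z0-9_]*=/.test(rest[0])) {
        rest = rest.slice(1);
      }
    } else {
      return rest;
    }
  }
  return null;
}

// Returns a block reason, or null when nothing destructive was found.
function findBlockReason(tokens: string[], depth = 0): string | null {
  if (depth > MAX_RECURSION_DEPTH) return "failed closed: nesting too deep";
  const inner = stripWrappers(tokens);
  if (inner === null) return "failed closed: too many wrappers";

  const isShell = inner[0] === "bash" || inner[0] === "sh";
  if (isShell && inner[1] === "-c" && inner[2] !== undefined) {
    // Naive whitespace split as a stand-in for the real shell parser.
    const nested = inner[2].split(/\s+/).filter(Boolean);
    return findBlockReason(nested, depth + 1);
  }
  // Toy destructive-command check, for illustration only.
  if (inner[0] === "rm" && inner.includes("-rf")) return "rm -rf detected";
  return null;
}
```

For example, `findBlockReason(["sudo", "env", "FOO=1", "bash", "-c", "rm -rf /tmp/x"])` peels off `sudo` and `env FOO=1`, recurses into the `bash -c` payload, and reports the `rm -rf` it finds there.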

Network-level attacks, denial of service via resource exhaustion, and attacks on the agent platform itself are out of scope. CC Safety Net makes no network requests during command analysis.

Severity calibration

Findings are classified to keep the response proportional to impact:

Critical — a vulnerability that lets a destructive command bypass all analysis and execute (for example a shell-parser bypass for `rm -rf /`).
High — a vulnerability that weakens the boundary or allows partial bypass under specific conditions (for example a git-analysis bypass for unusual option ordering).
Medium — limited impact on the core function or secondary features (for example audit-log path traversal or a redaction bypass for a specific format).
Low — minimal security impact (for example information disclosure in diagnostic output).

Related pages

Security policy — how to report a vulnerability responsibly.
Design Principles — the rationale behind fail-closed and semantic analysis.
Known Limitations — residual risks like the symlink TOCTOU window.
Audit Log — where redacted command records are written.
