Beekeeper Docs

Security

Beekeeper's security posture and its known gaps, presented together.

Beekeeper runs with your full filesystem and network privileges, so it documents both what it defends and what it does not. For the complete threat model, see docs/THREAT-MODEL.md in the repository.

Security posture

Corroboration model

Block decisions require a second factor. By default:

Sources flagging a package    Action
1 source                      warn: investigate, do not block
2 sources                     block
3+ sources                    block + quarantine

An adversary controlling a single catalog source can generate warnings but cannot escalate to a block without a second independent source, much as two-factor authentication requires a second factor. Beekeeper also applies per-severity thresholds (a critical match can escalate at a lower count), bounded by an all-versions-wildcard guard and degraded-source suppression.
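The default thresholds above can be sketched as a small decision function. This is an illustration only: the type and function names are hypothetical, the "no action" result for an unflagged package is an assumption, and per-severity thresholds, the wildcard guard, and degraded-source suppression are not modeled.

```go
package main

import "fmt"

// Action is the default enforcement outcome for a flagged package.
type Action string

const (
	NoAction        Action = "no action" // assumption: nothing flagged, nothing to do
	Warn            Action = "warn"
	Block           Action = "block"
	BlockQuarantine Action = "block + quarantine"
)

// decide counts distinct source names, so one source reporting twice
// still counts once, then applies the default thresholds.
func decide(flaggingSources []string) Action {
	distinct := make(map[string]struct{})
	for _, s := range flaggingSources {
		distinct[s] = struct{}{}
	}
	switch n := len(distinct); {
	case n == 0:
		return NoAction
	case n == 1:
		return Warn
	case n == 2:
		return Block
	default:
		return BlockQuarantine
	}
}

func main() {
	fmt.Println(decide([]string{"feed-a"}))
	fmt.Println(decide([]string{"feed-a", "feed-a"}))
	fmt.Println(decide([]string{"feed-a", "feed-b"}))
	fmt.Println(decide([]string{"feed-a", "feed-b", "feed-c"}))
}
```

Counting distinct names rather than raw reports is what keeps a single noisy source from corroborating itself.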

Fail-closed defaults

Every enforcement path (the beekeeper check hook, the MCP gateway, and the Sentry) fails closed. A crash, timeout, oversized input, or missing/corrupt index produces a block, not an allow. Opting out (fail_mode: open) is explicit and documented as reducing security.
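One way to picture the fail-closed rule is a wrapper that returns an allow only when the check finishes in time, without error or panic, and explicitly says allow. The sketch below is not Beekeeper's code: checkFunc, failClosed, and the maxInput cap are hypothetical names and values chosen for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

type Verdict string

const (
	Allow Verdict = "allow"
	Block Verdict = "block"
)

// checkFunc is a hypothetical stand-in for a real policy evaluation.
type checkFunc func(ctx context.Context, input []byte) (Verdict, error)

// maxInput is an assumed size cap, chosen only for illustration.
const maxInput = 1 << 20

type result struct {
	verdict Verdict
	err     error
}

// failClosed maps oversized input, timeout, error, and panic to Block.
func failClosed(check checkFunc, input []byte, timeout time.Duration) Verdict {
	if len(input) > maxInput {
		return Block
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan result, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- result{Block, fmt.Errorf("check panicked: %v", r)}
			}
		}()
		v, err := check(ctx, input)
		done <- result{v, err}
	}()

	select {
	case r := <-done:
		if r.err == nil && r.verdict == Allow {
			return Allow
		}
		return Block
	case <-ctx.Done():
		return Block
	}
}

func allowCheck(context.Context, []byte) (Verdict, error) { return Allow, nil }

func errorCheck(context.Context, []byte) (Verdict, error) {
	return Allow, errors.New("index missing or corrupt")
}

func panicCheck(context.Context, []byte) (Verdict, error) { panic("crash") }

// slowCheck outlives the demo timeout and reports the cancellation.
func slowCheck(ctx context.Context, _ []byte) (Verdict, error) {
	select {
	case <-time.After(time.Second):
		return Allow, nil
	case <-ctx.Done():
		return Block, ctx.Err()
	}
}

// run applies failClosed with a short demo timeout.
func run(check checkFunc) Verdict {
	return failClosed(check, []byte("npm install left-pad"), 50*time.Millisecond)
}

func main() {
	fmt.Println(run(allowCheck), run(errorCheck), run(panicCheck), run(slowCheck))
}
```

The design point is that Allow has exactly one path, and every other branch falls through to Block.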

Sensitive-path enforcement (SPATH)

Beekeeper blocks agent reads (and shell-redirect writes) of credential paths outside the working directory: ~/.ssh, ~/.aws, ~/.cargo/credentials, .env globs, and editor MCP config directories, with normalization for Windows alternate data streams and trailing-dot tricks. The block is merged most-restrictive-wins, so a package_allowlist allow can never downgrade a credential-read block.
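The most-restrictive-wins merge can be illustrated by ordering decisions by severity and taking the maximum. This is a minimal sketch with hypothetical names; the three-level ordering is an assumption, since the page names only allow and block for this merge.

```go
package main

import "fmt"

// Decision values are ordered from least to most restrictive.
type Decision int

const (
	Allow Decision = iota
	Warn
	Block
)

func (d Decision) String() string {
	switch d {
	case Allow:
		return "allow"
	case Warn:
		return "warn"
	default:
		return "block"
	}
}

// merge returns the most restrictive decision among all layers,
// so no allow can downgrade a block from another layer.
func merge(first Decision, rest ...Decision) Decision {
	result := first
	for _, d := range rest {
		if d > result {
			result = d
		}
	}
	return result
}

func main() {
	allowlist := Allow // a package_allowlist entry permits the package
	spath := Block     // the same call reads a credential path
	fmt.Println(merge(allowlist, spath))
}
```

Because the merge is a maximum, the order in which layers are evaluated does not change the outcome.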

Self-protection

Because the agent runs as the file owner, OS permissions alone cannot stop it from tampering with Beekeeper. The tool-call hook is the layer that can, so beekeeper check blocks the agent from:

  • reading or writing the state directory (config, policies, audit, catalogs, quarantine);
  • overwriting the running binary;
  • removing Beekeeper's own hook entry (content-aware: edits to other hooks pass);
  • invoking Beekeeper's mutating subcommands via Bash (config set, hooks install/uninstall, protect install/uninstall).

Human channels (editing files directly, running beekeeper in a terminal, beekeeper dashboard --admin) are unaffected; there is deliberately no in-band agent bypass.

beekeeper-self catalog

Before any enforcement command runs, and after every catalogs sync, Beekeeper checks whether the running binary's version appears in a separately-hosted, separately-keyed (Ed25519) beekeeper-self compromise feed. On a match it self-quarantines and refuses to run enforcement, while diagnostic commands (beekeeper version, beekeeper diag, beekeeper selftest, beekeeper policy validate) stay runnable. Only a bad signature fails closed; a network outage warns and continues.
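The four outcomes described above (clean, listed, unreachable, bad signature) can be sketched with Go's standard crypto/ed25519 package. The feed format here, one version per line, is an assumption made for the example, and evaluate and Status are hypothetical names rather than Beekeeper's API.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
	"strings"
)

type Status string

const (
	Clean        Status = "clean: run enforcement"
	Compromised  Status = "listed: self-quarantine"
	Unavailable  Status = "feed unreachable: warn and continue"
	BadSignature Status = "bad signature: fail closed"
)

// evaluate checks transport first, then the signature, and only
// then trusts the payload enough to look up the running version.
func evaluate(pub ed25519.PublicKey, payload, sig []byte, fetchErr error, version string) Status {
	if fetchErr != nil {
		return Unavailable
	}
	if !ed25519.Verify(pub, payload, sig) {
		return BadSignature
	}
	// Assumed format: one compromised version per line.
	for _, line := range strings.Split(string(payload), "\n") {
		if strings.TrimSpace(line) == version {
			return Compromised
		}
	}
	return Clean
}

// demo signs a sample feed with a throwaway key and evaluates four cases.
func demo() ([]Status, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	payload := []byte("1.2.9\n1.3.0\n")
	sig := ed25519.Sign(priv, payload)
	tampered := []byte("1.2.9\n") // entry removed after signing
	return []Status{
		evaluate(pub, payload, sig, nil, "1.3.1"),
		evaluate(pub, payload, sig, nil, "1.3.0"),
		evaluate(pub, tampered, sig, nil, "1.3.0"),
		evaluate(pub, nil, nil, errors.New("network down"), "1.3.0"),
	}, nil
}

func main() {
	statuses, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	for _, s := range statuses {
		fmt.Println(s)
	}
}
```

The tampered case shows why the signature check precedes the lookup: an attacker who strips an entry from the feed gets a fail-closed result, not a clean one.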

Behavioral detection (Sentry, opt-in)

The Sentry is a privileged, opt-in behavioral monitor (beekeeper protect install), running on Linux (eBPF and fanotify), macOS (eslogger), and Windows (ETW). It correlates process, file, and network events into a small set of exfiltration-pattern rules, SENTRY-001 through SENTRY-008: credential-file clusters (001), credential-CLI bursts (002), first-outbound phone-home (003), fresh-extension correlation (004), exfiltration-signature fusion (005), an agent-CLI credential cluster (006), a generalized exfil fusion with no fresh-extension precondition (007), and persistence-location writes (008).

Rules fire only for processes descended from an editor (code/cursor/windsurf/codium) or a known agent CLI (claude/codex/cursor-agent/gemini/copilot/qwen/aider/opencode/hermes), so a standalone-terminal agent is in scope, not only an editor extension. File-write events are ingested on all three platforms; DNS query events are ingested on Linux (eBPF) and Windows (ETW). The Sentry is detection-only: it writes audit records, it does not quarantine or kill (extension containment lives in the unprivileged watch/scan layer). See the known gaps below for what it does not yet correlate.

First-responder loop: sync-hit cross-reference, reversible quarantine, and Sentry targeted trace

When the background catalog sync produces a delta, Beekeeper runs a read-only cross-reference of the installed package inventory (npm, pip, cargo, and others) against the freshly-synced index. A match is audited as a scan hit. The cross-reference is strictly read-only: it never removes, disables, or edits a package.

Type-aware reversible quarantine

Quarantine works for both editor extensions and language packages. A quarantine is a reversible directory move plus a restore manifest (os.Rename with a JSON sidecar describing the artifact's original path, ecosystem, and reason). It is fully restorable at any time.
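A move-plus-manifest quarantine of this shape can be sketched as follows. The manifest field names, the ".manifest.json" suffix, and the function names are assumptions for illustration; the restore-path validation described under move-source safety is deliberately omitted here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// manifest is a hypothetical sidecar layout.
type manifest struct {
	OriginalPath string `json:"original_path"`
	Ecosystem    string `json:"ecosystem"`
	Reason       string `json:"reason"`
}

// quarantine writes the sidecar first, then moves the artifact, so a
// moved artifact is never left without the data needed to restore it.
func quarantine(src, quarantineDir, ecosystem, reason string) (string, error) {
	dst := filepath.Join(quarantineDir, filepath.Base(src))
	sidecar := dst + ".manifest.json"
	data, err := json.MarshalIndent(manifest{OriginalPath: src, Ecosystem: ecosystem, Reason: reason}, "", "  ")
	if err != nil {
		return "", err
	}
	if err := os.WriteFile(sidecar, data, 0o600); err != nil {
		return "", err
	}
	if err := os.Rename(src, dst); err != nil {
		os.Remove(sidecar) // best-effort cleanup of the orphaned sidecar
		return "", err
	}
	return dst, nil
}

// restore moves the artifact back to the path recorded in its sidecar.
// A real implementation must validate that path before using it.
func restore(dst string) error {
	sidecar := dst + ".manifest.json"
	data, err := os.ReadFile(sidecar)
	if err != nil {
		return err
	}
	var m manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return err
	}
	if err := os.Rename(dst, m.OriginalPath); err != nil {
		return err
	}
	return os.Remove(sidecar)
}

// roundTrip quarantines and restores a throwaway directory.
func roundTrip() error {
	root, err := os.MkdirTemp("", "quarantine-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	pkg := filepath.Join(root, "node_modules", "evil-pkg")
	qdir := filepath.Join(root, "quarantine")
	for _, d := range []string{pkg, qdir} {
		if err := os.MkdirAll(d, 0o700); err != nil {
			return err
		}
	}
	moved, err := quarantine(pkg, qdir, "npm", "flagged by 2 sources")
	if err != nil {
		return err
	}
	if _, err := os.Stat(pkg); !errors.Is(err, os.ErrNotExist) {
		return errors.New("package still active after quarantine")
	}
	if err := restore(moved); err != nil {
		return err
	}
	_, err = os.Stat(pkg)
	return err
}

func main() {
	fmt.Println("round trip error:", roundTrip())
}
```

Note that os.Rename generally fails across filesystems, so this sketch assumes the quarantine directory shares a volume with the artifact.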

Auto-quarantine (opt-in, dry-run by default)

A config knob (auto_quarantine) controls whether scan hits trigger an automatic quarantine move:

  • Default state: disabled and dry-run. A fresh install quarantines nothing automatically.
  • When enabled: if a scan hit reaches the corroboration threshold (default 2 independent sources, clamped to [1,3]), and the artifact's on-disk path is known, Beekeeper moves the artifact to the quarantine directory, writes an audit incident, and surfaces a TUI incident for human review. The threshold counts distinct independent catalog sources; a single source can only warn.
  • Path unknown: if the installed path cannot be resolved, a pending-quarantine audit record is written rather than guessing at the path.
  • Dry-run mode: when dry_run is true, the audit record is written but no file is moved. Auto-quarantine also starts in dry-run, so the knob requires two explicit changes to produce a real move: enabled: true and dry_run: false.
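The four cases in the list above reduce to a short decision function. This is a sketch under stated assumptions: the Threshold field name is hypothetical (the page names only enabled and dry_run), and the order in which the path-unknown and dry-run cases are checked is a guess.

```go
package main

import "fmt"

// AutoQuarantine mirrors the auto_quarantine knob.
type AutoQuarantine struct {
	Enabled   bool
	DryRun    bool
	Threshold int // hypothetical field name
}

// freshInstall returns the documented defaults: disabled and dry-run.
func freshInstall() AutoQuarantine {
	return AutoQuarantine{Enabled: false, DryRun: true, Threshold: 2}
}

// clampThreshold keeps the threshold inside [1,3].
func clampThreshold(n int) int {
	if n < 1 {
		return 1
	}
	if n > 3 {
		return 3
	}
	return n
}

type Outcome string

const (
	Nothing     Outcome = "no automatic action"
	Pending     Outcome = "pending-quarantine audit record"
	DryRunAudit Outcome = "audit record only (dry run)"
	Moved       Outcome = "moved to quarantine"
)

// plan decides what a scan hit triggers under the given config.
func plan(cfg AutoQuarantine, independentSources int, pathKnown bool) Outcome {
	switch {
	case !cfg.Enabled:
		return Nothing
	case independentSources < clampThreshold(cfg.Threshold):
		return Nothing
	case !pathKnown:
		return Pending
	case cfg.DryRun:
		return DryRunAudit
	default:
		return Moved
	}
}

func main() {
	cfg := freshInstall()
	fmt.Println(plan(cfg, 3, true)) // fresh install: nothing moves
	cfg.Enabled = true
	fmt.Println(plan(cfg, 2, true)) // enabled but still dry-run
	cfg.DryRun = false
	fmt.Println(plan(cfg, 2, true)) // both changes made: real move
}
```

The usage in main walks through the two explicit changes needed before a real move happens.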

See Configuration for the auto_quarantine block.

Purge is always human-gated

The reversible quarantine move is the automatic step. The destructive purge is never automatic. A catalog-quarantine incident surfaces in the TUI with a [P] purge (permanent) option and a [R] restore option. The CLI purge command requires a y/N confirmation. The separation is intentional: auto-quarantine removes a package from the active set and stops it running, while the irreversible step stays in human hands.

Move-source safety

The auto-quarantine move is bounded so a malicious package cannot weaponize it. The move source must be an absolute path, is refused if it is a symlink, junction, or reparse point (closing a redirection or time-of-check/time-of-use trick), and is refused if it names a system-critical root (a drive or filesystem root, C:\Windows, Program Files, or Beekeeper's own state and quarantine directories). A restore path read from a quarantine manifest is canonicalized and rejected for traversal, including Windows drive-relative, extended-length (\\?\), and alternate-data-stream forms, so a tampered manifest cannot become an arbitrary write. First-responder audit records are redacted before they are written, like every other audit producer. See docs/THREAT-MODEL.md §12 for the full trust-boundary analysis.
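A portable subset of the move-source checks might look like the sketch below. It covers only the absolute-path, filesystem-root, protected-directory, and symlink rules. Windows junctions and other reparse points, parent-directory symlinks, and manifest traversal checks need platform-specific handling that is not shown. validateMoveSource and the protected list are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// validateMoveSource refuses paths that are relative, a filesystem
// root, an exact match for a protected directory, or a symlink.
func validateMoveSource(path string, protected []string) error {
	if !filepath.IsAbs(path) {
		return errors.New("move source must be absolute")
	}
	clean := filepath.Clean(path)
	root := filepath.VolumeName(clean) + string(filepath.Separator)
	if clean == root {
		return errors.New("move source is a filesystem root")
	}
	for _, p := range protected {
		if clean == filepath.Clean(p) {
			return fmt.Errorf("move source %q is a protected directory", clean)
		}
	}
	// Lstat does not follow the final link, so a symlink is visible here.
	info, err := os.Lstat(clean)
	if err != nil {
		return err
	}
	if info.Mode()&os.ModeSymlink != 0 {
		return errors.New("move source is a symlink")
	}
	return nil
}

// demo reports, per case, whether the path was rejected.
// Cases: relative, root, symlink, protected, ordinary directory.
func demo() ([]bool, error) {
	tmp, err := os.MkdirTemp("", "move-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(tmp)
	pkg := filepath.Join(tmp, "pkg")
	state := filepath.Join(tmp, "state")
	link := filepath.Join(tmp, "link")
	for _, d := range []string{pkg, state} {
		if err := os.Mkdir(d, 0o700); err != nil {
			return nil, err
		}
	}
	if err := os.Symlink(pkg, link); err != nil {
		return nil, err
	}
	root := filepath.VolumeName(tmp) + string(filepath.Separator)
	protected := []string{state}
	var rejected []bool
	for _, p := range []string{"relative/pkg", root, link, state, pkg} {
		rejected = append(rejected, validateMoveSource(p, protected) != nil)
	}
	return rejected, nil
}

func main() {
	fmt.Println(demo())
}
```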

Catalog-to-Sentry targeted trace (detection-only)

A scan hit that reaches the corroboration threshold (the same default of two independent sources used for auto-quarantine) also records the flagged-and-installed artifact into a Sentry target list (sentry-targets.json); a single-source warn does not tighten anything. The Sentry daemon consults this list to tighten correlation on that artifact's process subtree: lower credential-access and credential-CLI thresholds, faster escalation. The target list is reloaded every 60 seconds, so new hits take effect without a daemon restart.

This is detection-only: no kill, no isolation, no network cut. It only changes which Sentry alerts fire and how quickly. The live OS-tap escalation (eBPF on Linux, ETW on Windows, eslogger on macOS) is validated in the CI platform matrix; the target-list logic itself is unit-tested on all platforms.

Confirmed-outcome corpus and the local catalog overlay

Beyond the live scan-hit path above, Beekeeper keeps a local record of what it has confirmed. Every incident is written as a four-layer event (behavior, decision, outcome, and context) to an append-only corpus (corpus/beekeeper-corpus.ndjson in the state directory), owner-only (0600) and run through the same redaction step as the audit log. The outcome layer (a confirmed true_label) is the part that cannot be reconstructed after the fact, so it is present from the first write and starts as unresolved.
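As an illustration of the append-only, owner-only corpus write, the sketch below appends one JSON object per line with mode 0600. The four layer names and the unresolved starting label come from the text above; the field types and everything else about the record shape are assumptions, and redaction is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Event is a hypothetical four-layer record.
type Event struct {
	Behavior string  `json:"behavior"`
	Decision string  `json:"decision"`
	Outcome  Outcome `json:"outcome"`
	Context  string  `json:"context"`
}

type Outcome struct {
	TrueLabel string `json:"true_label"`
}

// appendEvent adds one NDJSON line. O_APPEND means existing lines are
// never rewritten; 0o600 applies only when the file is first created.
func appendEvent(path string, e Event) (err error) {
	line, err := json.Marshal(e)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer func() {
		if cerr := f.Close(); err == nil {
			err = cerr
		}
	}()
	_, err = f.Write(append(line, '\n'))
	return err
}

// demoCorpus writes two events to a temp file and returns the line count.
func demoCorpus() (int, error) {
	dir, err := os.MkdirTemp("", "corpus-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "beekeeper-corpus.ndjson")
	for _, decision := range []string{"warn", "block"} {
		e := Event{
			Behavior: "install of flagged package",
			Decision: decision,
			Outcome:  Outcome{TrueLabel: "unresolved"}, // present from the first write
			Context:  "example",
		}
		if err := appendEvent(path, e); err != nil {
			return 0, err
		}
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strings.Count(string(data), "\n"), nil
}

func main() {
	fmt.Println(demoCorpus())
}
```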

An adjudication engine assigns that outcome off the hot path: it never runs inside the synchronous beekeeper check evaluation, only in the catalog-sync daemon, with a bounded batch pass on each beekeeper catalogs sync as the no-daemon fallback. Confidence is corroboration-gated on the same two-source bar as the rest of Beekeeper: a single source is watch weight, two or more independent sources are enforce weight.

When adjudication confirms a package malicious, the first responder does three local things, and deletes nothing:

  • arms the TUI quarantine card for any matching install present on the machine;
  • elevates the detection-only Sentry watch on that package's process subtree, gated at the same corroboration threshold (a single source does not tighten);
  • adds a local-only catalog overlay entry (owner-only) that survives beekeeper catalogs sync, so the next install of that package is caught immediately rather than waiting for the upstream feed to carry it.

Purge stays human-gated exactly as above; the corpus loop never auto-purges, in any configuration. Nothing leaves the machine: the corpus and the overlay are local files with no remote sink, and machine and repository identifiers are stored as non-reversible HMAC fingerprints. The push-envelope wire format that would let an organization or community share confirmed signatures is frozen, but no transport ships in this release — cross-machine sharing is a later milestone, not a current capability. See docs/THREAT-MODEL.md §13 for the local-loop trust boundary and the named residual gaps.
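A keyed, non-reversible fingerprint of the kind mentioned above can be produced with the standard crypto/hmac package. The page does not say which hash or key source Beekeeper uses, so HMAC-SHA-256 with a local key is an assumption here, and fingerprint is a hypothetical helper.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint returns a hex HMAC-SHA-256 of the identifier. Without
// the key, the identifier cannot be recovered or checked by guessing.
func fingerprint(key []byte, identifier string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(identifier))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	key := []byte("example-local-key") // illustrative only; use random key material
	fmt.Println(fingerprint(key, "github.com/example/repo"))
}
```

The same identifier always maps to the same fingerprint under one key, which keeps records joinable locally without storing the raw value.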

What is proven, what is not

The quarantine and Sentry paths are sound by design. The live OS-tap escalation triggered by a catalog hit is CI-validated, not yet red-team-proven with a purpose-built exploit in a production environment.

Not yet shipped (do not infer from this page that they are): destructive package-manager uninstall and lockfile rewrite; browser-extension and MCP-config quarantine (those artifact types are scanned but not yet in the auto-quarantine path).

LlamaFirewall prompt-injection scan (opt-in, experimental)

LlamaFirewall is an opt-in, experimental layer, off by default. When enabled it runs a supervised local Python sidecar that scores agent tool output with PromptGuard 2 (prompt-injection) and CodeShield (unsafe code). Inference is fully local: there is no API key and no third-party cloud. The earlier AlignmentCheck path, which would have sent agent context to a third-party cloud (Together AI), has been removed entirely.

Posture notes:

  • Gated model. PromptGuard 2 is a gated Hugging Face model. You must first accept the Llama license and run huggingface-cli login; beekeeper llamafirewall install then bootstraps a CPU-only venv (no CUDA wheels) and pre-pulls the 22M model into a pinned cache under the state directory. Until that is done the sidecar has nothing to load. This setup is per operator and one-time per machine, not something Beekeeper can do once for everyone: Meta's Llama license must be accepted by each user individually and the gated weights cannot be redistributed, so Beekeeper ships no model and bundles no token. Your Hugging Face token stays on your machine; Beekeeper never sees or transmits it.
  • Non-blocking by default. The injection scan runs on the PostToolUse hook as a forensic signal; it does not block the tool call, and it is not "on for every tool call".
  • Fail-closed on crash. A sidecar crash, missing model, or scan error is treated as a block (never a silent clean) unless you explicitly set fail_mode: open.
  • Local IPC. The Go supervisor and the sidecar talk over loopback TCP with a per-launch bearer token, so another local process cannot drive the scanner.
  • CI-only end-to-end. The real-sidecar checks (benign, injection, unsafe-code, crash-fail-closed) run only in a gated CI job that has accepted the Llama license; they are not part of the default test suite.
  • Native Windows is unsupported for the sidecar. CodeShield depends on semgrep, which has no native Windows build, so beekeeper llamafirewall install cannot complete on native Windows. Use WSL or a Linux/macOS host. The gated prompt-injection model itself is platform-neutral; the limitation is CodeShield's semgrep dependency.

Build hardening

Reproducible builds (-trimpath -buildvcs=false -mod=readonly), keyless Sigstore/cosign signing, SLSA Level 3 provenance, and a CycloneDX SBOM. See Installation for the verification commands.

Known gaps

These are documented so you don't develop false confidence. None of them relax the fail-closed enforcement path; most are detection-coverage or configuration-trust limits.

  • Hermes is fail-OPEN. Hermes ignores hook exit codes; a block is carried only by emitting {"action":"block","message":"..."} on stdout. Any timeout, crash, or non-JSON output makes Hermes allow the call. Prefer the MCP gateway for Hermes. (See Integration.)
  • Tier-3 native tools are UNGUARDED. Kilo and Trae have no upstream pre-exec hook; only MCP tools routed through the gateway are intercepted. Their native Bash/file/shell tools bypass Beekeeper entirely.
  • Only Claude Code is live-verified. The other 16 harnesses are implemented against documented contracts and contract-shape tested, but not verified against a running harness in CI.
  • Gateway remote-bind exposure. Binding --bind 0.0.0.0 exposes the policy-decision proxy over plain HTTP (the bearer token travels in cleartext). The CLI help text promises an allow_remote_gateway config gate, but that gate is not implemented; --bind flows straight to net.Listen. Do not bind the gateway to a non-loopback interface.
  • Project config can relax fail-closed. A project-layer .beekeeper/config.json with {"fail_mode":"open"} is honored and turns every fail-closed net into fail-open for that working tree (see Configuration).
  • Windsurf fail-OPEN on non-2 exit; OpenCode subagent gap. Windsurf only honors exit code 2; OpenCode's plugin does not intercept subagent task calls (issue #5894).
  • Unlisted package managers. deno, mvn, and nuget parse as "no package identified" and are allowed by default; the Sentry behavioral layer is the second signal there. Command chaining (&&, ||, ;, |, &, newlines) and leading environment-variable assignments (cd x && FOO=bar npm install evil) ARE handled by internal/pkgparse (it splits on shell separators, honoring quotes, and strips leading env assignments), so they are not a bypass.
  • Catalog poisoning (coordinated). An adversary controlling 2+ sources can manufacture false-positive blocks to coerce a user into disabling enforcement. Sanity bounds and audit provenance are partial mitigations.
  • Bumblebee signature is a presence check (TM-B-02). In the live decision path, a Bumblebee entry's "signed" status is a non-empty-field check, not full Ed25519 verification (only beekeeper-self is cryptographically verified). Tracked for remediation.
  • Linux fanotify mmap gap. Libraries mmap-loaded before the Sentry's watch was placed are not re-intercepted.
  • Windows Sentry missing PPID. File/network ETW events carry no parent-PID, so a short-lived child can lose editor-descendant attribution (detection-coverage only; enforcement unaffected).
  • DNS is ingested but not correlated. DNS query events are captured on Linux and Windows, but no Sentry rule consumes them yet, so DNS-tunnel exfiltration is ingested but not detected. SENTRY-003 (first-outbound) has no destination allowlist and cannot identify the endpoint.
  • No process-memory event source. /proc/<pid>/maps-style secret scraping has no event source and is undetected.
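For the gateway remote-bind gap in the list above, a launcher or wrapper script could refuse non-loopback bind values itself until a gate exists. The sketch below is not part of Beekeeper: isLoopbackHost is a hypothetical helper, it assumes the bind value is a bare host (as in the 0.0.0.0 example), and treating the name "localhost" as loopback is a further assumption.

```go
package main

import (
	"fmt"
	"net"
)

// isLoopbackHost reports whether a bind host stays on the local machine.
// An empty host or an unparseable value is treated as not loopback.
func isLoopbackHost(host string) bool {
	if host == "localhost" {
		return true // assumption: resolves to a loopback address
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, h := range []string{"127.0.0.1", "::1", "localhost", "0.0.0.0", "192.168.1.10", ""} {
		fmt.Printf("%-14q loopback=%v\n", h, isLoopbackHost(h))
	}
}
```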

Not enforced in this release: release_age, lifecycle_script_allowlist

release_age (minimum package age) and lifecycle_script_allowlist rules in policy files are not enforced by the policy overlay in v1.3.0; they require metadata not present in a pure tool call. They are informational / dry-run only. See Configuration.

How this is validated

Beekeeper's coverage claims are auditable, not asserted. Validation is split into three tiers:

  • Tier A (locally testable) is held at full coverage by a coverage gate: every production Go file is either covered by a test or carries a reason-coded, fail-closed no-test allowlist entry, so the coverage claim cannot be silently weakened. A 17-harness conformance suite golden-file-tests every installer config and per-harness deny contract.
  • Tier B (platform-bound) runs in a cross-platform CI matrix: two Linux kernels, macOS, and Windows, exercising eBPF, eslogger, ETW, Unix peer-cred auth, and three cross-compiled targets. Five fuzz targets (policy engine, IPC parser, catalog parser, MCP parser, and the Sentry rule evaluator) run as a blocking release gate.
  • Tier C (irreducibly manual) is a signed validation register: each of the 16 non-Claude-Code harnesses and the gated-model sidecar end-to-end has a written live-block procedure, an expected result, and a sign-off line, so "fully validated" is a checklist you can read.

The tier model and the register live in docs/validation-posture.md and docs/validation-register.md in the repository.

The exit-2 deny contract

Beekeeper signals a block by exiting 2, the one exit code agent harnesses treat as a deny rather than a generic hook error. An earlier internal design exited 1, which most harnesses interpret as a hook error and ignore, so a block was audited but the tool still ran. The shipped contract uses exit 2, plus the per-harness deny JSON (hookSpecificOutput for Claude Code, {"action":"block"} for Hermes), so the block actually takes effect. If you ever re-register hooks, beekeeper hooks install --target <harness> writes the correct contract.
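The two halves of the contract, a deny payload on stdout and exit code 2, can be sketched for the Hermes-style JSON quoted above. The struct and function names are hypothetical, and the Claude Code hookSpecificOutput shape is not reproduced because this page does not spell it out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exitDeny is the exit code harnesses treat as a deny.
const exitDeny = 2

// hermesDeny matches the {"action":"block","message":"..."} shape.
type hermesDeny struct {
	Action  string `json:"action"`
	Message string `json:"message"`
}

// denyJSON renders the deny payload for a given message.
func denyJSON(message string) string {
	data, err := json.Marshal(hermesDeny{Action: "block", Message: message})
	if err != nil {
		// Marshaling two plain strings should not fail; fall back to a
		// fixed payload so a block is still emitted.
		return `{"action":"block","message":"internal error"}`
	}
	return string(data)
}

func main() {
	fmt.Println(denyJSON("package flagged by 2 sources"))
	// A real hook would finish with os.Exit(exitDeny); the sketch only
	// reports the code so it can run without a non-zero exit.
	fmt.Println("would exit with code", exitDeny)
}
```

Emitting both signals covers harnesses that read only the exit code and those, like Hermes, that read only stdout.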
