Security
Beekeeper's security posture and its known gaps, presented together.
Beekeeper runs with your full filesystem and network privileges, so it documents
both what it defends and what it does not. For the complete threat model, see
docs/THREAT-MODEL.md in the repository.
Security posture
Corroboration model
Block decisions require a second factor. By default:
| Sources flagging a package | Action |
|---|---|
| 1 source | warn: investigate, do not block |
| 2 sources | block |
| 3+ sources | block + quarantine |
An adversary controlling a single catalog source can generate warnings but cannot
escalate to a block without a second independent source, analogous to two-factor
authentication. Beekeeper also applies per-severity thresholds (a critical
match can escalate at a lower count), protected by an all-versions-wildcard guard
and degraded-source suppression.
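The default table above can be read as a small decision function. The following is a minimal sketch, not Beekeeper's implementation: the function name, the action strings, and the "allow" result for zero sources are assumptions made for illustration, and per-severity thresholds are left out.

```python
def decide(flagging_sources):
    """Map the catalog sources that flag a package to a default action."""
    # Count distinct sources so one source reporting twice cannot
    # corroborate itself.
    count = len(set(flagging_sources))
    if count == 0:
        return "allow"  # assumption: nothing flagged the package
    if count == 1:
        return "warn"
    if count == 2:
        return "block"
    return "block+quarantine"
```

With hypothetical source names, `decide(["feed-a", "feed-a"])` still returns `"warn"`, because the duplicate report does not count as a second independent source.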
Fail-closed defaults
Every enforcement path (the beekeeper check hook, the MCP gateway, and the
Sentry) fails closed. A crash, timeout, oversized input, or missing/corrupt
index produces a block, not an allow. Opting out (fail_mode: open) is explicit
and documented as reducing security.
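As an illustration of the fail-closed idea, the sketch below wraps an arbitrary evaluator so that an exception, an oversized input, or an unrecognized verdict all resolve to a block. The function name, the `max_bytes` limit, and the verdict strings are hypothetical; timeouts are not modeled here.

```python
def evaluate_fail_closed(evaluate, payload, fail_mode="closed", max_bytes=1_000_000):
    """Run evaluate(payload); any failure becomes a block unless fail_mode is "open"."""
    on_failure = "allow" if fail_mode == "open" else "block"
    if len(payload) > max_bytes:
        return on_failure  # oversized input is treated as a failure
    try:
        verdict = evaluate(payload)
    except Exception:
        return on_failure  # a crash in the evaluator must not allow the call
    # Anything other than a well-formed verdict is also a failure.
    return verdict if verdict in ("allow", "block") else on_failure
```

The design point is that "allow" is only ever returned when the evaluator ran to completion and said so explicitly, or when the operator opted into `fail_mode: open`.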
Sensitive-path enforcement (SPATH)
Beekeeper blocks agent reads (and shell-redirect writes) of credential paths
outside the working directory: ~/.ssh, ~/.aws, ~/.cargo/credentials, .env
globs, and editor MCP config directories, with normalization for Windows
alternate data streams and trailing-dot tricks. The block is merged
most-restrictive-wins, so a package_allowlist allow can never downgrade a
credential-read block.
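Two of the ideas above, name normalization and the most-restrictive-wins merge, can be sketched in a few lines. This is an illustrative sketch only: the helper names and the severity ordering are assumptions, and real normalization has to handle more cases than the two shown.

```python
SEVERITY = {"allow": 0, "warn": 1, "block": 2}

def normalize_component(name):
    """Normalize a final path component before matching it against credential paths."""
    # Drop an NTFS alternate-data-stream suffix ("id_rsa::$DATA" -> "id_rsa").
    name = name.split(":", 1)[0]
    # Windows ignores trailing dots and spaces when opening a file.
    return name.rstrip(". ")

def merge(decisions):
    """Return the most restrictive decision; an empty list means no rule matched."""
    return max(decisions, key=SEVERITY.__getitem__, default="allow")
```

Under this merge, `merge(["allow", "block"])` is `"block"` regardless of order, which is why an allowlist entry cannot downgrade a credential-read block.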
Self-protection
Because the agent runs as the file owner, OS permissions alone cannot stop it
tampering with Beekeeper. The tool-call hook is the layer that can, so
beekeeper check blocks the agent from: reading or writing the state
directory (config, policies, audit, catalogs, quarantine); overwriting the
running binary; removing Beekeeper's own hook entry (content-aware: edits to
other hooks pass); and invoking Beekeeper's mutating subcommands via Bash
(config set, hooks install/uninstall, protect install/uninstall). Human
channels (editing files directly, running beekeeper in a terminal,
beekeeper dashboard --admin) are unaffected; there is deliberately no in-band
agent bypass.
beekeeper-self catalog
Before any enforcement command runs, and after every catalogs sync, Beekeeper
checks whether the running binary's version appears in a separately-hosted,
separately-keyed (Ed25519) beekeeper-self compromise feed. On a match it
self-quarantines and refuses to run enforcement, while diagnostic commands
(beekeeper version, beekeeper diag, beekeeper selftest,
beekeeper policy validate) stay runnable. Only a bad signature fails closed;
a network outage warns and continues.
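The outcomes described above can be summarized as a small state function. This is a sketch under stated assumptions: the status labels, state names, and function names are hypothetical, and allowing diagnostic commands after a bad signature is an assumption that the text above does not confirm.

```python
DIAGNOSTIC = {"version", "diag", "selftest", "policy validate"}

def self_check(running_version, feed_status, compromised_versions=()):
    """feed_status is "ok", "bad_signature", or "unreachable" (hypothetical labels)."""
    if feed_status == "bad_signature":
        return "refuse"             # only a bad signature fails closed
    if feed_status == "unreachable":
        return "warn_and_continue"  # a network outage warns and continues
    if running_version in compromised_versions:
        return "self_quarantine"
    return "run"

def may_run(subcommand, state):
    """Enforcement needs a healthy state; diagnostic commands stay runnable."""
    return state in ("run", "warn_and_continue") or subcommand in DIAGNOSTIC
```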
Behavioral detection (Sentry, opt-in)
The Sentry is a privileged, opt-in behavioral monitor (beekeeper protect install),
running on Linux (eBPF and fanotify), macOS (eslogger), and Windows (ETW). It
correlates process, file, and network events into a small set of
exfiltration-pattern rules, SENTRY-001 through SENTRY-008: credential-file
clusters (001), credential-CLI bursts (002), first-outbound phone-home (003),
fresh-extension correlation (004), exfiltration-signature fusion (005), an agent-CLI
credential cluster (006), a generalized exfil fusion with no fresh-extension
precondition (007), and persistence-location writes (008).
Rules fire only for processes descended from an editor (code/cursor/windsurf/codium) or a known agent CLI (claude/codex/cursor-agent/gemini/copilot/qwen/aider/opencode/hermes), so a standalone-terminal agent is in scope, not only an editor extension. File-write events are ingested on all three platforms; DNS query events are ingested on Linux (eBPF) and Windows (ETW). The Sentry is detection-only: it writes audit records, it does not quarantine or kill (extension containment lives in the unprivileged watch/scan layer). See the known gaps below for what it does not yet correlate.
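The descent check that scopes the rules can be pictured as a walk up the process tree. The sketch below assumes a simple in-memory process table (`pid -> (parent_pid, name)`), which is a hypothetical data shape rather than anything the Sentry exposes, and it treats the editor or agent process itself as in scope.

```python
EDITORS = {"code", "cursor", "windsurf", "codium"}
AGENT_CLIS = {"claude", "codex", "cursor-agent", "gemini", "copilot",
              "qwen", "aider", "opencode", "hermes"}

def in_scope(pid, procs):
    """Return True if pid is, or descends from, an editor or known agent CLI."""
    seen = set()
    # Walk parent links until the table runs out; `seen` guards against cycles.
    while pid in procs and pid not in seen:
        seen.add(pid)
        parent, name = procs[pid]
        if name in EDITORS or name in AGENT_CLIS:
            return True
        pid = parent
    return False
```

A `curl` process whose ancestry is `claude -> node -> curl` is in scope under this check, while the same binary launched from an unrelated `sshd` session is not.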
First-responder loop: sync-hit cross-reference, reversible quarantine, and Sentry targeted trace
When the background catalog sync produces a delta, Beekeeper runs a read-only cross-reference of the installed package inventory (npm, pip, cargo, and others) against the freshly-synced index. A match is audited as a scan hit. The cross-reference is strictly read-only: it never removes, disables, or edits a package.
Type-aware reversible quarantine
Quarantine works for both editor extensions and language packages. A quarantine
is a reversible directory move plus a restore manifest (os.Rename with a
JSON sidecar describing the artifact's original path, ecosystem, and reason).
It is fully restorable at any time.
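A reversible move with a sidecar manifest might look like the following. This is a minimal sketch, not Beekeeper's code: the function names, the `.manifest.json` suffix, and the manifest field names are assumptions, and name collisions and cross-filesystem moves are not handled.

```python
import json
import os

def quarantine(src, quarantine_dir, ecosystem, reason):
    """Move src into quarantine_dir and record how to undo the move."""
    src = os.path.abspath(src)
    os.makedirs(quarantine_dir, exist_ok=True)
    dest = os.path.join(quarantine_dir, os.path.basename(src))
    manifest = {"original_path": src, "ecosystem": ecosystem, "reason": reason}
    # Write the manifest first so a completed move always has restore data.
    with open(dest + ".manifest.json", "w", encoding="utf-8") as f:
        json.dump(manifest, f)
    os.rename(src, dest)  # a rename, so nothing is copied or deleted
    return dest

def restore(dest):
    """Move a quarantined artifact back to the path stored in its manifest."""
    with open(dest + ".manifest.json", encoding="utf-8") as f:
        manifest = json.load(f)
    os.rename(dest, manifest["original_path"])
    os.remove(dest + ".manifest.json")
    return manifest["original_path"]
```

Because the operation is a rename plus a small JSON file, restoring is the same rename in reverse.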
Auto-quarantine (opt-in, dry-run by default)
A config knob (auto_quarantine) controls whether scan hits trigger an
automatic quarantine move:
- Default state: disabled and dry-run. A fresh install quarantines nothing automatically.
- When enabled: if a scan hit reaches the corroboration threshold (default 2 independent sources, clamped to [1,3]), and the artifact's on-disk path is known, Beekeeper moves the artifact to the quarantine directory, writes an audit incident, and surfaces a TUI incident for human review. The threshold counts distinct independent catalog sources; a single source can only warn.
- Path unknown: if the installed path cannot be resolved, a pending-quarantine audit record is written rather than guessing at the path.
- Dry-run mode: when dry_run is true, the audit record is written but no file is moved. Auto-quarantine also starts in dry-run, so the knob requires two explicit changes to produce a real move: enabled: true and dry_run: false.
See Configuration for the auto_quarantine block.
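The gating described in the list above can be sketched as one function. The function name, the result strings, and the order in which the path and dry-run checks run are assumptions made for illustration; only the defaults (disabled, dry-run, threshold 2 clamped to [1,3]) come from this page.

```python
def auto_quarantine_action(source_count, path, enabled=False, dry_run=True, threshold=2):
    """Decide what a scan hit does under the auto_quarantine settings."""
    threshold = max(1, min(3, threshold))  # clamp to [1, 3]
    if not enabled or source_count < threshold:
        return "none"
    if path is None:
        return "pending-quarantine"  # audit record only; never guess a path
    if dry_run:
        return "audit-only"          # record the hit, move nothing
    return "move"
```

With the defaults, `auto_quarantine_action(3, "/some/path")` returns `"none"`; a real move needs both `enabled=True` and `dry_run=False`.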
Purge is always human-gated
The reversible quarantine move is the automatic step. The destructive purge is never automatic. A catalog-quarantine incident surfaces in the TUI with a [P] purge (permanent) option and a [R] restore option. The CLI purge command requires a y/N confirmation. The separation is intentional: auto-quarantine removes a package from the active set and stops it running, while the irreversible step stays in human hands.
Move-source safety
The auto-quarantine move is bounded so a malicious package cannot weaponize it. The
move source must be an absolute path, is refused if it is a symlink, junction, or
reparse point (closing a redirection or time-of-check/time-of-use trick), and is
refused if it names a system-critical root (a drive or filesystem root, C:\Windows,
Program Files, or Beekeeper's own state and quarantine directories). A restore path
read from a quarantine manifest is canonicalized and rejected for traversal,
including Windows drive-relative, extended-length (\\?\), and alternate-data-stream
forms, so a tampered manifest cannot become an arbitrary write. First-responder audit
records are redacted before they are written, like every other audit producer. See
docs/THREAT-MODEL.md §12 for the full trust-boundary analysis.
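The move-source checks can be approximated as below. This is a partial sketch under stated assumptions: `safe_move_source` and `protected_roots` are hypothetical names, protected roots are matched exactly, Windows junctions and reparse points need platform-specific checks that are not shown, and a check followed by a separate move does not by itself close a time-of-check/time-of-use race.

```python
import os

def safe_move_source(path, protected_roots=()):
    """Return True only if path is an acceptable source for a quarantine move."""
    if not os.path.isabs(path):
        return False  # the move source must be absolute
    if os.path.islink(path):
        return False  # refuse symlinks rather than follow them
    norm = os.path.normpath(path)
    if os.path.dirname(norm) == norm:
        return False  # a drive or filesystem root
    # Refuse exact matches against caller-supplied critical directories.
    return norm not in {os.path.normpath(root) for root in protected_roots}
```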
Catalog-to-Sentry targeted trace (detection-only)
A scan hit that reaches the corroboration threshold (the same default of two
independent sources used for auto-quarantine) also records the flagged-and-installed
artifact into a Sentry target list (sentry-targets.json); a single-source warn does
not tighten anything. The Sentry daemon consults this list to tighten correlation on
that artifact's process subtree: lower credential-access and credential-CLI
thresholds, faster escalation. The target list is reloaded every 60 seconds, so new
hits take effect without a daemon restart.
This is detection-only: no kill, no isolation, no network cut. It only
changes which Sentry alerts fire and how quickly. The live OS-tap escalation
(eBPF on Linux, ETW on Windows, eslogger on macOS) is validated in the CI
platform matrix; the target-list logic itself is unit-tested on all platforms.
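The periodic reload of the target list can be sketched with a small cache. The class name and the file layout (a JSON array of artifact names) are assumptions; this page does not specify the schema of sentry-targets.json.

```python
import json

class TargetList:
    """Caches a JSON array of flagged artifact names (hypothetical layout)."""

    def __init__(self, path, reload_seconds=60):
        self.path = path
        self.reload_seconds = reload_seconds
        self._targets = frozenset()
        self._loaded_at = None

    def targets(self, now):
        """Return the cached set, re-reading the file once it is stale."""
        stale = self._loaded_at is None or now - self._loaded_at >= self.reload_seconds
        if stale:
            try:
                with open(self.path, encoding="utf-8") as f:
                    self._targets = frozenset(json.load(f))
            except (OSError, ValueError, TypeError):
                pass  # keep the previous list if the file is missing or malformed
            self._loaded_at = now
        return self._targets
```

A caller would pass a monotonic clock reading such as `time.monotonic()` as `now`; taking the time as a parameter keeps the reload logic testable without sleeping.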
Confirmed-outcome corpus and the local catalog overlay
Beyond the live scan-hit path above, Beekeeper keeps a local record of what it
has confirmed. Every incident is written as a four-layer event (behavior,
decision, outcome, and context) to an append-only corpus
(corpus/beekeeper-corpus.ndjson in the state directory), owner-only (0600)
and run through the same redaction step as the audit log. The outcome layer (a
confirmed true_label) is the part that cannot be reconstructed after the fact,
so it is present from the first write and starts as unresolved.
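An append-only, owner-only NDJSON write with the outcome layer present from the start could look like this. The function name and the contents of each layer are hypothetical, and the redaction step is not shown.

```python
import json
import os

def append_event(corpus_path, behavior, decision, context):
    """Append one four-layer event as a single NDJSON line."""
    event = {
        "behavior": behavior,
        "decision": decision,
        "outcome": {"true_label": "unresolved"},  # filled in later by adjudication
        "context": context,
    }
    line = json.dumps(event, sort_keys=True) + "\n"
    # O_APPEND keeps writes at the end of the file; 0o600 applies on creation.
    fd = os.open(corpus_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
    return event
```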
An adjudication engine assigns that outcome off the hot path: it never runs
inside the synchronous beekeeper check evaluation, only in the catalog-sync
daemon, with a bounded batch pass on each beekeeper catalogs sync as the
no-daemon fallback. Confidence is corroboration-gated on the same two-source bar
as the rest of Beekeeper: a single source is watch weight, two or more
independent sources are enforce weight.
When adjudication confirms a package malicious, the first responder does three local things, and deletes nothing:
- arms the TUI quarantine card for any matching install present on the machine;
- elevates the detection-only Sentry watch on that package's process subtree, gated at the same corroboration threshold (a single source does not tighten);
- adds a local-only catalog overlay entry (owner-only) that survives
beekeeper catalogs sync, so the next install of that package is caught immediately rather than waiting for the upstream feed to carry it.
Purge stays human-gated exactly as above; the corpus loop never auto-purges, in
any configuration. Nothing leaves the machine: the corpus and the overlay are
local files with no remote sink, and machine and repository identifiers are
stored as non-reversible HMAC fingerprints. The push-envelope wire format that
would let an organization or community share confirmed signatures is frozen, but
no transport ships in this release; cross-machine sharing is a later
milestone, not a current capability. See docs/THREAT-MODEL.md §13 for the
local-loop trust boundary and the named residual gaps.
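A non-reversible HMAC fingerprint of an identifier can be computed with the standard library. The sketch below assumes HMAC-SHA-256 and a locally held secret key; neither the hash function nor the key handling is specified on this page.

```python
import hashlib
import hmac

def fingerprint(identifier, local_key):
    """Return a keyed, non-reversible fingerprint of a machine or repository id."""
    return hmac.new(local_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same identifier and key always yield the same fingerprint, so records can be correlated locally, while the identifier itself never appears in the stored value.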
What is proven, what is not
The quarantine and Sentry paths are sound by design. The live OS-tap escalation triggered by a catalog hit is CI-validated but has not yet been red-team-proven with a purpose-built exploit in a production environment.
Not yet shipped (do not infer from this page that they are): destructive package-manager uninstall and lockfile rewrite; browser-extension and MCP-config quarantine (those artifact types are scanned but not yet in the auto-quarantine path).
LlamaFirewall prompt-injection scan (opt-in, experimental)
LlamaFirewall is an opt-in, experimental layer, off by default. When enabled it runs a supervised local Python sidecar that scores agent tool output with PromptGuard 2 (prompt-injection) and CodeShield (unsafe code). Inference is fully local: there is no API key and no third-party cloud. The earlier AlignmentCheck path, which would have sent agent context to a third-party cloud (Together AI), has been removed entirely.
Posture notes:
- Gated model. PromptGuard 2 is a gated Hugging Face model. You must accept the Llama license and run huggingface-cli login, then beekeeper llamafirewall install bootstraps a CPU-only venv (no CUDA wheels) and pre-pulls the 22M model into a pinned cache under the state directory. Until that is done the sidecar has nothing to load. This is per-operator and one-time per machine, not something Beekeeper can do once for everyone. Meta's Llama license must be accepted by each user individually and the gated weights cannot be redistributed, so Beekeeper ships no model and bundles no token. Your Hugging Face token stays on your machine (Beekeeper never sees or transmits it) and the model cache lives under your state directory.
- Non-blocking by default. The injection scan runs on the PostToolUse hook as a forensic signal; it does not block the tool call, and it is not "on for every tool call".
- Fail-closed on crash. A sidecar crash, missing model, or scan error is treated as a block (never a silent clean) unless you explicitly set fail_mode: open.
- Local IPC. The Go supervisor and the sidecar talk over loopback TCP with a per-launch bearer token, so another local process cannot drive the scanner.
- CI-only end-to-end. The real-sidecar checks (benign, injection, unsafe-code, crash-fail-closed) run only in a gated CI job that has accepted the Llama license; they are not part of the default test suite.
- Native Windows is unsupported for the sidecar. CodeShield depends on semgrep, which has no native Windows build, so beekeeper llamafirewall install cannot complete on native Windows. Use WSL or a Linux/macOS host. The gated prompt-injection model itself is platform-neutral; the limitation is CodeShield's semgrep dependency.
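The per-launch bearer token from the Local IPC note can be illustrated as follows. This is a sketch only: the function names and the `Bearer <token>` header format are assumptions, and the loopback TCP transport itself is omitted.

```python
import hmac
import secrets

def new_launch_token():
    """Generate a fresh random token for one sidecar launch."""
    return secrets.token_urlsafe(32)

def authorized(authorization_header, launch_token):
    """Compare the presented header with the expected one in constant time."""
    expected = "Bearer " + launch_token
    return hmac.compare_digest(authorization_header.encode(), expected.encode())
```

Because the token is regenerated on every launch and only handed to the sidecar by its supervisor, a different local process that connects to the loopback port has nothing valid to present.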
Build hardening
Builds are reproducible (-trimpath -buildvcs=false -mod=readonly) and come with
keyless Sigstore/cosign signing, SLSA Level 3 provenance, and a CycloneDX SBOM. See
Installation for the verification commands.
Known gaps
These are documented so you don't develop false confidence. None of them relax the fail-closed enforcement path; most are detection-coverage or configuration-trust limits.
- Hermes is fail-OPEN. Hermes ignores hook exit codes; a block is carried only by emitting {"action":"block","message":"..."} on stdout. Any timeout, crash, or non-JSON output makes Hermes allow the call. Prefer the MCP gateway for Hermes. (See Integration.)
- Tier-3 native tools are UNGUARDED. Kilo and Trae have no upstream pre-exec hook; only MCP tools routed through the gateway are intercepted. Their native Bash/file/shell tools bypass Beekeeper entirely.
- Only Claude Code is live-verified. The other 16 harnesses are implemented against documented contracts and contract-shape tested, but not verified against a running harness in CI.
- Gateway remote-bind exposure. Binding --bind 0.0.0.0 exposes the policy-decision proxy over plain HTTP (the bearer token travels in cleartext). The CLI help text promises an allow_remote_gateway config gate, but that gate is not implemented; --bind flows straight to net.Listen. Do not bind the gateway to a non-loopback interface.
- Project config can relax fail-closed. A project-layer .beekeeper/config.json with {"fail_mode":"open"} is honored and turns every fail-closed net into fail-open for that working tree (see Configuration).
- Windsurf fail-OPEN on non-2 exit; OpenCode subagent gap. Windsurf only honors exit code 2; OpenCode's plugin does not intercept subagent task calls (issue #5894).
- Unlisted package managers. deno, mvn, and nuget parse as "no package identified" and are allowed by default; the Sentry behavioral layer is the second signal there. Command chaining (&&, ||, ;, |, &, newlines) and leading environment-variable assignments (cd x && FOO=bar npm install evil) ARE handled by internal/pkgparse (it splits on shell separators, honoring quotes, and strips leading env assignments), so they are not a bypass.
- Catalog poisoning (coordinated). An adversary controlling 2+ sources can manufacture false-positive blocks to coerce a user into disabling enforcement. Sanity bounds and audit provenance are partial mitigations.
- Bumblebee signature is a presence check (TM-B-02). In the live decision path, a Bumblebee entry's "signed" status is a non-empty-field check, not full Ed25519 verification (only beekeeper-self is cryptographically verified). Tracked for remediation.
- Linux fanotify mmap gap. Libraries mmap-loaded before the Sentry's watch was placed are not re-intercepted.
- Windows Sentry missing PPID. File/network ETW events carry no parent-PID, so a short-lived child can lose editor-descendant attribution (detection-coverage only; enforcement unaffected).
- DNS is ingested but not correlated. DNS query events are captured on Linux and Windows, but no Sentry rule consumes them yet, so DNS-tunnel exfiltration is ingested but not detected. SENTRY-003 (first-outbound) has no destination allowlist and cannot identify the endpoint.
- No process-memory event source. /proc/<pid>/maps-style secret scraping has no event source and is undetected.
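The command-chaining handling mentioned in the unlisted-package-managers item can be illustrated with a small splitter. This is an independent sketch of the technique (split on shell separators outside quotes, then strip leading environment assignments), not the code in internal/pkgparse; the function names are hypothetical and backslash escapes are not handled.

```python
import re
import shlex

SEPARATORS = ("&&", "||", ";", "|", "&", "\n")  # two-character forms first
ENV_ASSIGN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")

def split_commands(line):
    """Split on shell separators that sit outside single or double quotes."""
    parts, current, quote, i = [], [], None, 0
    while i < len(line):
        ch = line[i]
        if quote:  # inside quotes: copy verbatim until the closing quote
            current.append(ch)
            if ch == quote:
                quote = None
            i += 1
            continue
        if ch in "'\"":
            quote = ch
            current.append(ch)
            i += 1
            continue
        sep = next((s for s in SEPARATORS if line.startswith(s, i)), None)
        if sep:
            parts.append("".join(current))
            current = []
            i += len(sep)
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]

def strip_env(command):
    """Tokenize one command and drop leading NAME=value assignments."""
    words = shlex.split(command)
    while words and ENV_ASSIGN.match(words[0]):
        words.pop(0)
    return words
```

For the example in the list, `split_commands("cd x && FOO=bar npm install evil")` yields two commands, and `strip_env` on the second leaves `["npm", "install", "evil"]`, so the install is still visible to a package check.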
release_age (minimum package age) and lifecycle_script_allowlist rules in
policy files are not enforced by the policy overlay in v1.3.0; they require
metadata not present in a pure tool call. They are informational / dry-run only.
See Configuration.
How this is validated
Beekeeper's coverage claims are auditable, not asserted. Validation is split into three tiers:
- Tier A (locally testable) is held at full coverage by a coverage gate: every production Go file is either covered by a test or carries a reason-coded, fail-closed no-test allowlist entry, so the coverage claim cannot be silently weakened. A 17-harness conformance suite golden-file-tests every installer config and per-harness deny contract.
- Tier B (platform-bound) runs in a cross-platform CI matrix: two Linux kernels, macOS, and Windows, exercising eBPF, eslogger, ETW, Unix peer-cred auth, and three cross-compiled targets. Five fuzz targets (policy engine, IPC parser, catalog parser, MCP parser, and the Sentry rule evaluator) run as a blocking release gate.
- Tier C (irreducibly manual) is a signed validation register: each of the 16 non-Claude-Code harnesses and the gated-model sidecar end-to-end has a written live-block procedure, an expected result, and a sign-off line, so "fully validated" is a checklist you can read.
The tier model and the register live in docs/validation-posture.md and
docs/validation-register.md in the repository.
The exit-2 deny contract
Beekeeper signals a block by exiting 2, the one exit code agent harnesses
treat as a deny rather than a generic hook error. An earlier internal design
exited 1, which most harnesses interpret as a hook error and ignore, so a
block was audited but the tool still ran. The shipped contract uses exit 2, plus
the per-harness deny JSON (hookSpecificOutput for Claude Code,
{"action":"block"} for Hermes), so the block actually takes effect. If you ever
re-register hooks, beekeeper hooks install --target <harness> writes the correct
contract.
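From a hook's point of view, the contract amounts to printing the harness-specific deny JSON and exiting 2. The sketch below uses only the Hermes payload shape quoted on this page; the function names are hypothetical, and the Claude Code hookSpecificOutput body is deliberately not reproduced here.

```python
import json

BLOCK_EXIT_CODE = 2  # the code harnesses treat as a deny rather than a hook error

def hermes_block(message):
    """Build the stdout payload Hermes reads, since it ignores exit codes."""
    return json.dumps({"action": "block", "message": message})

def run_hook(blocked, message):
    """Print the deny payload when blocking and return the process exit code."""
    if blocked:
        print(hermes_block(message))
        return BLOCK_EXIT_CODE
    return 0
```

A real hook script would end with `sys.exit(run_hook(...))`; exiting 1 instead would reproduce the earlier failure mode, where the block is recorded but the tool call still runs.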