v1.0.0: First Stable Release
The first public release of Beekeeper: threat intelligence for autonomous coding agents. It intercepts every tool call, package install, file access, and network egress before it executes, evaluates it against corroboration-based threat intelligence, and blocks or quarantines threats across 17 agent harnesses.
Overview
Beekeeper v1.0.0 is the first public release. It is a single static Go binary
(beekeeper) that mediates autonomous-agent tool calls before they execute and
evaluates them against unified, corroboration-based threat intelligence,
fail-closed by default. A hijacked or off-task agent cannot successfully act on
the developer's machine without Beekeeper deciding to permit it.
This release consolidates the project's full internal development history into one stable surface: the corroboration engine, sensitive-path enforcement, the package-manager nudge, editor-extension defense, cross-platform behavioral monitoring, an opt-in prompt-injection sidecar, policy-as-code, the TUI dashboard, config self-protection, and a validated release gate. There are no earlier public versions to migrate from.
Provenance. Beekeeper ships on GitHub Releases with the canonical go install
path and the signed release artifacts described below. The verification commands
let you audit the provenance story yourself before you trust the binary.
Highlights
1. Fail-closed hook handler + corroboration policy engine
beekeeper check evaluates tool calls against an mmap catalog index under hard
caps (1 MB stdin / 5 s / 256 MB). The pure internal/policy corroboration engine
applies a three-tier decision: one trusted source warns, two sources block, three
sources block and recommend quarantine. Sources are Bumblebee, OSV, and Socket.
Per-severity thresholds let a critical match escalate at a lower count. An
all-versions-wildcard guard and degraded-source suppression ensure that a single
poisoned source cannot force a block. The same engine is called identically from
the hook handler, the MCP gateway, and the Sentry correlation layer.
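
The three-tier decision can be sketched roughly as follows. This is a minimal illustration, not the internal/policy implementation: the Decision type, the Decide function, and the reading of degraded-source suppression as "skip degraded sources before counting" are all assumptions, and per-severity thresholds and the wildcard guard are left out.

```go
package main

import "fmt"

// Decision is the outcome of a corroboration check (hypothetical type).
type Decision int

const (
	Allow Decision = iota
	Warn
	Block
	BlockAndQuarantine
)

func (d Decision) String() string {
	return [...]string{"allow", "warn", "block", "block+quarantine"}[d]
}

// Decide counts the distinct sources that report a match, skipping any
// source marked degraded (an assumed reading of degraded-source
// suppression), and maps that count onto the three tiers.
func Decide(reporting []string, degraded map[string]bool) Decision {
	distinct := map[string]bool{}
	for _, s := range reporting {
		if !degraded[s] {
			distinct[s] = true
		}
	}
	switch n := len(distinct); {
	case n >= 3:
		return BlockAndQuarantine
	case n == 2:
		return Block
	case n == 1:
		return Warn
	default:
		return Allow
	}
}

func main() {
	fmt.Println(Decide([]string{"OSV"}, nil))
	fmt.Println(Decide([]string{"OSV", "Socket"}, nil))
	fmt.Println(Decide([]string{"Bumblebee", "OSV", "Socket"}, nil))
}
```

Counting distinct source names rather than raw reports means a source that repeats itself still counts once.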
On a block the hook emits the per-harness deny contract that the agent actually
honors (exit 2 plus hookSpecificOutput for Claude Code, {"action":"block"}
for Hermes, and so on). Unknown or unconfigured harness IDs fail closed.
2. Multi-harness enforcement across 17 agents
Hook installers cover 17 agent harnesses across three honesty tiers: 10 Tier-1
agents get full exit-2 deny enforcement (Claude Code, Codex, Cursor, Augment,
CodeBuddy, Qwen Code, Gemini CLI, Copilot, Antigravity, Windsurf); 3 Tier-2 agents
work with documented caveats (Hermes is structurally fail-open, Cline is
macOS/Linux only, OpenCode misses subagent task calls); and 4 Tier-3 agents
(Kilo, Trae, Continue, OpenClaw) are covered only through the MCP gateway, with
their native tools left explicitly unguarded. Only Claude Code is live-verified
end to end; the other 16 are contract-shape tested and listed in a signed manual
validation register (see Highlight 10).
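
As a quick cross-check of the tiering, the table below encodes the agents exactly as listed above. The Tier type and the use of display names as map keys are illustrative assumptions; Beekeeper's real harness IDs are not shown in these notes.

```go
package main

import "fmt"

// Tier is a hypothetical enum for the three honesty tiers.
type Tier int

const (
	TierUnknown Tier = iota // not listed: callers should fail closed
	Tier1                   // full exit-2 deny enforcement
	Tier2                   // works with documented caveats
	Tier3                   // covered only through the MCP gateway
)

// harnessTiers uses the display names from the release notes as keys.
var harnessTiers = map[string]Tier{
	"Claude Code": Tier1, "Codex": Tier1, "Cursor": Tier1, "Augment": Tier1,
	"CodeBuddy": Tier1, "Qwen Code": Tier1, "Gemini CLI": Tier1,
	"Copilot": Tier1, "Antigravity": Tier1, "Windsurf": Tier1,
	"Hermes": Tier2, "Cline": Tier2, "OpenCode": Tier2,
	"Kilo": Tier3, "Trae": Tier3, "Continue": Tier3, "OpenClaw": Tier3,
}

// TierOf returns TierUnknown for any name that is not in the table.
func TierOf(name string) Tier { return harnessTiers[name] }

// CountByTier tallies how many harnesses fall into each tier.
func CountByTier() map[Tier]int {
	counts := map[Tier]int{}
	for _, t := range harnessTiers {
		counts[t]++
	}
	return counts
}

func main() {
	c := CountByTier()
	fmt.Println(c[Tier1], c[Tier2], c[Tier3], len(harnessTiers))
}
```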
3. Sensitive-path enforcement
policy.EvaluatePath blocks agent reads, and shell-redirect writes, of credential
paths outside the working directory: ~/.ssh, ~/.aws, ~/.cargo/credentials,
.env globs, and editor MCP config directories. Canonicalization closes evasion
gaps (tilde and $VAR / ${VAR} / %VAR% expansion, symlink and ancestor-symlink
resolution, Windows alternate-data-stream and trailing-dot variants). The block is
merged most-restrictive-wins, so an allowlist can never downgrade a credential-read
block.
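
To illustrate why canonicalization matters, here is a minimal sketch of the idea, not policy.EvaluatePath itself. Canonicalize and IsSensitive are hypothetical helpers; they cover only tilde and $VAR / ${VAR} expansion plus lexical cleanup, and deliberately skip symlink resolution, %VAR% expansion, and the Windows variants. The sketch assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Canonicalize expands $VAR and ${VAR} through env, expands a leading
// tilde to home, and lexically cleans the result (resolving ".." and
// duplicate separators). It does not touch the filesystem.
func Canonicalize(p, home string, env func(string) string) string {
	p = os.Expand(p, env)
	if p == "~" || strings.HasPrefix(p, "~/") {
		p = filepath.Join(home, strings.TrimPrefix(p, "~"))
	}
	return filepath.Clean(p)
}

// IsSensitive reports whether the canonical path is, or sits under,
// one of a small illustrative set of credential directories.
func IsSensitive(p, home string, env func(string) string) bool {
	c := Canonicalize(p, home, env)
	for _, dir := range []string{".ssh", ".aws"} {
		root := filepath.Join(home, dir)
		if c == root || strings.HasPrefix(c, root+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

func main() {
	home := "/home/dev" // hypothetical home directory
	env := func(k string) string {
		if k == "HOME" {
			return home
		}
		return ""
	}
	for _, p := range []string{
		"~/.ssh/id_ed25519",
		"${HOME}/projects/../.aws/credentials",
		"/home/dev/projects/app/main.go",
	} {
		fmt.Println(p, IsSensitive(p, home, env))
	}
}
```

Matching on the root plus a trailing separator keeps a sibling such as ~/.sshx from being treated as part of ~/.ssh.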
4. Package-manager nudge and supply-chain matching
A single pure internal/pkgparse package catalog-matches npm, pnpm, bun, and
yarn installs alike, including chained and env-prefixed commands. The
internal/nudge subsystem advises (soft) or rewrites and blocks (hard) installs
toward hardened package managers, with a detection-independent block mode that does
not fail open. First hook install enables supply-chain enforcement by default.
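
The handling of chained and env-prefixed commands can be pictured with a simplified parser. This is a sketch under stated assumptions, not internal/pkgparse: ParseInstalls and Install are hypothetical names, the verb table is a small subset, and whitespace tokenization ignores shell quoting, subshells, and substitutions that a real parser must handle.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Install is one package requested from one package manager.
type Install struct {
	Manager string
	Package string
}

var (
	// Shell operators that chain commands together.
	chainOps = regexp.MustCompile(`&&|\|\||;|\|`)
	// Leading NAME=value environment assignments.
	envAssign = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*=`)
	// Install verbs per manager (illustrative subset).
	installVerbs = map[string]map[string]bool{
		"npm":  {"install": true, "i": true, "add": true},
		"pnpm": {"install": true, "i": true, "add": true},
		"yarn": {"add": true},
		"bun":  {"install": true, "i": true, "add": true},
	}
)

// ParseInstalls splits a command line on chain operators, strips
// env prefixes, and collects non-flag arguments of install verbs.
func ParseInstalls(cmdline string) []Install {
	var out []Install
	for _, seg := range chainOps.Split(cmdline, -1) {
		fields := strings.Fields(seg)
		for len(fields) > 0 && envAssign.MatchString(fields[0]) {
			fields = fields[1:]
		}
		if len(fields) < 2 {
			continue
		}
		verbs, ok := installVerbs[fields[0]]
		if !ok || !verbs[fields[1]] {
			continue
		}
		for _, arg := range fields[2:] {
			if !strings.HasPrefix(arg, "-") {
				out = append(out, Install{Manager: fields[0], Package: arg})
			}
		}
	}
	return out
}

func main() {
	cmd := "cd app && NODE_ENV=production npm install left-pad --save; pnpm add lodash"
	for _, in := range ParseInstalls(cmd) {
		fmt.Printf("%s %s\n", in.Manager, in.Package)
	}
}
```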
5. Editor-extension defense
Agent --install-extension calls are intercepted before the extension lands. An
fsnotify watcher monitors the extension directories, and the watch, scan, and
quarantine workflow closes the Nx Console-class attack surface where a compromised
agent silently installs a trojanized extension.
6. Cross-platform Sentry (opt-in, detection-only)
A privileged behavioral monitor, opt-in via beekeeper protect install, correlates
process, file, and network events on Linux (eBPF and fanotify), macOS (eslogger,
no entitlement), and Windows (ETW, no CGO). Its rule set spans SENTRY-001 through
SENTRY-008: credential-file clusters, credential-CLI bursts, first-outbound
phone-home, fresh-extension correlation, exfiltration-signature fusion, an
agent-CLI credential cluster (006), a generalized exfil fusion with no
fresh-extension precondition (007), and persistence-location writes (008). Scope
covers both editor-extension and agent-CLI process trees, so standalone-terminal
agents are in scope, not just editor extensions. File-write events are ingested on
all three platforms; DNS queries are ingested on Linux and Windows. The Sentry is
detection-only: it writes audit records, it does not quarantine or kill.
7. Background catalog sync
Threat-intel freshness is automatic. Alongside the manual beekeeper catalogs sync,
an unprivileged per-user daemon (beekeeper catalogs daemon install) syncs on an
interval (default 2 hours, clamped to a 2 to 24 hour range) using conditional
ETag requests, via a systemd user timer, a macOS LaunchAgent, or a Windows
scheduled task. The interval is a self-defended setting: a project-layer config
cannot disable it or loosen the cadence.
8. LlamaFirewall prompt-injection sidecar (opt-in, experimental, local)
An optional supervised Python sidecar scores agent tool output with PromptGuard 2
and agent-generated code with CodeShield. Inference is fully local: no API key and
no third-party cloud (the earlier Together AI AlignmentCheck path was removed
entirely). It is non-blocking by default and fails closed on crash, missing
model, or scan error. The gated 22M model is bootstrapped per operator via
beekeeper llamafirewall install. The full sidecar runs on Linux and macOS; native
Windows is unsupported because CodeShield's semgrep dependency has no Windows
build.
9. Self-protection, policy-as-code, TUI, and audit
Because the agent runs as the file owner, the tool-call hook is the layer that can
stop it tampering with Beekeeper: agents cannot read or write the state directory,
overwrite the binary, remove their own hook entry (content-aware, so other hooks
stay editable), or invoke Beekeeper's mutating subcommands through Bash. Declarative
JSON policies (policy validate/test/list) are enforced live across
check/gateway/watch/scan over a five-layer config merge. A Bubble Tea v2 TUI
dashboard surfaces live activity, alerts, catalog freshness, scan, policy,
quarantine, and health. Every decision is written to a single NDJSON audit log
(beekeeper.ndjson, owner-only, not rotated) with optional syslog, OTLP, and HTTPS
sinks and an audit query/tail/export CLI.
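
For readers unfamiliar with NDJSON, the sketch below shows the general shape of appending one JSON object per line to an owner-only file. The Record fields and both function names are hypothetical; the actual audit schema is not described in these notes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Record is a hypothetical audit entry; the real schema may differ.
type Record struct {
	Time     string `json:"time"`
	Decision string `json:"decision"`
	Tool     string `json:"tool"`
}

// EncodeLine renders a record as a single newline-terminated JSON
// object, which is what makes the log NDJSON.
func EncodeLine(r Record) ([]byte, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// Append writes one record to the log, creating the file with
// owner-only permissions (0600) if it does not yet exist.
func Append(path string, r Record) error {
	line, err := EncodeLine(r)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(line)
	return err
}

func main() {
	line, err := EncodeLine(Record{
		Time:     "2025-01-01T00:00:00Z",
		Decision: "block",
		Tool:     "Bash",
	})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Print(string(line))
}
```

Because each line is a complete JSON object, the log can be tailed or filtered line by line without parsing the whole file.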
10. Validated release gate
Beekeeper ships with an auditable validation posture rather than an asserted one. A coverage gate accounts for every production Go file as tested or as a reason-coded no-test allowlist entry, and fails closed on unjustified growth. A 17-harness conformance suite golden-file-tests every installer config and deny contract. A cross-platform CI matrix covers two Linux kernels, macOS, and Windows, including eBPF, eslogger, ETW, and Unix peer-cred auth. Five fuzz targets (policy engine, IPC parser, catalog parser, MCP parser, and the Sentry rule evaluator) run as a blocking release gate. What cannot be automated, a live block on each of the 16 non-Claude-Code harnesses and the gated-model sidecar end-to-end, is captured in a signed manual validation register. See the security posture for the full tiering and the documented known gaps.
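
The coverage gate's accounting can be approximated with the sketch below. It is an illustration only: Unaccounted is a hypothetical function, and treating "tested" as "a sibling _test.go file exists" is an assumption about how the real gate decides.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Unaccounted returns every production Go file that has neither a
// sibling _test.go file nor an allowlist entry with a non-empty reason.
// A non-empty result is what the gate would treat as a failure.
func Unaccounted(files []string, allowlist map[string]string) []string {
	present := map[string]bool{}
	for _, f := range files {
		present[f] = true
	}
	var missing []string
	for _, f := range files {
		if !strings.HasSuffix(f, ".go") || strings.HasSuffix(f, "_test.go") {
			continue
		}
		if present[strings.TrimSuffix(f, ".go")+"_test.go"] {
			continue
		}
		if strings.TrimSpace(allowlist[f]) != "" {
			continue
		}
		missing = append(missing, f)
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical file list and allowlist for illustration.
	files := []string{
		"internal/policy/engine.go",
		"internal/policy/engine_test.go",
		"cmd/beekeeper/main.go",
		"internal/nudge/nudge.go",
	}
	allow := map[string]string{"cmd/beekeeper/main.go": "thin wiring only"}
	fmt.Println(Unaccounted(files, allow))
}
```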
11. Self-defense from day one
- Reproducible builds (-trimpath -buildvcs=false -mod=readonly)
- Keyless cosign signing via GitHub Actions OIDC
- SLSA Level 3 provenance (slsa-github-generator@v2.1.0)
- CycloneDX SBOM attached to the release
- Public threat model (docs/THREAT-MODEL.md)
- A separately-hosted, separately-keyed (Ed25519) beekeeper-selfcompromise feed so Beekeeper can refuse to run a tampered build of itself
Download and verify
When the v1.0.0 release is published, it will ship reproducibly built and cosign-signed (keyless via GitHub Actions OIDC). Release assets include signed binaries, checksums, SLSA Level 3 provenance, and a CycloneDX SBOM. Verify the release as follows once it is available:
Verification
gh -R home-beekeeper/beekeeper release download v1.0.0 \
  --pattern "checksums.txt" \
  --pattern "checksums.txt.sigstore.json"

cosign verify-blob \
  --bundle checksums.txt.sigstore.json \
  --certificate-identity-regexp '^https://github\.com/home-beekeeper/beekeeper/' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
  checksums.txt

slsa-verifier verify-artifact beekeeper \
  --provenance-path beekeeper.intoto.jsonl \
  --source-uri github.com/home-beekeeper/beekeeper

A CycloneDX SBOM (*.cdx.json) is also attached to the release for dependency auditing. These commands run against the published GitHub release.
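
Once the signature on checksums.txt has been verified, a downloaded binary still has to be compared against its entry in that file. The sketch below shows one way to do that; it assumes the common sha256sum layout (hex digest, whitespace, file name), which these notes do not confirm, and Digest and VerifyChecksum are hypothetical helpers.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Digest returns the lowercase hex SHA-256 of data.
func Digest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// VerifyChecksum looks up name in a sha256sum-style listing and
// compares the recorded digest with the digest of data. A missing
// entry is an error, so an unlisted file is never accepted.
func VerifyChecksum(checksums, name string, data []byte) error {
	for _, line := range strings.Split(checksums, "\n") {
		fields := strings.Fields(line)
		// sha256sum marks binary-mode entries with a leading '*'.
		if len(fields) != 2 || strings.TrimPrefix(fields[1], "*") != name {
			continue
		}
		if strings.EqualFold(fields[0], Digest(data)) {
			return nil
		}
		return fmt.Errorf("checksum mismatch for %s", name)
	}
	return fmt.Errorf("no checksum entry for %s", name)
}

func main() {
	artifact := []byte("stand-in for the downloaded binary")
	listing := Digest(artifact) + "  beekeeper\n"

	fmt.Println(VerifyChecksum(listing, "beekeeper", artifact))
	fmt.Println(VerifyChecksum(listing, "beekeeper", []byte("modified")))
}
```

The order matters: a checksum match proves nothing unless checksums.txt itself has already passed cosign verify-blob.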
Known limitations
Beekeeper documents its gaps alongside its posture so you do not develop false
confidence. The headline limitations in this release: Hermes is structurally
fail-open; Tier-3 native tools (Kilo, Trae, Continue, OpenClaw) are unguarded outside the MCP gateway;
only Claude Code is live-verified; binding the gateway to a non-loopback interface
exposes it over plaintext HTTP; release_age and lifecycle_script_allowlist
policy rules are accepted but not enforced; and the Sentry ingests DNS queries but
no correlation rule consumes them yet, so DNS-tunnel exfiltration is not detected.
The complete list, with mitigations, is on the
security posture page.
Internal development history
For maintainers, the work in this release was built across internal milestones (corroboration and the standalone harness, runtime behavioral hardening, the public docs site, runtime hardening II, and full-system validation). Those internal version tags are not public releases; v1.0.0 is the first and, at launch, the only available version. The parked Pollen Windows-inventory fork is tracked separately and ships on its own cadence.