Changelog¶

All notable changes to lumen-argus are documented here.

Unreleased¶

Agent — Enrollment Policy Refresh¶

Every heartbeat now re-fetches GET /api/v1/enrollment/config and atomically rewrites the .policy slice of ~/.lumen-argus/enrollment.json so admin edits to enrollment.policy.* propagate without manual re-enroll. Identity fields stay immutable. Refresh failures never flip the heartbeat return value; agents without a bearer token skip the refresh silently.
New lumen-argus-agent refresh-policy subcommand for out-of-cycle pulls. Exit 0 success, 1 network/auth/malformed error, 2 not enrolled. --json emits {"changed": bool, "policy_version": iso8601}.

Proxy & Agent — `/api/v1/build` Endpoint¶

New GET /api/v1/build on both the proxy dashboard (:8081) and the agent relay (:8070) exposes the running sidecar's build identity: {service, version, git_commit, build_id, built_at, plugins[]}. build_id = "sha256:" + sha256(sys.executable) is the authoritative comparator — two processes with the same build_id are running identical binary bytes. Sidecar-adoption callers can compare the endpoint's build_id to the build_id they expect (e.g. bundled alongside their copy of the sidecar) to decide whether a still-running process is safe to reuse, or whether the binary on disk has changed since it was spawned and the process should be respawned.
Proxy endpoint sits behind the existing dashboard auth middleware (session cookie / Bearer token) — unauthenticated requests get 401. Agent endpoint mirrors /health (loopback-only bind, no inbound auth — same security model as the rest of the relay surface).
plugins[] aggregates each loaded plugin's __build_info__ dict from its module object. Plugins that haven't shipped the attribute still appear in the list — entry-point name and version come from the plugin's distribution metadata; git_commit / build_id fall back to unknown / sha256:unknown.
Shared helper lumen_argus_core.build_info.get_build_info(service, version_fallback) owns the response shape so the proxy and agent emit identical fields. compute_build_id() (cached) hashes the running binary once per process.
_build_info.py (per-package, gitignored) is generated by scripts/generate_build_info.py at PyInstaller build time — both packages/proxy/lumen-argus.spec and packages/agent/lumen-argus-agent.spec invoke the script before Analysis() collects the package. Version comes from the package's pyproject.toml, git_commit from git rev-parse HEAD, built_at from UTC now. Dev runs (no built binary, no _build_info.py on disk) fall back gracefully — the helper uses the caller-supplied fallback version and unknown for the rest; build_id stays authoritative in either case.
ExtensionRegistry.loaded_plugin_build_infos() is the new public method that produces the plugins[] array; a parallel _loaded_plugin_modules dict (populated during load_plugins()) carries the module references needed to read each plugin's __build_info__ without a second entry-point walk.

Agent Relay — `fail_mode` in State File¶

~/.lumen-argus/relay.json now carries the fail_mode value the relay was started (or reloaded) with, alongside the existing port / bind / upstream_url / pid / started_at fields. Adopters that compare on-disk config to decide whether to reuse a running relay instead of respawning it can now see the fail-mode value without probing /health.
_reload_enrollment_config rewrites the state file after any SIGHUP-driven config change, so a policy-pushed fail_mode flip is visible on disk without restarting the relay. No-op reloads (no fields changed) skip the write.
The /health endpoint already exposed fail_mode; no change there.

Agent Uninstall — Single-Command Workstation Cleanup¶

New subcommand lumen-argus-agent uninstall [--keep-data] [--non-interactive] reverses every system change the agent made during setup / protection enable / enroll / MCP wrapping, and removes agent-owned state files from ~/.lumen-argus/. Also available as a standalone lumen-argus-uninstall binary (same flags, same JSON output) for callers that want discovery by binary name rather than subcommand.
Step ordering is load-bearing: disable_protection → undo_mcp_setup → undo_setup → data files. Running disable_protection first lets it snapshot the managed env vars before anything truncates the env file, so launchctl unsetenv on macOS sees the right names.
disable_protection() now clears the matching launchctl env vars on macOS (read-then-truncate-then-launchctl order, so a crash mid-way fails safe) and returns a new launchctl_vars_cleared field in the status dict. GUI-launched AI tools (Claude Desktop, Cursor) stop hitting the proxy immediately instead of only on new terminals.
New module lumen_argus_core.platform_env.clear_launchctl_env_vars owns the macOS launchctl seam — strict [A-Za-z_][A-Za-z0-9_]* whitelist on var names (shell-injection defence), no-op on Linux / Windows, per-name failure tolerated and excluded from the returned list. Zero new runtime dependencies (stdlib subprocess).
Orchestrator (lumen_argus_agent.uninstall.uninstall_agent) is best-effort per step — a failing step is logged with exc_info and the next step still runs. The CLI always prints structured JSON (steps, launchctl_vars_cleared, data_files_removed, errors) so tray-app / shell-script callers do not parse stderr. Exit code 0 iff errors is empty, 1 otherwise.
Out of scope (tray app / desktop installer's job, deliberately untouched): .app-path, license.key, trial.json, bin/, LaunchAgent plists, /Applications/*.app, macOS App Support / Logs directories. The ownership boundary from the uninstall spec is preserved.
Tests: test_platform_env.py (8 launchctl scenarios — empty input, non-Darwin no-op, happy path, non-zero exit, OSError, TimeoutExpired, non-identifier name rejection, missing launchctl binary), test_uninstall.py (happy path, best-effort failure — one step / two steps, --keep-data, clean-machine idempotency, JSON serialisation), plus new pins in test_setup_wizard.py for the read-before-truncate ordering and the idempotent empty-file path.

Protection Env File — Dual-Mode Body with Sticky Mode¶

New module lumen_argus_core.env_template owns the ~/.lumen-argus/env body format end-to-end: render_body(...) for writes, parse_header_managed_by(...) for reads, and the ManagedBy StrEnum (CLI / TRAY) both sides agree on. setup_wizard.write_env_file() only handles atomic I/O.
The env file has two body shapes. ManagedBy.CLI (default) emits unconditional export lines with a # lumen-argus:managed-env (cli) header — the right shape for users running the proxy from a terminal (source, pip, brew). ManagedBy.TRAY wraps the exports in a pure-shell liveness guard (.app-path marker or enrollment.json + live relay PID) with a (tray) header — the shape the desktop tray app and the enrollment flow need so a dragged-to-Trash bundle does not leave AI tools pointed at a dead proxy.
CLI surface: --managed-by {cli,tray} on protection enable in both lumen-argus and lumen-argus-agent. The tray app sidecar and lumen-argus-agent enroll pass --managed-by tray. No heuristic inference of deployment context — the invoker states the mode. The choices list is derived from the enum so a third mode propagates to both parsers automatically.
Sticky mode: write_env_file(..., managed_by=None) (the default for low-level mutators like add_env_to_env_file and setup) reads the existing file header and preserves the mode. Running setup on an enrolled machine no longer silently strips the liveness guard.
enable_protection() and protection_status() now return managed_by in their status dicts. Tray-app / dashboard consumers can verify ownership by comparing the value against what they themselves last wrote.
Zero subprocesses in the tray guard — relay.json PID is parsed with while read + case '"pid":' + ${line##*: } / ${pid%,}. Two or three stat() syscalls plus one small file read on every shell startup; well under 1 ms in either mode.
render_body uses match/case + case _: raise ValueError, so adding a future ManagedBy value hard-fails at render time rather than silently falling into the CLI shape.
Tests: 17 pure-function tests (format + parser), 3 CLI-mode real-bash tests, 5 TRAY-mode real-bash tests covering every activation-matrix row, plus test_setup_wizard.py pins for the sticky-mode invariant and the protection_status contract.

Dashboard Layering Cleanup¶

Removed all hardcoded tier branches, locked placeholders, and upsell copy from the community dashboard. Plugins now own their dashboard surface end-to-end, including any tier-aware rendering.
pipeline.js exposes a new registerPipelineAction(name) registry hook so plugins can append action options to the dropdowns. The community base set is ['log','alert','block'].
Settings page reads status.tier directly and applies a tier-<name> CSS class instead of branching on hardcoded tier values.
Removed the artificial 1-channel webhook cap. ExtensionRegistry ships with _channel_limit = None. Plugins may impose a cap via set_channel_limit(N).
GET /api/v1/notifications/types and GET /api/v1/notifications/channels no longer carry channel_limit / channel_count keys. The 409 response on POST still surfaces a plugin-imposed cap when one is set.
New regression suite tests/test_dashboard_layering.py pins the contract — fails the build if tier-aware marketing or gating leaks back into community dashboard JS, HTML, or extension defaults.

0.6.0 (2026-03-23)¶

MCP Security Hardening (Phase 2)¶

Confused deputy protection: RequestTracker tracks outbound request IDs, rejects unsolicited responses from MCP servers (FIFO eviction at 10K, seeding gate, configurable warn/block)
Tool description poisoning detection: 7 pattern categories (instruction tags, file exfiltration, cross-tool manipulation, dangerous exec, download+exec, script injection, command injection)
Tool drift/rug-pull detection: SHA-256 baselines in mcp_tool_baselines DB table, human-readable diff summary on change (description length, added text, parameter changes)
Session binding: validates tools/call against tool inventory from first tools/list response (opt-in via mcp.session_binding, 10K tool cap, configurable warn/block)
All 4 proxy modes (stdio, HTTP bridge, HTTP listener, WS bridge) wired with all security features
Config: mcp.request_tracking, mcp.unsolicited_response_action, mcp.scan_tool_descriptions, mcp.detect_drift, mcp.drift_action, mcp.session_binding, mcp.unknown_tool_action
31 new tests (935 total)

MCP Proxy Unification (Phase 1)¶

Unified lumen-argus mcp subcommand replaces mcp-wrap with flag-based transport modes
Stdio subprocess mode: lumen-argus mcp -- <command> (replaces mcp-wrap)
HTTP bridge mode: lumen-argus mcp --upstream http://... (stdio client, HTTP upstream)
HTTP reverse proxy mode: lumen-argus mcp --listen :8089 --upstream http://...
WebSocket bridge mode: lumen-argus mcp --upstream ws://...
New lumen_argus/mcp/ package with transport abstraction (scanner, transport, proxy, env_filter)
Environment variable restriction for subprocess mode — safe vars only by default, --env KEY=VALUE to add more, --no-env-filter to disable
Config: mcp.env_filter, mcp.env_allowlist in YAML
--action flag to override default action per invocation
Removed mcp_scanner.py and mcp_wrap.py (replaced by lumen_argus/mcp/ package)

Rules Performance Optimization (Phase 1)¶

Aho-Corasick pre-filter: single O(n) pass narrows 1,700+ rules to ~15 candidates per field
Literal extraction from regex patterns via sre_parse (handles alternation, (?i), escapes)
Early termination: stop after first match when action is block
Hot-first ordering: rules sorted by hit_count DESC on reload
In-memory hit count accumulation with 60s periodic batch flush to DB
Graceful fallback: if pyahocorasick unavailable, sequential scan (current behavior)
Pro metrics hook: extensions.set_rule_metrics_collector(collector)
pyahocorasick>=2.0 added as dependency (pre-built wheels for all major platforms)
Benchmark: 184KB/53 rules: 236ms → 36ms (under 50ms target)
Parallel rule batching: when enabled (Pro toggle) and candidates > 50, groups rules by detector category and evaluates concurrently via ThreadPoolExecutor
RulesDetector.set_parallel(bool) for runtime toggle from Pipeline page
Pipeline page: "Parallel rule evaluation" toggle in Advanced section
Config: pipeline.parallel_batching in YAML + DB overrides, applied at startup and SIGHUP
Accelerator factory hook: extensions.set_accelerator_factory(factory) — Pro/Enterprise can swap Aho-Corasick with a custom pre-filter engine (e.g., Hyperscan/Vectorscan); graceful fallback if factory raises

0.5.0 (2026-03-22)¶

Async Proxy (Phase 1)¶

New async_proxy.py — aiohttp.web-based proxy server replacing ThreadingHTTPServer
Non-blocking I/O with coroutine per request instead of thread per request
CPU-bound scanning runs in thread pool via asyncio.to_thread()
SSE streaming via StreamResponse + async iteration
Built-in connection pooling via aiohttp.ClientSession with TCPConnector
Retry on connection errors (aiohttp.ClientConnectionError)
SIGHUP reload via loop.add_signal_handler()
All existing functionality preserved: MCP detection, history stripping, session extraction, response scanning
aiohttp>=3.9 added as dependency
18 new integration tests, 867 total tests passing

WebSocket Unification (Phase 2)¶

WebSocket relay moved from standalone port 8083 into the main proxy on port 8080
Client connects via ws://localhost:8080/ws?url=ws://target
Uses aiohttp.web.WebSocketResponse + ClientSession.ws_connect() — no separate websockets package needed
websockets dependency removed — aiohttp handles both HTTP and WebSocket
SSRF protection: only ws:// and wss:// schemes allowed
Origin validation configurable via websocket.allowed_origins
SIGHUP reloads scanner config without server restart (same port)
Dependencies reduced from 3 (pyyaml, aiohttp, websockets) to 2 (pyyaml, aiohttp)

WebSocket Connection Lifecycle Hooks¶

Each WebSocket connection assigned a unique connection_id (UUID)
Extension hook fires on open, finding_detected, and close events
ws_connections SQLite table tracks connection history (target URL, origin, duration, frame counts, findings, close code)
Default community hook records to analytics store; Pro can override for richer analytics
Hook calls run in thread pool via asyncio.to_thread() — no event loop blocking
Connection data included in daily retention cleanup
WebSocket findings now enforced by policy (block closes connection, alert logs + continues)
REST API: GET /api/v1/ws/connections, GET /api/v1/ws/stats

Cleanup¶

Removed dead proxy.py (old ThreadingHTTPServer) and pool.py (old connection pool)
Removed legacy test files (test_proxy_integration.py, test_pool.py)
Session tracking tests updated to use async_proxy module

Thread Safety (Python 3.13+ free-threaded / no-GIL)¶

ScannerPipeline.scan() snapshots shared references (_allowlist, _policy, _decoder, _detectors) under _reload_lock before scanning — prevents torn reads during reload()
ScannerPipeline.reload() swaps references under the same _reload_lock
AsyncArgusProxy._active_requests uses threading.Lock for atomic increment/decrement
Safe for PYTHON_GIL=0 (PEP 703) — all shared mutable state is properly synchronized

0.4.0 (2026-03-20)¶

Rules Engine¶

DB-backed detection rules replace hardcoded Python pattern files
rules table in SQLite: name, pattern, detector, severity, action, enabled, tier, source, description, tags, validator
CLI: lumen-argus rules import/export/list/validate subcommands
Auto-import 43 community rules on first serve (opt out: --no-default-rules)
RulesDetector: loads compiled patterns from DB, license-gated for Pro rules
Validator registry: luhn, ssn_range, iban_mod97, exclude_private_ips
Capture-group-aware matching: group(1) preferred over group(0)
Pipeline uses RulesDetector when DB has rules, falls back to hardcoded detectors
YAML custom_rules: reconciled to DB on startup/SIGHUP (Kubernetes-style)
SIGHUP reloads rules from DB via RulesDetector.reload()
Configurable: rules.auto_import: false to skip auto-import

Cross-Request Deduplication¶

3-layer dedup architecture eliminates redundant scanning of conversation history
Layer 1: Content fingerprinting — per-session SHA-256 hash set skips already-scanned fields before detectors run
Layer 2: Finding-level TTL cache — session-scoped (detector, type, matched_value_hash, session_id) suppresses duplicate DB writes
Layer 3: Store-level unique constraint — content_hash column with UNIQUE(content_hash, session_id) index, INSERT OR IGNORE
Configurable via dedup: config section (conversation_ttl_minutes, finding_ttl_minutes, max_conversations, max_hashes_per_conversation)
Background cleanup schedulers for both content fingerprint and finding caches
All findings remain in ScanResult for policy enforcement — dedup only affects DB recording
Notification dispatcher still receives all findings (has its own cooldown)
seen_count column tracks how many requests included each finding (dashboard shows ×N badge)
content_hash uses hash(matched_value) — no collisions between different secrets with same masked preview
bump_seen_counts() increments existing findings when conversation history is re-sent

Value Hashing¶

HMAC-SHA-256 hash of matched secret values stored as value_hash in findings DB
Enables cross-session secret tracking without persisting raw secrets
Auto-generated 32-byte key at ~/.lumen-argus/hmac.key (0600 permissions)
Full 64 hex chars output (256 bits, no truncation)
Configurable: analytics.hash_secrets (default: true)
Dashboard detail panel shows "Value Hash" field when populated

0.3.0 (2026-03-19)¶

Session Tracking¶

Per-request session context extraction: account_id, session_id, device_id, source_ip, working_directory, git_branch, os_platform, client_name, api_key_hash
Claude Code metadata.user_id JSON string parsing (account_uuid, device_id, session_id)
System prompt field extraction for working directory, git branch, and OS platform
User-Agent parsing for client tool identification
Derived fingerprint (fp:<hash>) fallback when no provider session ID
9 session columns in findings DB (no migration — direct schema update)
GET /api/v1/sessions endpoint with grouped finding counts
GET /api/v1/findings supports session_id and account_id filters
Dashboard: Session, Account, Device, Branch, Client columns in findings table
Session filter dropdown with clickable session IDs
api_key_hash excluded from JSONL audit log (stored in analytics DB only)
post_scan hook signature updated with session= kwarg (backward-compatible)

0.2.0 (2026-03-19)¶

Notification Channels¶

Notifications page unlocked in community dashboard (freemium: 1 channel any type, unlimited with Pro)
7 channel types available: webhook, email, Slack, Teams, PagerDuty, OpsGenie, Jira
Kubernetes-style YAML reconciliation — YAML is fully authoritative
Dashboard: CRUD for dashboard-managed channels, read-only for YAML channels with badge
Source-aware buttons: YAML channels get Toggle + Test, dashboard channels get full CRUD
Audit trail: created_by / updated_by fields on all channel operations
Channel limit enforcement with atomic count + insert under same lock
Sensitive field masking in API responses (webhook URLs, passwords, API keys)

Observability¶

/health endpoint: added uptime field, extension hook for Pro enrichment
/metrics endpoint: extension hook for Pro to append Prometheus metrics
OpenTelemetry tracing hook: set_trace_request_hook() wraps full request lifecycle
Span attributes: provider, body.size, findings.count, action, scan.duration_ms
All hooks fully guarded — exceptions never break requests

CI/CD¶

Docker smoke test in GitHub Actions (build image, verify /health)
Workflow permissions hardened (contents: read)
Concurrency: cancel in-progress runs on new push
README: CI status badge

Documentation¶

MkDocs Material site at lumen-argus.github.io/lumen-argus/docs/
Landing page at lumen-argus.github.io/lumen-argus/ with comparison table
GitHub Pages auto-deploy via Actions

Open Source¶

LICENSE (MIT, Artem Senenko)
SECURITY.md (responsible disclosure policy)
CONTRIBUTING.md (setup, constraints, commit format)
Issue templates (bug report, feature request, security link)
PR template (stdlib-only, security checklist)
Repo topics, description, homepage

0.1.0 (2026-03-17)¶

Initial release of the lumen-argus Community Edition.

Detection¶

34+ secret detection patterns (AWS, GitHub, Anthropic, OpenAI, Google, Stripe, Slack, JWT, database URLs, PEM keys, generic passwords)
Shannon entropy analysis (>4.5 bits/char near secret keywords)
PII detection with validation: email, SSN (range validation), credit card (Luhn), phone, IP (excludes private), IBAN (MOD-97), passport
Proprietary code detection: file pattern blocklist, keyword detection
Custom regex rules in config (unlimited, SIGHUP reloadable)
Duplicate finding deduplication

Proxy¶

Transparent HTTP proxy for AI coding tools (Claude Code, Copilot, Cursor)
Provider auto-detection: Anthropic, OpenAI, Gemini
SSE streaming passthrough via read1()
Connection pooling with idle-read timeout (proxy.timeout) and separate TCP connect timeout (proxy.connect_timeout)
Backpressure via max_connections semaphore (default: 50)
Graceful drain on shutdown (drain_timeout)
Custom CA bundle support for corporate proxies
/health JSON endpoint
/metrics Prometheus endpoint

Scanning¶

lumen-argus scan for file and stdin scanning
--diff mode for git pre-commit hooks (staged changes or ref diff)
--baseline / --create-baseline for known finding suppression
Differentiated exit codes (0=clean, 1=block, 2=alert, 3=log)
JSON output format for CI pipelines

Configuration¶

Bundled YAML parser (no PyYAML dependency)
Global + project-level config with merge semantics
Hot-reload via SIGHUP with config diff logging
Allowlists for secrets, PII, and paths (exact + glob)
Per-detector action overrides

Logging¶

Rotating application log (~/.lumen-argus/logs/lumen-argus.log)
Secure file permissions (0o600, atomic creation)
Startup summary at INFO (version, Python, OS, config, detectors)
Block/redact actions at INFO, slow scans at WARNING
lumen-argus logs export --sanitize for safe log sharing
Thread-safe JSONL audit log with retention policy

Extensions¶

Plugin system via Python entry points
Hooks: pre_request, post_scan, evaluate, config_reload, redact
Public API: pipeline.reload(), registry.set_proxy_server()

Security¶

Localhost-only binding (127.0.0.1, enforced at runtime)
matched_value never written to disk
Async-signal-safe shutdown handlers
TLS certificate verification with custom CA support

Changelog¶

Unreleased¶

Agent — Enrollment Policy Refresh¶

Proxy & Agent — /api/v1/build Endpoint¶

Agent Relay — fail_mode in State File¶

Agent Uninstall — Single-Command Workstation Cleanup¶

Protection Env File — Dual-Mode Body with Sticky Mode¶

Dashboard Layering Cleanup¶

0.6.0 (2026-03-23)¶

MCP Security Hardening (Phase 2)¶

MCP Proxy Unification (Phase 1)¶

Rules Performance Optimization (Phase 1)¶

0.5.0 (2026-03-22)¶

Async Proxy (Phase 1)¶

WebSocket Unification (Phase 2)¶

WebSocket Connection Lifecycle Hooks¶

Cleanup¶

Thread Safety (Python 3.13+ free-threaded / no-GIL)¶

0.4.0 (2026-03-20)¶

Rules Engine¶

Cross-Request Deduplication¶

Value Hashing¶

0.3.0 (2026-03-19)¶

Session Tracking¶

0.2.0 (2026-03-19)¶

Notification Channels¶

Observability¶

CI/CD¶

Documentation¶

Open Source¶

0.1.0 (2026-03-17)¶

Detection¶

Proxy¶

Scanning¶

Configuration¶

Logging¶

Extensions¶

Security¶

Proxy & Agent — `/api/v1/build` Endpoint¶

Agent Relay — `fail_mode` in State File¶