Changelog
All notable changes to lumen-argus are documented here.
0.6.0 (2026-03-23)
MCP Security Hardening (Phase 2)
- Confused deputy protection: `RequestTracker` tracks outbound request IDs, rejects unsolicited responses from MCP servers (FIFO eviction at 10K, seeding gate, configurable warn/block)
- Tool description poisoning detection: 7 pattern categories (instruction tags, file exfiltration, cross-tool manipulation, dangerous exec, download+exec, script injection, command injection)
- Tool drift/rug-pull detection: SHA-256 baselines in `mcp_tool_baselines` DB table, human-readable diff summary on change (description length, added text, parameter changes)
- Session binding: validates `tools/call` against tool inventory from first `tools/list` response (opt-in via `mcp.session_binding`, 10K tool cap, configurable warn/block)
- All 4 proxy modes (stdio, HTTP bridge, HTTP listener, WS bridge) wired with all security features
- Config: `mcp.request_tracking`, `mcp.unsolicited_response_action`, `mcp.scan_tool_descriptions`, `mcp.detect_drift`, `mcp.drift_action`, `mcp.session_binding`, `mcp.unknown_tool_action`
- 31 new tests (935 total)
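
The confused deputy protection can be pictured with a minimal sketch. This is not the project's `RequestTracker`: the class name `RequestTrackerSketch`, its method names, and the simplified seeding gate (nothing is rejected until at least one outbound request has been seen) are assumptions made for illustration.

```python
from collections import OrderedDict


class RequestTrackerSketch:
    """Hypothetical tracker that remembers outbound JSON-RPC request IDs."""

    def __init__(self, max_ids=10_000, action="warn"):
        self._pending = OrderedDict()  # insertion order gives FIFO eviction
        self._max_ids = max_ids
        self._action = action          # "warn" or "block"
        self._seeded = False           # assumed seeding gate: permissive until first request

    def track_request(self, request_id):
        """Record an ID that the client sent to the MCP server."""
        self._seeded = True
        self._pending[request_id] = True
        while len(self._pending) > self._max_ids:
            self._pending.popitem(last=False)  # drop the oldest ID

    def check_response(self, response_id):
        """Return "allow", or the configured action for an unsolicited response."""
        if not self._seeded:
            return "allow"
        if self._pending.pop(response_id, None) is not None:
            return "allow"   # matches a request that was actually sent
        return self._action  # no matching request: unsolicited
```

Popping the ID on a match means a replayed response with the same ID is also treated as unsolicited.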
MCP Proxy Unification (Phase 1)
- Unified `lumen-argus mcp` subcommand replaces `mcp-wrap` with flag-based transport modes
- Stdio subprocess mode: `lumen-argus mcp -- <command>` (replaces `mcp-wrap`)
- HTTP bridge mode: `lumen-argus mcp --upstream http://...` (stdio client, HTTP upstream)
- HTTP reverse proxy mode: `lumen-argus mcp --listen :8089 --upstream http://...`
- WebSocket bridge mode: `lumen-argus mcp --upstream ws://...`
- New `lumen_argus/mcp/` package with transport abstraction (scanner, transport, proxy, env_filter)
- Environment variable restriction for subprocess mode: safe vars only by default, `--env KEY=VALUE` to add more, `--no-env-filter` to disable
- Config: `mcp.env_filter`, `mcp.env_allowlist` in YAML
- `--action` flag to override default action per invocation
- Removed `mcp_scanner.py` and `mcp_wrap.py` (replaced by `lumen_argus/mcp/` package)
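
A rough sketch of how environment restriction for subprocess mode might behave. The default safe list is not given in this changelog, so `SAFE_VARS` and the `filter_env` helper are hypothetical names used only for illustration.

```python
# Assumed "safe" defaults for illustration; the real default list is not stated here.
SAFE_VARS = {"PATH", "HOME", "LANG", "TERM", "TMPDIR"}


def filter_env(environ, allowlist=(), extra=(), enabled=True):
    """Build the environment handed to an MCP server subprocess.

    allowlist: extra variable names to pass through (in the spirit of mcp.env_allowlist)
    extra:     "KEY=VALUE" strings to add (in the spirit of repeated --env flags)
    enabled:   False mimics --no-env-filter and passes everything through
    """
    if not enabled:
        env = dict(environ)
    else:
        allowed = SAFE_VARS | set(allowlist)
        env = {k: v for k, v in environ.items() if k in allowed}
    for item in extra:
        key, _, value = item.partition("=")  # split on the first "=" only
        env[key] = value
    return env
```

The result could be passed as `env=` to `subprocess.Popen`, so credentials such as cloud access keys in the parent environment are not inherited by the wrapped server unless explicitly allowed.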
Rules Performance Optimization (Phase 1)
- Aho-Corasick pre-filter: single O(n) pass narrows 1,700+ rules to ~15 candidates per field
- Literal extraction from regex patterns via `sre_parse` (handles alternation, `(?i)`, escapes)
- Early termination: stop after first match when action is `block`
- Hot-first ordering: rules sorted by `hit_count DESC` on reload
- In-memory hit count accumulation with 60s periodic batch flush to DB
- Graceful fallback: if `pyahocorasick` unavailable, sequential scan (current behavior)
- Pro metrics hook: `extensions.set_rule_metrics_collector(collector)`
- `pyahocorasick>=2.0` added as dependency (pre-built wheels for all major platforms)
- Benchmark: 184KB/53 rules: 236ms → 36ms (under 50ms target)
- Parallel rule batching: when enabled (Pro toggle) and candidates > 50, groups rules by detector category and evaluates concurrently via ThreadPoolExecutor
- `RulesDetector.set_parallel(bool)` for runtime toggle from Pipeline page
- Pipeline page: "Parallel rule evaluation" toggle in Advanced section
- Config: `pipeline.parallel_batching` in YAML + DB overrides, applied at startup and SIGHUP
- Accelerator factory hook: `extensions.set_accelerator_factory(factory)` lets Pro/Enterprise swap Aho-Corasick for a custom pre-filter engine (e.g., Hyperscan/Vectorscan); graceful fallback if factory raises
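
The pre-filter idea with graceful fallback can be sketched as follows. The rule set, the hand-picked literals, and the `LiteralPrefilter` class are hypothetical; the project extracts literals from the regexes automatically, which this sketch does not attempt.

```python
import re

try:
    import ahocorasick  # provided by the pyahocorasick package
except ImportError:
    ahocorasick = None  # fall back to a plain substring scan


class LiteralPrefilter:
    """Maps required literals to rule names and returns candidate rules for a text."""

    def __init__(self, literal_to_rule):
        self._literals = dict(literal_to_rule)
        self._automaton = None
        if ahocorasick is not None and self._literals:
            automaton = ahocorasick.Automaton()
            for literal, rule in self._literals.items():
                automaton.add_word(literal, rule)
            automaton.make_automaton()
            self._automaton = automaton

    def candidates(self, text):
        if self._automaton is not None:
            # Single pass over the text; iter() yields (end_index, value) pairs.
            return {rule for _, rule in self._automaton.iter(text)}
        # Fallback: check each literal in turn.
        return {rule for lit, rule in self._literals.items() if lit in text}


RULES = {  # hypothetical rules: name -> (required literal, regex)
    "aws_key": ("AKIA", re.compile(r"AKIA[0-9A-Z]{16}")),
    "github_pat": ("ghp_", re.compile(r"ghp_[A-Za-z0-9]{36}")),
}
prefilter = LiteralPrefilter({lit: name for name, (lit, _) in RULES.items()})


def scan(text):
    """Run only the regexes whose literal appeared in the text."""
    return [name for name in sorted(prefilter.candidates(text))
            if RULES[name][1].search(text)]
```

The literal check can only rule regexes out, never in: a candidate still has to pass its full regex, so the pre-filter changes cost but not results.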
0.5.0 (2026-03-22)
Async Proxy (Phase 1)
- New `async_proxy.py`: `aiohttp.web`-based proxy server replacing `ThreadingHTTPServer`
- Non-blocking I/O with coroutine per request instead of thread per request
- CPU-bound scanning runs in thread pool via `asyncio.to_thread()`
- SSE streaming via `StreamResponse` + async iteration
- Built-in connection pooling via `aiohttp.ClientSession` with `TCPConnector`
- Retry on connection errors (`aiohttp.ClientConnectionError`)
- SIGHUP reload via `loop.add_signal_handler()`
- All existing functionality preserved: MCP detection, history stripping, session extraction, response scanning
- `aiohttp>=3.9` added as dependency
- 18 new integration tests, 867 total tests passing
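
To illustrate the split between coroutine-per-request I/O and thread-pool scanning, here is a standard-library-only sketch. The `scan_body` and `handle_request` functions and the pattern are stand-ins, not the proxy's actual handlers.

```python
import asyncio
import re
import threading

SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")  # illustrative pattern only


def scan_body(body: str) -> dict:
    """Stand-in for a CPU-bound scan; records which thread ran it."""
    return {
        "findings": SECRET_RE.findall(body),
        "thread": threading.current_thread().name,
    }


async def handle_request(body: str) -> dict:
    # Offload the blocking scan so the event loop can keep serving other requests.
    return await asyncio.to_thread(scan_body, body)


async def main():
    bodies = ["hello", "key AKIAABCDEFGHIJKLMNOP"]
    return await asyncio.gather(*(handle_request(b) for b in bodies))


results = asyncio.run(main())
```

Each scan runs on a worker thread from the loop's default executor, while the coroutines awaiting them stay suspended without blocking the loop.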
WebSocket Unification (Phase 2)
- WebSocket relay moved from standalone port 8083 into the main proxy on port 8080
- Client connects via `ws://localhost:8080/ws?url=ws://target`
- Uses `aiohttp.web.WebSocketResponse` + `ClientSession.ws_connect()`; no separate `websockets` package needed
- `websockets` dependency removed; `aiohttp` handles both HTTP and WebSocket
- SSRF protection: only `ws://` and `wss://` schemes allowed
- Origin validation configurable via `websocket.allowed_origins`
- SIGHUP reloads scanner config without server restart (same port)
- Dependencies reduced from 3 (`pyyaml`, `aiohttp`, `websockets`) to 2 (`pyyaml`, `aiohttp`)
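
A minimal sketch of the scheme check behind the SSRF protection, applied to the `url` query parameter of the relay endpoint. The `extract_target` helper and its error messages are hypothetical, and the hostname check is an extra assumption beyond what the entry states.

```python
from urllib.parse import parse_qs, urlsplit

ALLOWED_SCHEMES = {"ws", "wss"}


def extract_target(request_url: str) -> str:
    """Return the relay target from a /ws?url=... request, or raise ValueError."""
    query = parse_qs(urlsplit(request_url).query)
    targets = query.get("url", [])
    if len(targets) != 1:
        raise ValueError("exactly one url parameter is required")
    target = urlsplit(targets[0])
    # Reject http://, file://, and anything else that is not a WebSocket URL.
    if target.scheme not in ALLOWED_SCHEMES or not target.hostname:
        raise ValueError("only ws:// and wss:// targets are allowed")
    return targets[0]
```

A scheme allowlist alone does not stop a client from pointing the relay at an internal WebSocket host; it only prevents the relay from being used to reach non-WebSocket URL schemes.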
WebSocket Connection Lifecycle Hooks
- Each WebSocket connection assigned a unique `connection_id` (UUID)
- Extension hook fires on `open`, `finding_detected`, and `close` events
- `ws_connections` SQLite table tracks connection history (target URL, origin, duration, frame counts, findings, close code)
- Default community hook records to analytics store; Pro can override for richer analytics
- Hook calls run in thread pool via `asyncio.to_thread()`, so the event loop is never blocked
- Connection data included in daily retention cleanup
- WebSocket findings now enforced by policy (block closes connection, alert logs + continues)
- REST API: `GET /api/v1/ws/connections`, `GET /api/v1/ws/stats`
Cleanup
- Removed dead `proxy.py` (old ThreadingHTTPServer) and `pool.py` (old connection pool)
- Removed legacy test files (`test_proxy_integration.py`, `test_pool.py`)
- Session tracking tests updated to use `async_proxy` module
Thread Safety (Python 3.13+ free-threaded / no-GIL)
- `ScannerPipeline.scan()` snapshots shared references (`_allowlist`, `_policy`, `_decoder`, `_detectors`) under `_reload_lock` before scanning, which prevents torn reads during `reload()`
- `ScannerPipeline.reload()` swaps references under the same `_reload_lock`
- `AsyncArgusProxy._active_requests` uses `threading.Lock` for atomic increment/decrement
- Safe for `PYTHON_GIL=0` (PEP 703): all shared mutable state is properly synchronized
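
The snapshot-then-scan pattern can be sketched like this. `PipelineSketch` is a hypothetical, much-reduced stand-in for `ScannerPipeline`, with only two of the shared references.

```python
import threading


class PipelineSketch:
    """Hypothetical pipeline showing the snapshot-under-lock pattern."""

    def __init__(self, allowlist, detectors):
        self._reload_lock = threading.Lock()
        self._allowlist = allowlist
        self._detectors = detectors

    def scan(self, text):
        # Copy the references under the lock, then release it before the slow work.
        with self._reload_lock:
            allowlist, detectors = self._allowlist, self._detectors
        # Both locals now come from the same configuration generation.
        return [name for name, fn in detectors if fn(text) and text not in allowlist]

    def reload(self, allowlist, detectors):
        # Swap all references together under the same lock.
        with self._reload_lock:
            self._allowlist = allowlist
            self._detectors = detectors
```

The pattern relies on `reload()` installing new objects rather than mutating the old ones in place; otherwise a scan holding a snapshot could still observe a half-updated configuration.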
0.4.0 (2026-03-20)
Rules Engine
- DB-backed detection rules replace hardcoded Python pattern files
- `rules` table in SQLite: name, pattern, detector, severity, action, enabled, tier, source, description, tags, validator
- CLI: `lumen-argus rules import/export/list/validate` subcommands
- Auto-import 43 community rules on first `serve` (opt out: `--no-default-rules`)
- `RulesDetector`: loads compiled patterns from DB, license-gated for Pro rules
- Validator registry: `luhn`, `ssn_range`, `iban_mod97`, `exclude_private_ips`
- Capture-group-aware matching: `group(1)` preferred over `group(0)`
- Pipeline uses `RulesDetector` when DB has rules, falls back to hardcoded detectors
- YAML `custom_rules:` reconciled to DB on startup/SIGHUP (Kubernetes-style)
- SIGHUP reloads rules from DB via `RulesDetector.reload()`
- Configurable: `rules.auto_import: false` to skip auto-import
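
A sketch of how a validator and capture-group-aware matching could fit together. Only the `luhn` validator name comes from the list above; the `VALIDATORS` dict, the `match_rule` helper, and the example patterns are assumptions.

```python
import re


def luhn(value: str) -> bool:
    """Luhn checksum over the digits in value."""
    digits = [int(c) for c in value if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


VALIDATORS = {"luhn": luhn}  # hypothetical registry with a single entry


def match_rule(pattern, text, validator=None):
    """Return matched values, preferring group(1) when the pattern captures."""
    compiled = re.compile(pattern)
    check = VALIDATORS.get(validator, lambda _: True)
    values = []
    for m in compiled.finditer(text):
        captured = m.group(1) if compiled.groups >= 1 else None
        value = captured if captured is not None else m.group(0)
        if check(value):
            values.append(value)
    return values
```

Preferring the first capture group lets a rule anchor on context such as `card=` while reporting and validating only the sensitive value itself.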
Cross-Request Deduplication
- 3-layer dedup architecture eliminates redundant scanning of conversation history
- Layer 1 (content fingerprinting): per-session SHA-256 hash set skips already-scanned fields before detectors run
- Layer 2 (finding-level TTL cache): session-scoped `(detector, type, matched_value_hash, session_id)` key suppresses duplicate DB writes
- Layer 3 (store-level unique constraint): `content_hash` column with `UNIQUE(content_hash, session_id)` index, `INSERT OR IGNORE`
- Configurable via `dedup:` config section (`conversation_ttl_minutes`, `finding_ttl_minutes`, `max_conversations`, `max_hashes_per_conversation`)
- Background cleanup schedulers for both content fingerprint and finding caches
- All findings remain in `ScanResult` for policy enforcement; dedup only affects DB recording
- Notification dispatcher still receives all findings (has its own cooldown)
- `seen_count` column tracks how many requests included each finding (dashboard shows ×N badge)
- `content_hash` uses `hash(matched_value)`, so different secrets with the same masked preview do not collide
- `bump_seen_counts()` increments existing findings when conversation history is re-sent
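
Layer 3 can be illustrated with an in-memory SQLite table. The schema is reduced to the columns named above, and the `record_finding` helper and the use of plain SHA-256 for `content_hash` are assumptions for this sketch.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE findings (
           id INTEGER PRIMARY KEY,
           session_id TEXT NOT NULL,
           content_hash TEXT NOT NULL,
           seen_count INTEGER NOT NULL DEFAULT 1,
           UNIQUE(content_hash, session_id)
       )"""
)


def record_finding(session_id: str, matched_value: str) -> bool:
    """Insert a finding once per session; bump seen_count on repeats.

    Returns True when a new row was written. Only the hash is stored.
    """
    content_hash = hashlib.sha256(matched_value.encode()).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO findings (session_id, content_hash) VALUES (?, ?)",
        (session_id, content_hash),
    )
    if cur.rowcount == 1:
        return True
    # The unique constraint rejected the insert: count the repeat instead.
    conn.execute(
        "UPDATE findings SET seen_count = seen_count + 1 "
        "WHERE session_id = ? AND content_hash = ?",
        (session_id, content_hash),
    )
    return False
```

Because the constraint lives in the store, it still holds if the in-memory caches of layers 1 and 2 are cold, for example right after a restart.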
Value Hashing
- HMAC-SHA-256 hash of matched secret values stored as `value_hash` in findings DB
- Enables cross-session secret tracking without persisting raw secrets
- Auto-generated 32-byte key at `~/.lumen-argus/hmac.key` (0600 permissions)
- Full 64 hex chars output (256 bits, no truncation)
- Configurable: `analytics.hash_secrets` (default: true)
- Dashboard detail panel shows "Value Hash" field when populated
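
A sketch of keyed value hashing with the standard library. The helper names `load_or_create_key` and `value_hash` are hypothetical, and the demo writes the key to a temporary directory rather than `~/.lumen-argus/hmac.key`.

```python
import hashlib
import hmac
import os
import secrets
import tempfile


def load_or_create_key(path: str) -> bytes:
    """Read the HMAC key, creating a random 32-byte key with 0600 permissions if missing."""
    try:
        # O_EXCL makes creation atomic: it fails if the file already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        with open(path, "rb") as fh:
            return fh.read()
    key = secrets.token_bytes(32)
    with os.fdopen(fd, "wb") as fh:
        fh.write(key)
    return key


def value_hash(key: bytes, matched_value: str) -> str:
    """Keyed hash of a secret: 64 hex characters, no truncation."""
    return hmac.new(key, matched_value.encode("utf-8"), hashlib.sha256).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "hmac.key")
    key = load_or_create_key(key_path)
    same_key = load_or_create_key(key_path)  # second call reads the existing file
    digest = value_hash(key, "example-secret")
```

Using a keyed hash rather than bare SHA-256 means that someone who obtains the findings DB cannot confirm a guessed secret without also holding the key file.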
0.3.0 (2026-03-19)
Session Tracking
- Per-request session context extraction: account_id, session_id, device_id, source_ip, working_directory, git_branch, os_platform, client_name, api_key_hash
- Claude Code metadata.user_id JSON string parsing (account_uuid, device_id, session_id)
- System prompt field extraction for working directory, git branch, and OS platform
- User-Agent parsing for client tool identification
- Derived fingerprint (`fp:<hash>`) fallback when no provider session ID
- 9 session columns in findings DB (no migration, direct schema update)
- `GET /api/v1/sessions` endpoint with grouped finding counts
- `GET /api/v1/findings` supports `session_id` and `account_id` filters
- Dashboard: Session, Account, Device, Branch, Client columns in findings table
- Session filter dropdown with clickable session IDs
- `api_key_hash` excluded from JSONL audit log (stored in analytics DB only)
- `post_scan` hook signature updated with `session=` kwarg (backward-compatible)
0.2.0 (2026-03-19)
Notification Channels
- Notifications page unlocked in community dashboard (freemium: 1 channel any type, unlimited with Pro)
- 7 channel types available: webhook, email, Slack, Teams, PagerDuty, OpsGenie, Jira
- Kubernetes-style YAML reconciliation: YAML is fully authoritative
- Dashboard: CRUD for dashboard-managed channels, read-only for YAML channels with badge
- Source-aware buttons: YAML channels get Toggle + Test, dashboard channels get full CRUD
- Audit trail: `created_by`/`updated_by` fields on all channel operations
- Channel limit enforcement with atomic count + insert under same lock
- Sensitive field masking in API responses (webhook URLs, passwords, API keys)
Observability
- `/health` endpoint: added `uptime` field, extension hook for Pro enrichment
- `/metrics` endpoint: extension hook for Pro to append Prometheus metrics
- OpenTelemetry tracing hook: `set_trace_request_hook()` wraps full request lifecycle
- Span attributes: `provider`, `body.size`, `findings.count`, `action`, `scan.duration_ms`
- All hooks fully guarded: exceptions never break requests
CI/CD
- Docker smoke test in GitHub Actions (build image, verify `/health`)
- Workflow permissions hardened (`contents: read`)
- Concurrency: cancel in-progress runs on new push
- README: CI status badge
Documentation
- MkDocs Material site at lumen-argus.github.io/lumen-argus/docs/
- Landing page at lumen-argus.github.io/lumen-argus/ with comparison table
- GitHub Pages auto-deploy via Actions
Open Source
- LICENSE (MIT, Artem Senenko)
- SECURITY.md (responsible disclosure policy)
- CONTRIBUTING.md (setup, constraints, commit format)
- Issue templates (bug report, feature request, security link)
- PR template (stdlib-only, security checklist)
- Repo topics, description, homepage
0.1.0 (2026-03-17)
Initial release of the lumen-argus Community Edition.
Detection
- 34+ secret detection patterns (AWS, GitHub, Anthropic, OpenAI, Google, Stripe, Slack, JWT, database URLs, PEM keys, generic passwords)
- Shannon entropy analysis (>4.5 bits/char near secret keywords)
- PII detection with validation: email, SSN (range validation), credit card (Luhn), phone, IP (excludes private), IBAN (MOD-97), passport
- Proprietary code detection: file pattern blocklist, keyword detection
- Custom regex rules in config (unlimited, SIGHUP reloadable)
- Duplicate finding deduplication
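
The entropy check can be sketched as below. Only the 4.5 bits/char threshold comes from the list above; the keyword regex and helper names are assumptions, and the project's actual definition of "near" a keyword may differ.

```python
import math
import re
from collections import Counter

# Assumed keyword list and "keyword = value" shape, for illustration only.
KEYWORDS = re.compile(r"(?i)(?:secret|token|password|api[_-]?key)\s*[:=]\s*(\S+)")
THRESHOLD = 4.5  # bits per character


def shannon_entropy(s: str) -> float:
    """Entropy in bits per character of the string's character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_values(text: str):
    """Yield values that follow a secret-like keyword and exceed the threshold."""
    for m in KEYWORDS.finditer(text):
        value = m.group(1).strip("\"'")
        if shannon_entropy(value) > THRESHOLD:
            yield value
```

Since a string with k distinct characters has at most log2(k) bits per character, a value needs at least 23 distinct characters to exceed 4.5, so short or repetitive passwords never trigger this check on their own.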
Proxy
- Transparent HTTP proxy for AI coding tools (Claude Code, Copilot, Cursor)
- Provider auto-detection: Anthropic, OpenAI, Gemini
- SSE streaming passthrough via `read1()`
- Connection pooling with idle timeout
- Backpressure via `max_connections` semaphore
- Graceful drain on shutdown (`drain_timeout`)
- Custom CA bundle support for corporate proxies
- `/health` JSON endpoint
- `/metrics` Prometheus endpoint
Scanning
- `lumen-argus scan` for file and stdin scanning
- `--diff` mode for git pre-commit hooks (staged changes or ref diff)
- `--baseline` / `--create-baseline` for known finding suppression
- Differentiated exit codes (0=clean, 1=block, 2=alert, 3=log)
- JSON output format for CI pipelines
Configuration
- Bundled YAML parser (no PyYAML dependency)
- Global + project-level config with merge semantics
- Hot-reload via SIGHUP with config diff logging
- Allowlists for secrets, PII, and paths (exact + glob)
- Per-detector action overrides
Logging
- Rotating application log (`~/.lumen-argus/logs/lumen-argus.log`)
- Secure file permissions (0o600, atomic creation)
- Startup summary at INFO (version, Python, OS, config, detectors)
- Block/redact actions at INFO, slow scans at WARNING
- `lumen-argus logs export --sanitize` for safe log sharing
- Thread-safe JSONL audit log with retention policy
Extensions
- Plugin system via Python entry points
- Hooks: pre_request, post_scan, evaluate, config_reload, redact
- Public API: `pipeline.reload()`, `registry.set_proxy_server()`
Security
- Localhost-only binding (127.0.0.1, enforced at runtime)
- `matched_value` never written to disk
- Async-signal-safe shutdown handlers
- TLS certificate verification with custom CA support