Ephyr Security Whitepaper #

Secure Infrastructure Access for AI Agents

Version 0.2 | March 2026

Table of Contents #

Executive Summary
Threat Landscape: AI Agents and Infrastructure
Architecture Deep Dive
Trust Model
- 4.1 Three-Tier Signing Hierarchy
- 4.2 Key Custody and Lifecycle
- 4.3 Broker Compromise Containment
- 4.4 Key Rotation
Authentication
- 5.1 Layer 1: SO_PEERCRED (Kernel-Verified Identity)
- 5.2 Layer 2: Session Tokens
- 5.3 Layer 3: Dashboard Token
- 5.4 Layer 4: MCP API Keys (bcrypt)
- 5.5 Layer 5: SSH Certificates
- 5.6 Auth Cache
Authorization
- 6.1 Policy Engine
- 6.2 Eight-Step Evaluation Pipeline
- 6.3 RBAC Model
- 6.4 Capability Envelopes
- 6.5 Wildcard Resolution
Token Architecture
- 7.1 CTT-E Format
- 7.2 Delegation Certificates
- 7.3 Validation Chain
- 7.4 Identity URN Scheme
- 7.5 ULID Task Identifiers
Revocation
- 8.1 Epoch Watermark Model
- 8.2 Lineage-Walk Validation
- 8.3 Cascading Revocation
- 8.4 Background Garbage Collection
- 8.5 Comparison with Traditional Approaches
Audit and Correlation
- 9.1 Task-Scoped Audit
- 9.2 Lineage Tracking
- 9.3 Structured Logging
- 9.4 Real-Time Event Hub
- 9.5 Activity Ring Buffer
Cryptographic Choices
- 10.1 Ed25519 for All Signing Operations
- 10.2 bcrypt for API Key Storage
- 10.3 SHA-256 for Auth Cache Keys
- 10.4 ULID for Task Identifiers
- 10.5 Random Value Generation
Network Security
- 11.1 UID-Based nftables Isolation
- 11.2 Credential Injection Model
- 11.3 CIDR Allow/Deny Policy
- 11.4 DNS Resolution Security
Hardening Guide
- 12.1 Systemd Sandboxing Directives
- 12.2 Production Deployment Recommendations
- 12.3 Target Host Hardening
- 12.4 Monitoring and Alerting
Comparison with Existing Solutions
- 13.1 SPIFFE/SPIRE
- 13.2 HashiCorp Vault SSH Secrets Engine
- 13.3 Traditional PAM and SSH Key Management
- 13.4 Summary Matrix
Known Limitations and Future Work
- 14.1 Current Limitations
- 14.2 Planned Improvements
- 14.3 Research Directions
Appendix A: Trust Boundary Diagram
Appendix B: Threat Enumeration Summary
Appendix C: Dependency Analysis

1. Executive Summary #

Ephyr is an open-source, Go-based privileged access broker purpose-built for AI agent infrastructure. It replaces static SSH keys and long-lived API tokens with ephemeral, policy-governed credentials – enabling organizations to grant infrastructure access to autonomous AI agents without expanding their permanent attack surface.

The Problem #

AI agents – LLM-driven automation systems – are increasingly deployed to manage infrastructure: running diagnostics, restarting services, querying databases, and orchestrating multi-step operations. Traditional access models fail for these workloads:

Static SSH keys grant persistent access with no expiration, no audit trail tied to the task that initiated the operation, and no mechanism to scope access to the current operation’s requirements.
Long-lived API tokens are bearer credentials that, once leaked, grant full access until manually rotated. Agent environments – where prompts may be manipulated and tool outputs may be observed by untrusted parties – make credential leakage a primary concern.
Human-centric IAM assumes interactive approval workflows, password challenges, and session durations measured in hours. Agent operations require sub-second credential issuance, task-scoped lifetimes measured in minutes, and fully automated policy evaluation.

The Solution #

Ephyr provides a single MCP (Model Context Protocol) endpoint through which agents request access. Every operation flows through an eight-step authorization pipeline, results in an ephemeral credential with a default five-minute lifetime, and is recorded in a structured audit log correlated to the specific task that initiated it. Agents never handle infrastructure credentials directly – the broker generates keypairs, obtains signed certificates, establishes connections, and injects credentials on the agent’s behalf.

Security Value Proposition #

Property	Mechanism
No persistent secrets	Ephemeral Ed25519 keypairs generated per-request, never written to disk
Bounded blast radius	5-minute default certificate TTL; delegation certs limit broker authority to 1 hour
Defense in depth	5 authentication layers, 3-tier trust hierarchy, process-level isolation
Unforgeable identity	SO_PEERCRED kernel verification for co-located agents
Zero credential exposure	HTTP proxy injects credentials invisibly; agents cannot extract tokens
Task-scoped audit	Every operation tied to a ULID-identified task with hierarchical lineage
Minimal dependencies	22,832 lines of Go; 3 external libraries (gorilla/websocket, x/crypto, yaml.v3)

Intended Audience #

This whitepaper is written for security engineers, infrastructure architects, and compliance teams evaluating Ephyr for production deployment. It provides a thorough description of the security architecture, cryptographic decisions, threat model, and known limitations – sufficient for an informed risk assessment.

2. Threat Landscape: AI Agents and Infrastructure #

2.1 The Agent Identity Problem #

Human users have well-established identity frameworks: SSO providers, MFA devices, biometric authentication, and session management tied to browser cookies or desktop credentials. AI agents operate outside these frameworks. An agent is a process – potentially running in a container, a VM, or as a subprocess of another tool – with no inherent identity beyond its OS-level UID and the API keys it was provisioned with.

This creates several security challenges unique to agent workloads:

Identity bootstrapping. How does an agent prove it is the entity it claims to be? Unlike a human who can enter a password or present a hardware token, an agent’s identity must be established through some combination of its execution context (where it runs, as which user) and provisioned credentials.

Credential scope. A human administrator who SSH-keys into a server has a mental model of what they intend to do and (ideally) stops there. An agent executing a multi-step plan may request access to resources it does not actually need for the current task, particularly if the plan was generated by a language model susceptible to prompt injection.

Audit attribution. When three agents access the same server in the same minute, traditional syslog entries that record only the SSH principal and source IP are insufficient. Security teams need to trace each command back to the specific task, agent, and triggering event.

Credential lifecycle. Agents operate continuously. A credential issued to an agent at 2 AM may still be valid at 2 PM, long after the task that required it has completed. Traditional session timeouts assume human interaction patterns; agent operations are bursty and task-scoped.

2.2 The Blast Radius Problem #

When a human account is compromised, the blast radius is bounded by that human’s privileges. When an agent account is compromised, the blast radius is the union of all resources the agent could ever access – because agents are typically provisioned with broad permissions to handle diverse tasks.

Consider a monitoring agent with SSH access to twenty servers. If its credential is compromised, the attacker has immediate access to all twenty servers for the full lifetime of the credential. With a static SSH key, that lifetime is indefinite.

Ephyr addresses this by making every credential:

Ephemeral – default 5-minute lifetime, maximum 24-hour hard cap
Task-scoped – the capability envelope constrains which targets, roles, services, and HTTP methods a specific task token can authorize
Auditable – each credential is tied to a ULID-identified task with a hierarchical lineage chain

2.3 The Credential Management Problem #

Deploying agents at scale requires distributing credentials to each agent instance. Every distribution point is an exfiltration opportunity. API keys in environment variables, SSH keys in mounted volumes, and tokens in configuration files create a sprawling credential surface.

Ephyr’s design principle is that agents should never hold infrastructure credentials. The broker holds all credentials (SSH CA key, API tokens, service passwords) in a single, hardened location. Agents interact exclusively through the broker’s MCP endpoint, receiving only the results of operations – never the credentials that enabled them.

2.4 The Prompt Injection Threat #

AI agents are uniquely susceptible to prompt injection: malicious content in tool outputs or user messages can manipulate the agent’s behavior. An agent that holds credentials and can construct arbitrary SSH commands or HTTP requests is a powerful target for this attack.

Ephyr mitigates prompt injection impact through:

Capability envelopes that upper-bound what any task token can authorize, regardless of what the agent requests
Policy-engine validation that rejects requests outside the agent’s configured permissions
HTTP proxy credential injection that prevents agents from extracting or redirecting credentials to unintended destinations
CIDR-based network policy that restricts proxy destinations to approved address ranges

3. Architecture Deep Dive #

3.1 Three-Process Model #

Ephyr implements strict privilege separation through three independent OS-level processes, each running with the minimum privileges required for its function:

                    +-------------------------------------------------+
                    |            EPHYR HOST (VM/LXC)                 |
                    |                                                 |
  +-----------+     |  +----------+   AF_UNIX    +----------------+  |
  |           |     |  |          |<============>|                |  |
  | Agent CLI |=====|=>| Signer   | SO_PEERCRED  |    Broker      |  |
  | (UID 1000)|     |  | (UID 999)| (UID check)  |   (UID 999)   |  |
  |           |     |  |          |              |                |  |
  +-----------+     |  | CA Key   |              | Policy Engine  |  |
       ||           |  | AF_UNIX  |              | Session Mgr    |  |
       ||           |  | only     |              | MCP Server     |  |
       ||           |  +----------+              | HTTP Proxy     |  |
       ||           |                            | Dashboard      |  |
  nftables          |                            | Audit Logger   |  |
  BLOCKS            |                            | Event Hub      |  |
  direct            |                            +----+---+---+---+  |
  access            |                                 |   |   |      |
                    +---------------------------------|---|---|-------+
                                                      |   |   |
                                            SSH certs |   |   | MCP
                                           (ephemeral)|   |   | federation
                                                      v   |   v
                                                  +------+ | +--------+
                                                  |Target| | |Remote  |
                                                  |Hosts | | |MCP     |
                                                  +------+ | |Servers |
                                                           | +--------+
                                                           v
                                                     +---------+
                                                     | HTTP    |
                                                     | Services|
                                                     +---------+

Why three processes? The signer holds the CA private key – the single most valuable secret in the system. By isolating it in a dedicated process with no network access, Ephyr ensures that a broker compromise cannot extract the CA key. The signer validates every IPC connection via SO_PEERCRED, accepting only connections from UID 999 (the broker). Even if an attacker achieves code execution within the broker process, they cannot open a TCP socket to exfiltrate the key (the signer has no network) and they cannot impersonate the broker UID to extract the key via IPC (the kernel enforces UID verification).

Why not a single process with separate goroutines? Goroutine isolation is not a security boundary. A compromised goroutine within the same process can read any memory in the process address space, including the CA private key. OS-level process isolation, combined with systemd sandboxing and network restrictions, provides defense-in-depth that in-process isolation cannot.

3.2 Unix Socket IPC and Peer Credential Verification #

All inter-process communication uses Unix domain sockets with SO_PEERCRED verification. This is a Linux kernel mechanism that provides unforgeable caller identity:

  Agent Process (UID 1000)
       |
       | connect(/run/ephyr/broker.sock)
       |
       v
  Kernel: getsockopt(SO_PEERCRED)
       |
       | Returns: { uid: 1000, gid: 1000, pid: 42381 }
       |           (populated from kernel process table)
       v
  Broker Process
       |
       | Maps UID 1000 -> "claude" agent in policy.yaml
       | Proceeds with request evaluation

Properties of SO_PEERCRED:

Unforgeable from userspace. The UID, GID, and PID are read directly from the kernel process table. A process cannot claim a different UID without actually running as that UID (which requires root or CAP_SETUID).
Verified on every connection. Each new Unix socket connection triggers a fresh SO_PEERCRED extraction. There is no session token to steal or replay at this layer.
Zero configuration. No secrets, keys, or passwords need to be distributed to agents for local authentication. The agent’s OS UID is its identity.

The signer applies the same mechanism with a stricter policy: it extracts SO_PEERCRED from every connection and rejects any UID that does not match the EPHYR_BROKER_UID environment variable (999 in production). This means even root-privilege agents on the same host cannot directly communicate with the signer – only the broker can.

IPC Protocol. The signer uses a one-shot, newline-delimited JSON protocol. Each connection handles exactly one request-response exchange, then closes. This eliminates state management complexity and prevents connection pooling attacks. The protocol is deliberately simpler than JSON-RPC 2.0 – there is no multiplexing, no streaming, and no session state.

// Request (broker -> signer)
{"action":"sign","public_key":"ssh-ed25519 AAAA...","principals":["agent-read"],
 "duration":"5m","key_id":"claude@webserver/read","force_command":""}

// Response (signer -> broker)
{"certificate":"<base64>","serial":"a1b2c3d4e5f60718",
 "expires_at":"2026-03-10T14:35:00Z"}

Four actions are supported:

sign – sign an SSH user certificate
sign_delegation – sign a delegation certificate for broker token authority
root_public_key – return the signer’s Ed25519 public key for pinning
ping – health check

3.3 Systemd Sandboxing #

Both the signer and broker run as systemd services with extensive security directives. The signer, as the most security-critical component, receives the strictest sandbox.

Signer Sandbox (systemd-analyze security score: 1.9/10):

Directive	Security Effect
`ProtectSystem=strict`	Entire filesystem is read-only except explicitly allowed paths
`MemoryDenyWriteExecute=yes`	Prevents mapping memory as both writable and executable; blocks JIT compilation, shellcode injection, and most memory corruption exploits
`RestrictAddressFamilies=AF_UNIX`	Kernel-level prohibition on creating TCP, UDP, or raw sockets. The signer process cannot communicate over the network under any circumstances.
`CapabilityBoundingSet=` (empty)	All Linux capabilities dropped. Process cannot perform any privileged operations.
`NoNewPrivileges=yes`	Cannot escalate privileges via setuid/setgid binaries
`SystemCallFilter=@system-service`	Allowlisted system calls only; disallowed syscalls return EPERM
`ProtectHome=yes`	Home directories are invisible to the process
`PrivateTmp=yes`	Isolated /tmp namespace
`PrivateDevices=yes`	No access to /dev (no device files)
`ProtectKernelTunables=yes`	/proc/sys and /sys are read-only
`ProtectKernelModules=yes`	Cannot load kernel modules
`ReadOnlyPaths=/etc/ephyr`	CA key directory is read-only (key loaded at startup)
`ReadWritePaths=/run/ephyr`	Only the socket directory is writable
`RestrictNamespaces=yes`	Cannot create new namespaces (no container escape)
`LockPersonality=yes`	Cannot change execution domain
`RestrictRealtime=yes`	Cannot acquire realtime scheduling priority
`ProtectHostname=yes`	Cannot change hostname
`RestrictSUIDSGID=yes`	Cannot set SUID/SGID bits on files

The practical effect: even if an attacker achieves code execution within the signer process, they cannot:

Open any network connection (AF_UNIX restriction)
Write to any file outside /run/ephyr (ProtectSystem=strict)
Execute injected shellcode (MemoryDenyWriteExecute)
Escalate to root (NoNewPrivileges, empty CapabilityBoundingSet)
Load a kernel module (ProtectKernelModules)
Create a container to escape (RestrictNamespaces)

Broker Sandbox:

The broker has a slightly relaxed sandbox to allow TCP listeners (dashboard and MCP ports) and write access to logs and configuration:

Directive	Value
`RestrictAddressFamilies`	`AF_UNIX AF_INET AF_INET6` (Unix + TCP)
`ReadWritePaths`	`/run/ephyr /var/log/ephyr /var/lib/ephyr`
`CapabilityBoundingSet`	Empty (no capabilities)

All other hardening directives match the signer. The broker can listen on TCP ports but cannot perform privileged operations, access home directories, or modify the system filesystem.

3.4 Network Isolation via nftables #

When agents run co-located on the broker host (the recommended deployment model), nftables UID-based rules prevent agents from bypassing the broker:

                   +-------------------------------------+
                   |           EPHYR HOST               |
                   |                                     |
  Agent (UID 1000) |  nftables OUTPUT chain:             |
  can ONLY reach:  |    uid 1000 -> REJECT               |
    - broker.sock  |    (for all TCP/UDP to backend IPs) |
    - localhost     |                                     |
                   |  Broker (UID 999)                   |
                   |    -> unrestricted outbound         |
                   +-------------------------------------+

The nftables rules operate at Layer 3/4, matching the source UID of outbound packets. An agent process running as UID 1000 is blocked from establishing TCP connections to any backend IP range (RFC 1918 or configured targets). All infrastructure access must flow through the broker’s Unix socket, where policy evaluation, credential injection, and audit logging occur.

This is a defense-in-depth measure. Even if an agent discovers a backend service’s IP address and port, it cannot connect directly. The broker is the sole network gateway for agent operations.

4. Trust Model #

4.1 Three-Tier Signing Hierarchy #

Ephyr implements a three-tier signing hierarchy that bounds the authority of each layer:

  +-------------------+
  |  Root CA (Signer) |    Tier 0: Long-lived Ed25519 key
  |  /etc/ephyr/     |    Scope: Can sign anything (SSH certs + delegation certs)
  |  ca_key            |    Exposure: Zero network, UID-restricted IPC, systemd sandbox
  +--------+----------+
           |
           | signs (infrequent IPC, ~1x per hour)
           v
  +-------------------+
  | Delegation Cert   |    Tier 1: Broker's ephemeral Ed25519 key + signed cert
  | (Broker memory)   |    Scope: Can sign CTT-E tokens within envelope bounds
  +--------+----------+    Lifetime: 1 hour (configurable)
           |
           | signs (no IPC needed, local Ed25519)
           v
  +-------------------+
  | CTT-E Task Token  |    Tier 2: JWT with EdDSA signature
  | (Agent memory)    |    Scope: Single task, bounded by capability envelope
  +-------------------+    Lifetime: Up to 1 hour (configurable, max of delegation TTL)

Why three tiers? The root CA key is the most sensitive secret. If the broker had to contact the signer for every task token, the IPC channel would be a high-frequency target. By introducing a delegation certificate, the broker can sign task tokens locally using an ephemeral key that is itself signed by the root CA. This reduces IPC to approximately once per hour (delegation rotation), while the signer validates every delegation request against its peer credential check.

4.2 Key Custody and Lifecycle #

Root CA Key:

Generated once via ssh-keygen -t ed25519
Stored at /etc/ephyr/ca_key with permissions 0600
Read by the signer at process startup; never re-read
The signer validates file permissions at load time (rejects anything other than 0600)
Backed up offline and encrypted; loss requires re-provisioning all target hosts with a new CA public key

Delegation Key:

Generated by the broker using crypto/ed25519.GenerateKey with crypto/rand.Reader as entropy source
Exists only in broker process memory; never written to disk
Replaced every 50 minutes (default refreshAt); the old key is retained as prevKey for graceful rollover until tokens signed with it expire

Task Token Keys:

CTT-E tokens are signed with the current delegation key
No additional key generation per token – the delegation key signs all tokens during its lifetime

4.3 Broker Compromise Containment #

A critical design goal: broker compromise must not yield the CA key and must be bounded in scope.

What an attacker gains with broker code execution:

The ability to request SSH certificates from the signer, but only within the constraints of the policy engine (the signer enforces maximum TTL of 24 hours independently)
Access to plaintext service credentials in /var/lib/ephyr/services.json and federated MCP credentials in /var/lib/ephyr/remotes.json
The ability to sign CTT-E task tokens using the current delegation key
Read/write access to the audit log (within the broker’s ReadWritePaths)
The ability to toggle hosts, services, and remotes on/off

What an attacker does NOT gain:

The CA private key (isolated in the signer process, UID-restricted IPC)
The ability to extend delegation certificate lifetime beyond the current cert’s expiry (at most ~1 hour)
Persistence – once the broker process is restarted, the delegation key rotates and the attacker’s key is invalidated
The ability to modify the policy file (/etc/ephyr/policy.yaml is read-only to the broker process via ProtectSystem=strict)

Temporal bound: A broker compromise is bounded by the delegation certificate’s remaining TTL. After rotation, the old delegation key is discarded and any tokens signed with it become unverifiable once the cert expires. The maximum window is the delegation TTL (default 1 hour).

4.4 Key Rotation #

Delegation Certificate Rotation:

The DelegationManager implements automatic key rotation:

At startup, the broker generates a fresh Ed25519 keypair and requests a delegation certificate from the signer via IPC
A background goroutine triggers rotation at the refreshAt interval (default 50 minutes, configurable)
On rotation:
- New Ed25519 keypair generated
- New delegation certificate requested from signer
- Previous key moved to prevKey for graceful rollover
- Token issuer updated with new signing key
- Metrics incremented (DelegationRotations)
If rotation fails (signer unreachable), the broker continues using the existing key and retries at the next interval

Root CA Key Rotation (manual):

Root key rotation is a manual operational procedure:

Generate a new Ed25519 CA key
Deploy the new CA public key to all target hosts’ TrustedUserCAKeys alongside the old key (both are trusted during the transition)
Restart the signer with the new CA key
After all outstanding certificates signed by the old key have expired, remove the old CA public key from target hosts

5. Authentication #

Ephyr implements five independent authentication layers, each providing a different guarantee. No single layer’s compromise grants full system access.

  +-----------------------------------------------------------------+
  | Layer 5: SSH Certificates (Ed25519, ephemeral, 5-min TTL)       |
  |   Authenticates broker to target hosts                          |
  +-----------------------------------------------------------------+
  | Layer 4: MCP API Keys (bcrypt + auth cache)                     |
  |   Authenticates remote agents to broker                         |
  +-----------------------------------------------------------------+
  | Layer 3: Dashboard Token (constant-time comparison)             |
  |   Authenticates human operators to dashboard                    |
  +-----------------------------------------------------------------+
  | Layer 2: Session Tokens (256-bit, crypto/rand)                  |
  |   Authenticates agent sessions after UID verification           |
  +-----------------------------------------------------------------+
  | Layer 1: SO_PEERCRED (kernel-verified UID)                      |
  |   Authenticates co-located agents at the OS level               |
  +-----------------------------------------------------------------+

5.1 Layer 1: SO_PEERCRED (Kernel-Verified Identity) #

The primary authentication mechanism for co-located agents. When an agent connects to /run/ephyr/broker.sock, the broker extracts the caller’s credentials using the getsockopt(SO_PEERCRED) system call:

func GetPeerCred(conn net.Conn) (uid uint32, pid int32, err error) {
    unixConn := conn.(*net.UnixConn)
    raw, _ := unixConn.SyscallConn()
    var cred *syscall.Ucred
    raw.Control(func(fd uintptr) {
        cred, _ = syscall.GetsockoptUcred(int(fd),
            syscall.SOL_SOCKET, syscall.SO_PEERCRED)
    })
    return cred.Uid, cred.Pid, nil
}

The kernel populates the Ucred structure from its process table. This cannot be spoofed from userspace – it requires CAP_SETUID (which the agent does not have, since NoNewPrivileges=yes is set on all systemd units) or a kernel exploit.

The extracted UID is mapped to an agent identity in policy.yaml. If no agent is registered for the UID, the connection is rejected before any further processing.

Socket permissions: /run/ephyr/broker.sock is created with mode 0660 and group ephyr-agents. Only processes in that group can connect.

5.2 Layer 2: Session Tokens #

After UID verification, agents establish a session by calling POST /v1/session. The broker generates a 256-bit token using crypto/rand.Read:

func generateToken() (string, error) {
    b := make([]byte, 32)     // 256 bits
    _, err := rand.Read(b)    // CSPRNG
    return hex.EncodeToString(b), err  // 64 hex characters
}

Session properties:

One active session per agent (creating a new session invalidates the previous token)
Cross-checked against SO_PEERCRED: the session UID must match the connection UID on every request
No session expiry (sessions live until agent creates a new one or the broker restarts)
Token masked in API responses: [:8]...[-8:]

Security rationale: Sessions provide a second factor beyond UID verification. If an attacker gains access to the Unix socket but not the agent’s session token, they can see that an agent exists but cannot execute operations. Combined with SO_PEERCRED, an attacker would need both the correct UID and a valid session token.

5.3 Layer 3: Dashboard Token #

The dashboard at port 8553 is protected by a static token compared using crypto/subtle.ConstantTimeCompare:

if subtle.ConstantTimeCompare([]byte(provided), []byte(expected)) != 1 {
    // reject
}

Properties:

Constant-time comparison prevents timing-based token extraction
CORS restricted to same-origin
Static assets (CSS, JS) are exempt from authentication
Token masked in audit logs: first4...last4

Dashboard privacy features:

SensitiveField components auto-hide content after 5 seconds
CanvasSecret renders tokens on HTML canvas (not text-selectable, not present in DOM)
Page content blurs on tab switch (prevents over-the-shoulder observation)
Print CSS hides all sensitive content

5.4 Layer 4: MCP API Keys (bcrypt) #

Remote agents connecting to the MCP server at port 8554 authenticate via the X-API-Key HTTP header. Keys are stored as bcrypt hashes in policy.yaml:

agents:
  claude:
    uid: 1000
    api_key_hash: "$2a$10$..."  # bcrypt hash

bcrypt properties relevant to Ephyr:

Cost factor 10 (default) – each comparison takes approximately 100ms on modern hardware
Inherently constant-time per comparison (no timing side-channel on hash match vs. mismatch)
Salt embedded in the hash – no separate salt storage required
72-byte input limit (Go’s bcrypt implementation truncates at 72 bytes)

The MCPAuthenticator iterates through all registered agents on each authentication attempt, comparing the provided key against each agent’s bcrypt hash. This is O(n) in the number of agents, with each comparison taking ~100ms. For typical deployments (fewer than 10 agents), this is acceptable. For larger deployments, the auth cache (Section 5.6) mitigates the cost.

5.5 Layer 5: SSH Certificates #

SSH certificates authenticate the broker to target hosts. For each command execution:

Broker generates an ephemeral Ed25519 keypair in memory
Broker requests a signed certificate from the signer, specifying:
- Principals (the SSH user, e.g., agent-read)
- Duration (default 5 minutes, clamped by policy)
- Key ID format: ephyr:{agent}@{target}/{role}:{serial_hex}
Signer validates the request, applies its own duration cap (24-hour hard maximum), and returns the signed certificate
Broker establishes the SSH connection using the ephemeral keypair and certificate
After command execution, the keypair and certificate are discarded

Certificate properties:

Serial: Cryptographically random 8-byte uint64 (crypto/rand)
Clock skew grace: 30 seconds before now
Extensions: permit-pty, permit-port-forwarding, permit-agent-forwarding
Critical options: force-command when target policy specifies one

Target-side validation: Target hosts are configured with TrustedUserCAKeys pointing to the Ephyr CA public key and AuthorizedPrincipalsFile mapping each role account to its principal. The certificate’s principal (e.g., agent-read) must match a line in the target user’s principals file.

5.6 Auth Cache #

To avoid repeated bcrypt comparisons for the same API key, the MCPAuthenticator maintains an in-memory cache keyed on SHA-256(apiKey):

  API Key arrives
       |
       v
  SHA-256(key) -> cache lookup
       |
  +----+----+
  |         |
  HIT     MISS
  |         |
  return   bcrypt compare against all agents
  cached     |
  agent    match found?
             |
          +--+--+
          |     |
         YES    NO
          |     |
       cache  reject
       result

Cache properties:

Key: SHA-256(apiKey) – the raw API key is never stored in the cache
Value: *MCPAgent struct + expiry timestamp
TTL: 60 seconds (configurable via EPHYR_AUTH_CACHE_TTL)
Invalidation: entire cache cleared on agent add/remove
Thread-safe: sync.RWMutex with separate lock from agent registry
Observable: CacheStats() returns hit/miss counters

Security considerations:

SHA-256 is used only as a cache key, not for authentication. The security guarantee comes from bcrypt – the cache merely avoids repeating the expensive bcrypt comparison.
Cache TTL bounds the window during which a revoked key remains valid (maximum 60 seconds by default).
Setting TTL to 0 disables caching entirely.

6. Authorization #

6.1 Policy Engine #

Authorization is governed by a declarative YAML policy file at /etc/ephyr/policy.yaml. The policy defines four sections:

global:
  max_active_certs: 10
  default_ttl: "5m"
  max_ttl: "30m"
  rate_limit:
    requests_per_window: 10
    window_seconds: 60

agents:
  claude:
    uid: 1000
    max_concurrent_certs: 3
    api_key_hash: "$2a$10$..."
    inherits: [monitoring]
    ssh:
      dockerhost:
        roles: [read, operator]
        auto_approve: true
    services:
      "*":
        methods: [GET, POST]
    dashboard: viewer

templates:
  monitoring:
    ssh:
      "*":
        roles: [read]
    services:
      grafana:
        methods: [GET]

roles:
  read:
    principal: "agent-read"
  operator:
    principal: "agent-op"
  admin:
    principal: "agent-admin"

targets:
  dockerhost:
    host: "192.168.100.100"
    port: 22
    allowed_roles: [read, operator, admin]
    max_ttl: "30m"
    auto_approve: true

Policy is hot-reloaded via SIGHUP (systemctl reload ephyr-broker) without service restart. Invalid configurations are rejected, and the broker continues using the previous valid policy.

6.2 Eight-Step Evaluation Pipeline #

Every SSH certificate request passes through eight sequential checks. Failure at any step short-circuits the pipeline:

  Certificate Request
       |
  [1] Agent exists by UID? -------> DENY: "unknown agent UID"
       |
  [2] Target exists in policy? ---> DENY: "unknown target"
       |
  [3] Role in target's            > DENY: "role not allowed on target"
      allowed_roles?
       |
  [4] Duration clamped to
      min(requested,
          target.max_ttl,
          global.max_ttl)  -------> (silent clamp, no denial)
       |
  [5] Agent active certs           > DENY: "at concurrent cert limit"
      < max_concurrent_certs?
       |
  [6] Duplicate cert for same
      agent+target+role? ----------> (auto-revoke old cert)
       |
  [7] Global active certs         > DENY: "global limit reached"
      < max_active_certs?
       |
  [8] Auto-approve check ---------> APPROVE or PENDING

Design notes:

Step 4 silently clamps duration rather than rejecting. This prevents agents from being denied access due to requesting a longer-than-allowed TTL – the policy simply enforces its maximum.
Step 6 auto-revokes duplicate certificates. Agents retry on failure, and stale certificates consuming concurrency slots would create deadlocks. The deduplication key is {agentUID}:{target}:{role}.
Step 5 runs before Step 6. Expired certificates are purged before the pipeline begins (CleanExpired()), so stale entries do not affect concurrency counts.

6.3 RBAC Model #

Ephyr v0.2 introduces role-based access control (RBAC) with per-agent permissions across three proxy paths:

SSH Access: Per-agent, per-target role restrictions. An agent can be allowed read on all targets but operator only on specific targets.

Service Access: Per-agent HTTP method restrictions per service. An agent can be allowed GET on all services but POST only on specific services.

Remote Access: Per-agent tool restrictions per federated MCP server. An agent can be allowed all tools on one remote but only specific tools on another.

Dashboard Access: Four levels: none, viewer, operator, admin. Controls which dashboard operations an agent’s API key can perform.

Template Inheritance: Agents can inherit from templates via the inherits field. Templates are merged left-to-right (first template wins per key), then agent-specific settings overlay templates. Agent-level settings always override inherited templates.

Legacy Mode: Agents with no RBAC fields configured receive LegacyMode=true, which grants full access to all targets, services, and remotes. This ensures backward compatibility with existing deployments.

Resolution algorithm:

  1. Check if agent has ANY RBAC fields (ssh, services, remotes,
     dashboard, inherits)
  2. If no RBAC fields: LegacyMode=true (full access)
  3. If RBAC fields present:
     a. Merge templates left-to-right (first-wins per key)
     b. Overlay agent-specific settings (always wins)
     c. Intersect SSH roles with target's allowed_roles
        (agent cannot exceed target's role list)

6.4 Capability Envelopes #

Task tokens (CTT-E) carry a capability envelope that defines the upper bound of what the token can authorize:

{
  "envelope": {
    "targets":  ["dockerhost", "hugoblog"],
    "roles":    ["read", "operator"],
    "services": ["grafana", "gitea"],
    "remotes":  ["demo-tools"],
    "methods":  ["GET", "POST"]
  }
}

The envelope is computed from the agent’s RBAC permissions at token issuance time. Every brokered request must satisfy both the capability envelope AND the policy engine – the envelope is a pre-check that runs before full policy evaluation.

Subset enforcement: When delegation tokens (CTT-D, planned for Phase 2b) are introduced, child tokens must have envelopes that are strict subsets of their parent’s envelope. The IsSubsetOf() method validates this at issuance time, ensuring that delegation never amplifies privileges.

6.5 Wildcard Resolution #

Policy supports the "*" wildcard for targets, services, and remotes. However, tokens never contain wildcards. At token issuance time, wildcards are resolved to explicit literal arrays:

  Policy: agent.ssh = { "*": { roles: [read] } }
  Targets in policy: [dockerhost, hugoblog, mandrake-rack]

  Resolved envelope.targets = ["dockerhost", "hugoblog", "mandrake-rack"]

This ensures that tokens are self-describing – a security reviewer can inspect a token and see exactly which resources it authorizes, without needing to resolve wildcards against the current policy.

7. Token Architecture #

7.1 CTT-E Format #

CTT-E (Ephyr Task Token - Execution) is a compact JWT with EdDSA (Ed25519) signatures:

  Header.Payload.Signature

  Header (base64url):
  {
    "alg": "EdDSA",        // Ed25519 signature algorithm
    "typ": "CTT-E",        // Token type identifier
    "kid": "<deleg-cert-id>" // Links to delegation cert for verification
  }

  Payload (base64url):
  {
    "iss": "ephyr:<broker-id>",     // Issuer (broker instance)
    "sub": "claude",                  // Subject (agent name)
    "aud": "ephyr-broker",           // Audience
    "iat": 1741868400,                // Issued At (Unix timestamp)
    "exp": 1741870200,                // Expires At (Unix timestamp)
    "jti": "cte_01JQXYZ...",          // Token ID (ULID-based)

    "task": {
      "id":           "01JQXYZ...",   // ULID
      "root_id":      "01JQXYZ...",   // Root task ULID
      "parent_id":    "",              // Empty for root tasks
      "depth":        0,               // Nesting depth
      "lineage":      ["01JQXYZ..."], // Full ancestor chain
      "initiated_by": "ephyr:apikey:ak_claud",
      "description":  "Check disk usage on dockerhost"
    },

    "envelope": {
      "targets":  ["dockerhost"],
      "roles":    ["read"],
      "services": ["grafana"],
      "remotes":  [],
      "methods":  ["GET"]
    }
  }

  Signature: Ed25519(base64url(header) + "." + base64url(payload))

Design decisions:

EdDSA over RS256/ES256: Ed25519 provides 128-bit security with 32-byte keys and 64-byte signatures. Signing and verification are fast (tens of microseconds) and constant-time. No nonce generation required (unlike ECDSA, where nonce reuse is catastrophic).
ULID-based JTI: Token IDs use the cte_ prefix plus a ULID, providing lexicographic sortability by creation time and cryptographic randomness.
Explicit timestamps: iat and exp use Unix timestamps (integer seconds) for unambiguous time comparison across systems.

7.2 Delegation Certificates #

Delegation certificates authorize the broker to sign CTT-E tokens. They are not JWTs – they use a simpler canonical JSON format signed by the root CA:

  Payload (deterministic JSON with sorted keys):
  {
    "broker_id":  "broker-01",
    "cert_id":    "a1b2c3d4...",
    "expires_at": 1741872000,
    "issued_at":  1741868400,
    "public_key": "<base64 Ed25519 public key bytes>"
  }

  Signature: Ed25519(canonical_json_bytes)

Deterministic serialization: The payload is serialized with json.Marshal over a map[string]interface{} with sorted keys. This ensures that signature verification produces the same byte sequence regardless of the platform.

Lifecycle:

Broker generates ephemeral Ed25519 keypair
Broker sends public key to signer via sign_delegation IPC
Signer validates broker UID (SO_PEERCRED), applies TTL cap (maximum 24 hours), signs canonical payload
Broker stores cert and private key in memory
Token issuer updated to use new signing key
Rotation occurs every 50 minutes (default)

7.3 Validation Chain #

Every CTT-E token is validated through an eight-step chain:

  CTT-E Token String
       |
  [1] Parse JWT (split on dots, base64url decode)
       |
  [2] Extract kid from header
      Look up DelegationCert by kid
       |
  [3] Verify delegation cert signature
      against pinned root public key -----> REJECT: "delegation cert
       |                                     verification failed"
  [4] Verify delegation cert not expired -> REJECT: "delegation cert
       |                                     expired"
  [5] Verify CTT-E signature against
      delegated public key ---------------> REJECT: "invalid token
       |                                     signature"
  [6] Verify CTT-E not expired -----------> REJECT: "token expired"
       |
  [7] Verify audience == "ephyr-broker" -> REJECT: "unexpected audience"
       |
  [8] Return parsed TaskClaims (valid)

Steps 3-5 form the trust chain: the root public key (pinned at broker startup) validates the delegation certificate, and the delegation certificate’s public key validates the CTT-E token. If the root key is rotated, the broker must be restarted to pin the new key.

7.4 Identity URN Scheme #

Agent identity in CTT-E tokens uses a URN scheme that encodes the authentication method:

Authentication Method	Identity URN
Local Unix socket (SO_PEERCRED)	`ephyr:local:uid:1000`
MCP API key	`ephyr:apikey:ak_claud` (first 6 chars of agent name)

This allows audit consumers to distinguish between co-located agents (authenticated via kernel UID verification) and remote agents (authenticated via bcrypt API keys), which have different trust properties.

7.5 ULID Task Identifiers #

Task IDs use ULIDs (Universally Unique Lexicographically Sortable Identifiers) – 128-bit identifiers encoded as 26 Crockford Base32 characters:

  01JQXYZ1234567890ABCDEFGH
  |------||----------------|
  48-bit    80-bit crypto
  Unix ms   random (crypto/rand)
  timestamp

Properties:

Lexicographic sort order matches creation time
Collision probability: 2^(-80) per millisecond (80 bits of randomness)
Crockford Base32 excludes I, L, O, U (reduces visual ambiguity)
Timestamp component enables extraction of creation time without database lookup (ULIDTime())
26-character string is shorter than UUID (36 characters) while providing better sortability

Implementation detail: Ephyr implements ULID generation natively (no external dependency) using crypto/rand for the random component, ensuring cryptographic-quality randomness.

8. Revocation #

8.1 Epoch Watermark Model #

Traditional revocation approaches (CRLs, OCSP) maintain a list of every revoked credential. This is O(n) in the number of revocations and requires distribution of revocation lists to validators.

Ephyr uses epoch watermarks: when a task is revoked, its ID is recorded with a timestamp. Any token whose iat (issued-at) is at or before the watermark for any task in its lineage is considered revoked.

  Revocation Map:
  +----------+--------------------------+
  | Task ID  | Revoked At               |
  +----------+--------------------------+
  | 01JQABC  | 2026-03-10T14:30:00.000Z |
  | 01JQDEF  | 2026-03-10T14:35:00.000Z |
  +----------+--------------------------+

  Token validation:
    For each task_id in token.lineage:
      if revocationMap[task_id].revokedAt >= token.iat:
        REJECT (token issued before or at revocation time)

8.2 Lineage-Walk Validation #

When checking a token for revocation, the RevocationMap walks the token’s entire lineage array:

func (r *RevocationMap) CheckLineage(lineage []string, issuedAt time.Time) error {
    r.mu.RLock()
    defer r.mu.RUnlock()
    for _, taskID := range lineage {
        if watermark, ok := r.watermarks[taskID]; ok {
            if !issuedAt.After(watermark) {
                return fmt.Errorf("task %s was revoked at %s", taskID, watermark)
            }
        }
    }
    return nil
}

This is O(depth), where depth is the number of ancestors in the task hierarchy. In practice, task hierarchies are shallow (typically fewer than 5 levels), making the check nearly constant-time.

8.3 Cascading Revocation #

Because tokens carry their full lineage (from root task to leaf), revoking a parent task automatically invalidates all child tokens. No explicit enumeration of children is required:

  Root Task A ─────> Child B ─────> Grandchild C
  (ULID: 01JQ001)   (ULID: 01JQ002) (ULID: 01JQ003)

  Lineage of C: ["01JQ001", "01JQ002", "01JQ003"]

  Revoking A (watermark: 01JQ001 -> now):
    - Token for A: lineage[0] is revoked -> REJECTED
    - Token for B: lineage[0] is revoked -> REJECTED
    - Token for C: lineage[0] is revoked -> REJECTED

  All descendants invalidated with a single map entry.

8.4 Background Garbage Collection #

The RevocationMap runs a background goroutine every 60 seconds that removes watermarks older than the maximum task TTL. Once watermark_time + maxTTL has passed, any token issued before the watermark has already expired naturally – the watermark is no longer needed.

  GC cycle:
    cutoff = now - maxTaskTTL
    for each (taskID, watermark) in map:
      if watermark < cutoff:
        delete(taskID)

This ensures the revocation map size is bounded by the number of revocations within one maxTTL window (default 1 hour), not by the total historical revocation count.

8.5 Comparison with Traditional Approaches #

Property	CRL	OCSP	Epoch Watermark
Storage growth	O(total revocations)	O(1) per check	O(active revocations)
Distribution	Push CRL to all validators	Real-time query per check	Local map, no distribution
Cascading revocation	Must enumerate all children	Must enumerate all children	Automatic via lineage walk
Offline operation	Yes (cached CRL)	No (requires responder)	Yes (local map)
GC complexity	Manual CRL trimming	N/A	Automatic (TTL-based)
Latency	File read	Network round-trip	HashMap lookup

The epoch watermark model is particularly well-suited to Ephyr’s use case because task tokens are short-lived (minutes to hours) and hierarchical. The combination of automatic cascading and TTL-based garbage collection keeps the revocation map small and fast.

9. Audit and Correlation #

9.1 Task-Scoped Audit #

Every operation in Ephyr is logged to /var/log/ephyr/audit.json as newline-delimited JSON. Each entry includes the agent name, task ID (when available), target, role, and a details map with operation-specific context.

Key event types:

Event Type	Severity	Description
`startup`	INFO	Broker started with configuration summary
`shutdown`	INFO	Broker shutting down
`cert_issued`	INFO	SSH certificate signed and delivered
`cert_denied`	WARN	Certificate request rejected by policy
`cert_pending`	INFO	Certificate request awaiting approval
`cert_revoked`	WARN	Certificate revoked (manual or auto-duplicate)
`session_start`	INFO	Agent session created
`rate_limited`	WARN	Request rejected by rate limiter
`policy_reload`	INFO	Policy hot-reloaded via SIGHUP
`host_toggle`	WARN	Host enabled/disabled via dashboard
`mcp_exec`	INFO	Command executed via MCP tool
`mcp_session_create`	INFO	Persistent SSH session opened
`mcp_session_close`	INFO	Persistent SSH session closed
`http_proxy`	INFO	HTTP request proxied to service
`http_proxy_denied`	WARN	HTTP proxy request rejected by policy
`task_create`	INFO	New task created with CTT-E token
`task_revoke`	WARN	Task revoked (watermark set)
`anomaly_detected`	ALERT	Anomalous behavior detected

9.2 Lineage Tracking #

When task identity is enabled, every audit entry includes the task ID and its lineage. This allows security teams to reconstruct the complete chain of operations from a root task to any leaf operation:

{
  "timestamp": "2026-03-10T14:30:00Z",
  "severity": "INFO",
  "event_type": "mcp_exec",
  "agent": "claude",
  "target": "dockerhost",
  "role": "read",
  "details": {
    "task_id": "01JQXYZ...",
    "command": "df -h",
    "exit_code": "0",
    "duration_ms": "245"
  }
}

Queries like “show me every operation performed by task 01JQXYZ and its subtasks” become simple log filters. This is critical for incident response: if a task produces unexpected behavior, the full execution history can be reconstructed from the audit log.

9.3 Structured Logging #

All audit entries follow a consistent schema:

type AuditEvent struct {
    Severity  Severity          // INFO, WARN, ERROR, ALERT
    EventType string            // event category
    Agent     string            // agent name
    Target    string            // target host or service
    Role      string            // SSH role
    Serial    string            // certificate serial (hex)
    Duration  string            // certificate TTL
    Reason    string            // policy decision reason
    Details   map[string]string // operation-specific key-value pairs
}

JSON format is chosen for machine parseability – logs can be ingested directly by SIEM systems, Grafana Loki, Elasticsearch, or any JSON-line compatible log aggregator.

Logrotate manages 30-day retention with automatic rotation.

9.4 Real-Time Event Hub #

All audit events are simultaneously broadcast via WebSocket to connected dashboard clients. The event hub uses a non-blocking send pattern with a 64-message buffer per client:

  Audit Event
       |
       +---> audit.json (file, append-only)
       |
       +---> WebSocket Hub
              |
              +---> Dashboard Client 1 [buffer: 64 msgs]
              +---> Dashboard Client 2 [buffer: 64 msgs]
              +---> Dashboard Client N [buffer: 64 msgs]

If a client falls behind (buffer full), messages are dropped rather than blocking the broker. This provides a real-time secondary audit trail – an attacker who compromises the broker and truncates the audit file cannot retroactively erase events that were already delivered to connected dashboard clients.

Keep-alive: 30-second ping, 60-second pong timeout.

9.5 Activity Ring Buffer #

A 10,000-entry circular buffer provides fast in-memory analytics without the overhead of database queries:

O(1) insert – constant-time regardless of buffer size
Bounded memory – oldest entries silently overwritten on wrap
Filterable queries – agent, type, target, service, time range, errors-only
Per-agent statistics – totals, last active time, error rates
Dashboard integration – top targets, top services, recent entries

The ring buffer is complementary to the audit log. The audit log is the authoritative record; the ring buffer provides fast operational queries for the dashboard.

10. Cryptographic Choices #

10.1 Ed25519 for All Signing Operations #

Every signing operation in Ephyr uses Ed25519 (Edwards-curve Digital Signature Algorithm over Curve25519):

Root CA certificate signing
Delegation certificate signing
CTT-E task token signing
Ephemeral SSH keypair generation

Why Ed25519:

Property	Benefit
Fixed key size (32 bytes)	No key size decisions; immune to “small key” misconfigurations
Deterministic signatures	No random nonce required; eliminates nonce-reuse catastrophes (cf. Sony PS3 ECDSA)
Constant-time operations	Immune to timing side-channels by design
Fast verification (~70us)	Suitable for per-request token validation on hot paths
Universal SSH support	OpenSSH 6.5+ supports Ed25519; no compatibility issues on modern systems
Compact signatures (64 bytes)	Minimal overhead in JWT tokens and SSH certificates

What was NOT chosen:

RSA-2048/4096: Larger keys, slower operations, more complex implementation surface. No advantage for this use case.
ECDSA (P-256): Requires random nonce per signature; nonce reuse reveals the private key. EdDSA’s deterministic nonces eliminate this risk class entirely.
Post-quantum (ML-DSA/Dilithium): Ephyr’s short token lifetimes (minutes) and ephemeral keys mean harvest-now-decrypt-later is not a meaningful threat. Post-quantum signatures are dramatically larger (2-4KB) and would significantly increase token size for no practical benefit at this time.

10.2 bcrypt for API Key Storage #

MCP API keys are stored as bcrypt hashes with cost factor 10:

hash, err := bcrypt.GenerateFromPassword([]byte(key), bcrypt.DefaultCost)

Why bcrypt over alternatives:

Algorithm	Resistance	Reason for/against
bcrypt (chosen)	GPU-resistant, adaptive cost	Standard library support in Go (`x/crypto`); well-understood; cost factor adjustable
Argon2id	Memory-hard, GPU-resistant	Superior for password storage, but Go’s `x/crypto` bcrypt is sufficient for API keys; Argon2 requires additional tuning parameters (memory, iterations, parallelism)
scrypt	Memory-hard	Harder to tune correctly; bcrypt is simpler and adequate
SHA-256	Not key-stretching	Insufficient; fast hashes enable brute-force at billions of attempts per second

bcrypt’s ~100ms cost per comparison also provides implicit rate limiting on authentication attempts – an attacker sending high volumes of invalid keys consumes significant CPU, which is observable via monitoring.

10.3 SHA-256 for Auth Cache Keys #

The auth cache keys API key fingerprints using SHA-256:

func apiKeyFingerprint(apiKey string) string {
    h := sha256.Sum256([]byte(apiKey))
    return hex.EncodeToString(h[:])
}

SHA-256 is used here exclusively as a cache key derivation function – it is not used for authentication. The security requirement is collision resistance (different API keys should produce different cache keys), not brute-force resistance. SHA-256 provides 128-bit collision resistance, which is more than sufficient.

10.4 ULID for Task Identifiers #

ULIDs were chosen over UUIDs for task identification:

Property	UUID v4	ULID
Sortability	Random (no natural order)	Lexicographic = chronological
Encoding	36 chars (hex + hyphens)	26 chars (Crockford Base32)
Timestamp extractable	No	Yes (48-bit millisecond precision)
Randomness	122 bits	80 bits
Database index performance	Poor (random inserts)	Good (monotonic inserts)

The 80-bit randomness component (from crypto/rand) provides adequate collision resistance for Ephyr’s use case – tasks are created at human time scales (seconds to minutes apart), not at high-frequency machine rates.

10.5 Random Value Generation #

All random values in Ephyr are generated using crypto/rand.Read, which reads from the operating system’s CSPRNG (/dev/urandom on Linux). No math/rand is used anywhere in the codebase for security-relevant values.

Value	Size	Format	Purpose
Session tokens	256 bits (32 bytes)	Hex (64 chars)	Agent session authentication
Certificate serials	64 bits (8 bytes)	Hex (16 chars)	SSH certificate identity
Request IDs	128 bits (16 bytes)	Hex (32 chars)	Request correlation
Grant IDs	128 bits (16 bytes)	Hex (32 chars)	Access grant identity
ULID randomness	80 bits (10 bytes)	Crockford Base32	Task ID entropy
Delegation cert IDs	128 bits (16 bytes)	Hex (32 chars)	Delegation cert identity

11. Network Security #

11.1 UID-Based nftables Isolation #

The recommended deployment model places agents on the same host as the broker, with nftables rules preventing direct backend access:

  # Reference nftables rules for agent isolation

  table inet ephyr {
    chain output {
      type filter hook output priority 0; policy accept;

      # Allow broker (UID 999) unrestricted outbound
      meta skuid 999 accept

      # Block agent UIDs from reaching backend networks
      meta skuid 1000 ip daddr 192.168.0.0/16 reject
      meta skuid 1000 ip daddr 10.0.0.0/8 reject
      meta skuid 1000 ip daddr 172.16.0.0/12 reject

      # Allow loopback for all
      oifname "lo" accept
    }
  }

Why UID-based rules: Process-level network isolation without containers, namespaces, or VMs. The kernel matches the UID of the process creating each outbound connection. An agent running as UID 1000 is blocked from TCP connections to any RFC 1918 address – all infrastructure access must flow through the broker’s Unix socket.

Limitations: UID-based nftables only works when agents are co-located on the broker host. Remote agents connecting via MCP over TCP are constrained solely by broker policy. The nftables rules provide defense- in-depth for the co-located deployment model, but the broker’s policy engine is the authoritative enforcement point for all agents.

11.2 Credential Injection Model #

Agents never see infrastructure credentials. The broker injects credentials into outbound requests on the agent’s behalf:

  Agent Request:                   Broker Sends:
  POST http://gitea/api/v1/repos   POST http://gitea/api/v1/repos
  (no auth headers)                 Authorization: token ghp_xxx...
                                    (credential from services.json)

Injection modes:

Auth Type	How Credentials Are Injected
`bearer`	`Authorization: Bearer {credential}` header
`basic`	HTTP Basic Auth header (base64-encoded `user:pass`)
`header`	Custom header with optional prefix (e.g., `X-API-Key: {prefix}{credential}`)
`query`	URL query parameter (e.g., `?api_key={credential}`)
`none`	No credentials injected (passthrough)

Header protection: Agent-supplied headers cannot override injected authentication headers. If a service uses bearer auth, the agent cannot set its own Authorization header – the broker’s injected value takes precedence. This prevents agents from redirecting credentials to unintended destinations.

Redaction: Credentials are masked in all API responses ("***"), audit logs, and dashboard displays. The only location where plaintext credentials exist is /var/lib/ephyr/services.json and the broker’s process memory.

11.3 CIDR Allow/Deny Policy #

The HTTP proxy enforces a network policy that controls which destinations agents can reach:

# /var/lib/ephyr/network_policy.json
{
  "allow_cidrs":    ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"],
  "deny_cidrs":     ["192.168.10.0/24"],
  "external":       "deny",
  "external_allow": ["*.github.com"]
}

Evaluation order:

Deny CIDRs checked first – any match is an immediate reject
For private IPs (RFC 1918): must match at least one allow CIDR
For public IPs: governed by the external mode:
- deny – all public IPs blocked (default)
- restricted – only hostnames matching external_allow glob patterns permitted
- open – all destinations allowed (not recommended)

All resolved IPs must pass. When a hostname resolves to multiple IP addresses, every resolved IP is checked against the policy. A hostname that resolves to both a private and a public IP is rejected if the public IP fails the external policy.

11.4 DNS Resolution Security #

DNS resolution uses a 2-second timeout via Go’s net.DefaultResolver:

ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
addrs, err := net.DefaultResolver.LookupHost(ctx, hostname)

If the input is already an IP address (detected via net.ParseIP), DNS resolution is skipped entirely. This prevents DNS rebinding attacks where a hostname initially resolves to an allowed IP but later resolves to a restricted IP.

12. Hardening Guide #

12.1 Systemd Sandboxing Directives #

The following table summarizes all systemd security directives applied to the signer and broker services:

Directive	Signer	Broker	Effect
`ProtectSystem=strict`	Yes	Yes	Read-only filesystem
`ProtectHome=yes`	Yes	Yes	Home directories invisible
`NoNewPrivileges=yes`	Yes	Yes	No setuid escalation
`PrivateTmp=yes`	Yes	Yes	Isolated /tmp
`PrivateDevices=yes`	Yes	Yes	No /dev access
`ProtectKernelTunables=yes`	Yes	Yes	Read-only /proc/sys
`ProtectKernelModules=yes`	Yes	Yes	No module loading
`ProtectControlGroups=yes`	Yes	Yes	Read-only cgroups
`RestrictSUIDSGID=yes`	Yes	Yes	No SUID/SGID bits
`RestrictNamespaces=yes`	Yes	Yes	No namespace creation
`CapabilityBoundingSet=`	Yes	Yes	All capabilities dropped
`SystemCallFilter=@system-service`	Yes	Yes	Allowlisted syscalls
`SystemCallArchitectures=native`	Yes	Yes	Only native arch syscalls
`MemoryDenyWriteExecute=yes`	Yes	No	No W+X memory mappings
`RestrictAddressFamilies=AF_UNIX`	Yes	No	Unix sockets only
`RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6`	No	Yes	Unix + TCP
`ProtectHostname=yes`	Yes	No	Cannot change hostname
`RestrictRealtime=yes`	Yes	No	No RT scheduling
`LockPersonality=yes`	Yes	No	Cannot change exec domain
`ReadOnlyPaths=/etc/ephyr`	Yes	No	CA key read-only
`ReadWritePaths=/run/ephyr`	Yes	Yes	Socket directory
`ReadWritePaths=/var/log/ephyr`	No	Yes	Audit log
`ReadWritePaths=/var/lib/ephyr`	No	Yes	Configuration

12.2 Production Deployment Recommendations #

Ordered by impact:

Critical:

Deploy on a dedicated host. Run the signer and broker on a dedicated VM or LXC container with no other workloads. This ensures systemd sandboxing is the primary isolation layer, not one among many.
Enable host key verification. The default SSH client uses InsecureIgnoreHostKey. Until host key pinning is implemented in policy, ensure the network path between the broker and all targets is trusted (same VLAN, no untrusted L2 neighbors). This is the highest-priority item to address.
Enable TLS verification for services. The HTTP proxy uses InsecureSkipVerify: true. Deploy proper TLS certificates or a private CA for internal services. Place a TLS-terminating reverse proxy in front of the dashboard and MCP ports.

High:

Configure nftables for co-located agents. Apply UID-based rules to block direct agent-to-backend traffic.
Rotate MCP API keys regularly. Treat API keys as time-limited credentials. Rotate quarterly and immediately on suspected compromise.
Ship audit logs externally. Forward /var/log/ephyr/audit.json to a SIEM or append-only log store. This bounds the tamper window for audit manipulation.
Set external: "deny" in network policy. Never allow unrestricted public internet access from the proxy.

Standard:

Keep certificate TTLs short. The 5-minute default is appropriate. Resist the temptation to increase TTLs for convenience.
Use dedicated role accounts on targets. Configure agent-read (rbash, no sudo), agent-op (bash, limited sudo), and agent-admin (bash, scoped sudo). Never map agent principals to root.
Back up the CA key securely. Store an offline, encrypted backup. Loss requires re-provisioning all targets.

12.3 Target Host Hardening #

Ephyr provides a target provisioning script (deploy/scripts/provision-target.sh) that configures:

Role accounts:

Account	Shell	Sudo	Purpose
`agent-read`	`/usr/bin/rbash`	None	Read-only operations
`agent-op`	`/bin/bash`	Limited (systemctl status, docker ps/logs, monitoring)	Operational tasks
`agent-admin`	`/bin/bash`	Extended (systemctl start/stop, docker run/exec, apt list)	Administrative tasks

Sudoers deny list (all roles):

Shell interpreters (bash, sh, zsh, fish)
Text editors (vi, vim, nano, emacs)
Language interpreters (python, perl, ruby, node)
Package management mutations (apt install/remove, dpkg -i)
Dangerous commands (chattr, visudo, su, passwd, usermod, chmod, chown)

Sudoers file protection: The generated sudoers file is validated via visudo -c and then made immutable with chattr +i.

SSH configuration:

TrustedUserCAKeys points to the Ephyr CA public key
AuthorizedPrincipalsFile maps role accounts to SSH principals
sshd configuration is validated before applying

12.4 Monitoring and Alerting #

Ephyr exposes Prometheus-compatible metrics at GET /metrics:

Latency histograms (7 buckets from <100us to >50ms):

ephyr_token_sign_seconds – CTT-E token signing
ephyr_token_validate_seconds – CTT-E token validation
ephyr_watermark_check_seconds – Revocation check
ephyr_envelope_check_seconds – Capability envelope check
ephyr_policy_eval_seconds – Policy pipeline evaluation
ephyr_ssh_cert_seconds – SSH certificate signing (IPC)
ephyr_delegation_ipc_seconds – Delegation IPC
ephyr_exec_e2e_seconds – End-to-end exec latency

Counters:

ephyr_tasks_created_total, ephyr_tokens_signed_total, ephyr_tokens_validated_total, ephyr_tokens_rejected_total
ephyr_watermark_revocations_total, ephyr_delegation_rotations_total
ephyr_auth_cache_hits_total, ephyr_auth_cache_misses_total

Gauges:

ephyr_tasks_active, ephyr_active_watermarks
ephyr_delegation_cert_age_seconds, ephyr_delegation_certs_held

Recommended alerts:

ephyr_tokens_rejected_total increasing rapidly (brute-force attempt)
ephyr_delegation_cert_age_seconds > 3600 (delegation rotation failed)
ephyr_ssh_cert_seconds p99 > 1s (signer IPC latency degradation)
ephyr_auth_cache_misses_total spike (cache invalidation or new keys)

13. Comparison with Existing Solutions #

13.1 SPIFFE/SPIRE #

SPIFFE (Secure Production Identity Framework for Everyone) and its reference implementation SPIRE provide workload identity through X.509 SVIDs (SPIFFE Verifiable Identity Documents).

Aspect	SPIFFE/SPIRE	Ephyr
Identity model	X.509 SVIDs with SPIFFE IDs (URIs)	SSH certificates with principal-based roles
Agent identity	Workload attestation (k8s, AWS, process)	SO_PEERCRED UID verification (Linux)
Certificate format	X.509	OpenSSH user certificates
Signing architecture	Server with upstream CAs, nested topology	Two-process (signer + broker), single CA
Token support	JWT-SVIDs	CTT-E (EdDSA JWT with delegation chain)
Policy model	Registration entries + RBAC	YAML policy with 8-step evaluation pipeline
Target integration	mTLS between workloads	SSH `TrustedUserCAKeys` on target hosts
Operational complexity	High (control plane, agents, registration API)	Low (two systemd services, one YAML file)
AI agent focus	General-purpose workload identity	Purpose-built for AI agent access patterns
HTTP proxy	No built-in credential injection	Built-in proxy with 5 auth types
MCP integration	None	Native MCP server with 10+ tools

When to choose SPIFFE/SPIRE: Large-scale, multi-cluster environments where workload-to-workload mTLS is the primary requirement. Kubernetes-native environments with existing SPIRE infrastructure.

When to choose Ephyr: AI agent infrastructure where SSH access, HTTP credential injection, and MCP tool federation are the primary requirements. Environments where operational simplicity and minimal dependencies are valued.

13.2 HashiCorp Vault SSH Secrets Engine #

Vault’s SSH secrets engine can operate in signed certificate mode, issuing short-lived SSH certificates from a CA key stored in Vault.

Aspect	Vault SSH Engine	Ephyr
CA key storage	Vault’s encrypted storage backend (Shamir/auto-unseal)	Dedicated signer process with systemd sandbox
Certificate issuance	REST API with Vault token	MCP tool call or Unix socket API
Policy model	Vault policies + SSH role configuration	YAML policy with 8-step pipeline + RBAC
Credential injection	Vault Agent sidecar can inject secrets	Built-in HTTP proxy with transparent injection
Agent identity	Vault auth methods (AppRole, k8s, etc.)	SO_PEERCRED or bcrypt API keys
Task correlation	Via Vault audit log entity/alias	ULID task IDs with hierarchical lineage
Revocation	Vault lease revocation (per-credential)	Epoch watermark (cascading, O(depth))
Operational overhead	Vault cluster (HA requires 3+ nodes)	Two systemd services on one host
Dependencies	Vault server + storage backend + auth backend	Go binary + 3 libraries
AI agent integration	REST API (no MCP support)	Native MCP server

When to choose Vault: Organizations already running Vault for secrets management, where SSH certificate issuance is one of many secret types. Environments requiring HSM-backed key storage (Vault supports PKCS#11).

When to choose Ephyr: Dedicated AI agent access broker where MCP integration, task-scoped audit, and credential injection are critical. Environments that value minimal operational overhead.

13.3 Traditional PAM and SSH Key Management #

Aspect	Traditional SSH Keys	Ephyr
Key lifecycle	Permanent (until manually rotated)	Ephemeral (5-minute default TTL)
Credential storage	`.ssh/authorized_keys` on each host	CA public key on hosts; no per-agent keys
Access revocation	Remove key from each host (O(hosts))	Policy change or watermark (O(1))
Audit	sshd log (IP + user)	Structured JSON with task ID, agent, role
Blast radius	Full host access for key lifetime	Scoped to role + target + TTL
Agent identity	SSH key fingerprint	UID-verified identity with task lineage
Scaling	O(agents x hosts) key distribution	O(hosts) CA key distribution
Credential rotation	Manual, error-prone	Automatic (ephemeral per-request)

13.4 Summary Matrix #

                    |  SPIFFE/SPIRE  |  Vault SSH  |  PAM/Keys  |  Ephyr
  ──────────────────|────────────────|─────────────|────────────|──────────
  AI agent focus    |      No        |     No      |     No     |   Yes
  MCP integration   |      No        |     No      |     No     |   Yes
  HTTP proxy        |      No        |    Partial  |     No     |   Yes
  Ephemeral certs   |     Yes        |     Yes     |     No     |   Yes
  Task lineage      |      No        |     No      |     No     |   Yes
  Cascading revoke  |      No        |     No      |     No     |   Yes
  Credential inject |      No        |    Partial  |     No     |   Yes
  Operational cost  |     High       |    Medium   |     Low    |   Low
  Dependencies      |     High       |    Medium   |     None   |  Minimal

14. Known Limitations and Future Work #

14.1 Current Limitations #

Single-tenant architecture. Ephyr is designed for single-tenant deployment. All agents, targets, and services are governed by one policy file and one CA key. Multi-tenant environments must deploy separate Ephyr instances. There is no built-in mechanism for cross-instance trust or delegation.

No external IdP federation. Agent identity is established via SO_PEERCRED (UID) or bcrypt API keys. There is no integration with OIDC providers, SAML IdPs, or LDAP directories. Environments requiring centralized identity management must provision agent UIDs and API keys out-of-band.

No X.509 interop. Ephyr issues OpenSSH certificates exclusively. It cannot issue X.509 client certificates for mTLS workflows. Workloads requiring X.509 identity should use SPIFFE/SPIRE or a traditional PKI alongside Ephyr.

SSH host key verification disabled. The broker’s SSH client uses InsecureIgnoreHostKey, making SSH connections vulnerable to man-in-the-middle attacks. This is the most significant known security gap (see Threat T6 in the threat model).

TLS certificate verification disabled. The HTTP proxy and MCP federation client use InsecureSkipVerify: true, making HTTPS connections vulnerable to interception (see Threat T7).

Credentials stored in plaintext. Service credentials (/var/lib/ephyr/services.json) and federated MCP server credentials (/var/lib/ephyr/remotes.json) are stored as plaintext JSON. File permissions (0600) are the sole protection.

Dashboard token is static. The dashboard uses a single static token with no automatic rotation, session expiry, or concurrent session limits.

No push revocation for SSH certificates. OpenSSH does not support online CRL checking for user certificates. Revocation is effective only at the broker level – a certificate that has already been delivered remains valid on the target until its TTL expires.

MCP API keys are bearer tokens. No secondary authentication factor (mTLS, OIDC) is supported on the MCP TCP endpoint. A stolen API key grants full access for the impersonated agent’s policy scope.

14.2 Planned Improvements #

Improvement	Addresses	Priority
SSH host key pinning in policy.yaml	T6 (SSH MITM)	Critical
Per-service TLS CA configuration	T7 (HTTPS MITM)	High
Credential encryption at rest	T10 (plaintext creds)	High
mTLS client certificates for MCP	T1 (API key theft)	Medium
OIDC/JWT authentication option	T1 (API key theft)	Medium
Remote audit log shipping	T11 (log tampering)	Medium
Log entry signing (HMAC/Ed25519)	T11 (log tampering)	Medium
WebSocket message-based auth	T9 (dashboard token leak)	Medium
Per-agent rate limiting on MCP TCP	T12 (DoS)	Medium
CTT-D delegation tokens (Phase 2b)	Subtask delegation	Low
`RevokedKeys` file distribution	T5 (target-side revocation)	Low

14.3 Research Directions #

Confidential computing. Running the signer in a TEE (Trusted Execution Environment) such as AMD SEV-SNP or Intel TDX would protect the CA key even against root-level compromise of the host OS. This would address the residual risk in Threat T4 (signer compromise via host root access).

Post-quantum transition. While not urgent for short-lived tokens, a migration path to ML-DSA (FIPS 204) or SLH-DSA (FIPS 205) should be planned for the CA key, which is long-lived and could be targeted by harvest-now-decrypt-later attacks against the signed certificates.

Formal verification. The policy engine’s 8-step pipeline and the token validation chain are amenable to formal verification using tools like TLA+ or Alloy. This would provide mathematical guarantees about the absence of policy bypass paths.

Hardware-backed key storage. Integration with TPM 2.0 or PKCS#11 HSMs for the root CA key would eliminate the file-based key storage risk entirely. The signer could load the key from a hardware token at startup without it ever existing in a regular file.

15. Appendix A: Trust Boundary Diagram #

 ┌───────────────────────────────────────────────────────────────────────────┐
 │  EPHYR HOST (dedicated VM or LXC container)                             │
 │                                                                           │
 │  ┌─────────────────┐                                                     │
 │  │  SIGNER          │   Trust Boundary B1                                │
 │  │  (UID: signer)   │   ───────────────────                              │
 │  │                   │                                                    │
 │  │  - Ed25519 CA key │   Protections:                                    │
 │  │  - AF_UNIX ONLY   │   - SO_PEERCRED (broker UID only)                │
 │  │  - No TCP/UDP/raw │   - MemoryDenyWriteExecute                       │
 │  │  - CapBound=empty │   - SystemCallFilter=@system-service             │
 │  │  - W^X enforced   │   - ProtectSystem=strict                         │
 │  │                   │   - RestrictAddressFamilies=AF_UNIX              │
 │  └────────┬──────────┘                                                   │
 │           │ /run/ephyr/signer.sock                                      │
 │           │ (one-shot JSON, UID-verified)                                │
 │  ┌────────┴──────────────────────────────────────────────────────────┐   │
 │  │  BROKER (UID: 999)                Trust Boundary B2               │   │
 │  │  ──────────────────────────────────────────────────────           │   │
 │  │                                                                   │   │
 │  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │   │
 │  │  │ Policy   │ │ Session  │ │  Token   │ │ Audit    │            │   │
 │  │  │ Engine   │ │ Manager  │ │  Issuer  │ │ Logger   │            │   │
 │  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘            │   │
 │  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │   │
 │  │  │ Exec     │ │ Proxy    │ │ Revoc.   │ │ Grant    │            │   │
 │  │  │ Pool     │ │ Engine   │ │ Map      │ │ Store    │            │   │
 │  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘            │   │
 │  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │   │
 │  │  │ MCP      │ │ Deleg.   │ │ Task     │ │ Activity │            │   │
 │  │  │ Server   │ │ Manager  │ │ Manager  │ │ Ring Buf │            │   │
 │  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘            │   │
 │  │                                                                   │   │
 │  │  Listeners:                                                       │   │
 │  │    /run/ephyr/broker.sock (Unix, SO_PEERCRED) ─── B0: Agent CLI │   │
 │  │    :8553 (TCP, token auth) ─────────────────────── B5: Dashboard  │   │
 │  │    :8554 (TCP, bcrypt API key) ─────────────────── B0r: MCP      │   │
 │  └──────┬──────────────┬──────────────┬──────────────────────────────┘   │
 │         │              │              │                                   │
 └─────────│──────────────│──────────────│───────────────────────────────────┘
           │              │              │
      ┌────┴────┐   ┌────┴────┐   ┌────┴──────┐
      │  SSH    │   │  HTTP   │   │  Remote   │
      │  Targets│   │Services │   │  MCP      │
      │  (B3)   │   │  (B4)   │   │ Servers   │
      │         │   │         │   │  (B5)     │
      └─────────┘   └─────────┘   └───────────┘

Boundary descriptions:

B0: Agent CLI to Broker – Unix socket, SO_PEERCRED UID verification
B0r: Remote Agent to Broker – TCP :8554, bcrypt API key
B1: Broker to Signer – Unix socket, SO_PEERCRED UID restriction
B2: Broker internal – all components share the broker process
B3: Broker to Target Hosts – SSH with ephemeral certificates
B4: Broker to HTTP Services – HTTP/HTTPS with credential injection
B5: Broker to Remote MCP / Dashboard to Admin – TCP with auth

16. Appendix B: Threat Enumeration Summary #

ID	Threat	Severity	Boundary	Mitigation Status
T1	Agent credential theft (API key)	High	B0r	Mitigated (bcrypt); residual (bearer token, no 2FA)
T2	Agent session hijacking	Medium	B0	Mitigated (ownership check per operation)
T3	Broker compromise	Critical	B2	Mitigated (CA key isolated in signer); residual (service creds exposed)
T4	Signer compromise (CA key theft)	Critical	B1	Mitigated (systemd sandbox, no network); residual (root host access)
T5	Target compromise via active session	Medium	B3	Mitigated (5-min TTL); residual (no push revocation)
T6	SSH man-in-the-middle	Critical	B3	OPEN – `InsecureIgnoreHostKey`
T7	HTTPS man-in-the-middle	High	B4/B5	OPEN – `InsecureSkipVerify: true`
T8	Network bypass (agent direct access)	Medium	B0	Mitigated (nftables UID rules for co-located agents)
T9	Dashboard token leakage	Medium	B5	Partially mitigated (constant-time compare, privacy mode)
T10	Credential exposure at rest	High	–	Partially mitigated (file permissions); residual (plaintext JSON)
T11	Audit log tampering	Medium	–	Partially mitigated (append-only, WebSocket secondary); residual (no signing)
T12	Denial of service	Medium	B0r	Partially mitigated (rate limiting, bcrypt cost); residual (no per-agent MCP limits)

17. Appendix C: Dependency Analysis #

Ephyr maintains a minimal dependency footprint. The entire project (22,832 lines of Go) depends on only three external libraries:

module github.com/ben-spanswick/ephyr

go 1.24.1

require (
    github.com/gorilla/websocket v1.5.3
    golang.org/x/crypto v0.48.0
    gopkg.in/yaml.v3 v3.0.1
)

require golang.org/x/sys v0.41.0 // indirect

Dependency	Purpose	Risk Assessment
`gorilla/websocket`	Dashboard WebSocket event hub and terminal	Mature, widely-used library. WebSocket-only; no HTTP routing or middleware attack surface.
`golang.org/x/crypto`	bcrypt (API key hashing), SSH (certificate signing, client)	Official Go extended library. Maintained by the Go team. Used for all cryptographic operations.
`gopkg.in/yaml.v3`	Policy file parsing	Mature YAML parser. Used only at startup and SIGHUP reload. Not on the hot path.
`golang.org/x/sys`	Indirect dependency of x/crypto (syscall support)	Official Go extended library.

No dependency on:

HTTP routing frameworks (standard library net/http used directly)
JSON libraries (standard library encoding/json)
Logging frameworks (standard library log)
Database drivers (all state is in-memory or flat files)
Container runtimes or orchestrators
Cloud provider SDKs

This minimal dependency surface significantly reduces supply chain risk. The project can be audited by reading approximately 23,000 lines of Go plus the four dependencies listed above.

Document Information #

Field	Value
Project	Ephyr – Secure Agent Access Broker
Version	0.2
Codebase	~22,832 lines of Go
License	Apache 2.0
Repository	https://github.com/ben-spanswick/ephyr
Last Updated	March 2026
Classification	Public

This document describes the security architecture of Ephyr as implemented in version 0.2. It is intended as a reference for security teams evaluating the project. For deployment instructions, see docs/deployment.md. For API reference, see docs/api-reference.md. For the full threat model, see docs/THREAT_MODEL.md.