the relay
The relay is a Fastify server over Postgres (drizzle) that does exactly four jobs: sequence and store ciphertext events, manage identities and memberships, verify signatures, and fan out live updates. It is structurally unable to do the fifth job a normal backend would have — reading the data.
the event store
An event is a row in an append-only log: a server-assigned id and ts, collabId, a
monotonic seq, actorId, kind, opaque refs, and the crypto envelope
(contentCt / nonce / sig).
- ordering — ingestEvent atomically bumps the collaboration’s lastSeq and stamps the new event in one transaction: a gap-free total order clients page with ?since=N.
- verification at ingest — the author’s detached Ed25519 signature is checked against the principal’s public key before anything is stored (SO-3). Readers verify again on their side.
- at-rest envelope — the client’s ciphertext is additionally KMS-wrapped (OpenBao Transit) before storage, so stored bytes differ from wire bytes; reads unwrap back to exactly what the author sealed. Defense in depth on top of E2E, never a substitute for it.
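The ordering and verification steps above can be sketched in memory. This is only an illustration under stated assumptions, not the relay's code: the `Collaboration` class, the `StoredEvent` shape, `eventsSince`, and the choice to sign the ciphertext bytes directly are all hypothetical, and the synchronous function body stands in for the Postgres transaction. It also assumes `?since=N` returns events with a seq strictly greater than N.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical in-memory shapes; the real relay keeps these in Postgres.
interface StoredEvent {
  seq: number;
  collabId: string;
  actorId: string;
  contentCt: Buffer;
  sig: Buffer;
}

class Collaboration {
  lastSeq = 0;
  readonly log: StoredEvent[] = [];
}

// Verify first, then bump lastSeq and append as one step.
// In this single-threaded sketch the synchronous body plays the role of the transaction.
function ingestEvent(
  collab: Collaboration,
  collabId: string,
  actorId: string,
  authorKey: KeyObject,
  contentCt: Buffer,
  sig: Buffer,
): StoredEvent {
  // Ed25519 takes no separate digest, hence the null algorithm argument.
  // Assumption: the signature covers the ciphertext bytes as-is.
  if (!verify(null, contentCt, authorKey, sig)) {
    throw new Error("bad signature: nothing stored, no seq consumed");
  }
  const event: StoredEvent = { seq: collab.lastSeq + 1, collabId, actorId, contentCt, sig };
  collab.log.push(event);
  collab.lastSeq = event.seq;
  return event;
}

// Page the log: everything strictly after `since`, in seq order (assumed semantics).
function eventsSince(collab: Collaboration, since: number): StoredEvent[] {
  return collab.log.filter((e) => e.seq > since);
}

// Usage: two signed events, then page from seq 1.
const author = generateKeyPairSync("ed25519");
const collab = new Collaboration();
const seal = (text: string): StoredEvent => {
  const ct = Buffer.from(text); // stand-in for real ciphertext
  return ingestEvent(collab, "c1", "alice", author.publicKey, ct, sign(null, ct, author.privateKey));
};
seal("first");
seal("second");
console.log(eventsSince(collab, 1).map((e) => e.seq)); // [ 2 ]
```

Because a rejected event never reaches the increment, a failed verification cannot leave a hole in the sequence.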
live fan-out
Subscribers hold a WebSocket; an in-process hub pushes freshly ingested events plus per-principal
control messages (membership changed, CK rotated, revoked). The subscribe protocol is gap-free
and dup-free: subscribe first (buffering live events), replay from since, then flush the
buffer minus what replay already covered. A subscriber that can’t keep up is shed cleanly
(close 1013) and reconnects where it left off.
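The subscribe-then-replay ordering described above can be sketched as follows. The `Hub` class, the `attach` function, and the synchronous `replay` callback are hypothetical names chosen for illustration; the real hub is asynchronous and runs over WebSockets. To stand in for a concurrent ingest, the replay callback in the usage example publishes an event while replay is still running.

```typescript
interface SeqEvent {
  seq: number;
}

// Hypothetical in-process hub: every listener gets each freshly ingested event.
class Hub {
  private listeners = new Set<(e: SeqEvent) => void>();
  subscribe(fn: (e: SeqEvent) => void): () => void {
    this.listeners.add(fn);
    return () => {
      this.listeners.delete(fn);
    };
  }
  publish(e: SeqEvent): void {
    for (const fn of this.listeners) fn(e);
  }
}

function attach(
  hub: Hub,
  replay: (since: number) => SeqEvent[],
  since: number,
  deliver: (e: SeqEvent) => void,
): () => void {
  const buffer: SeqEvent[] = [];
  let live = false;
  let delivered = since; // highest seq handed to the subscriber so far

  // Deliver only events past the high-water mark; this is what drops duplicates.
  const push = (e: SeqEvent): void => {
    if (e.seq > delivered) {
      deliver(e);
      delivered = e.seq;
    }
  };

  // 1. Subscribe first: events arriving during replay are buffered, not lost.
  const unsubscribe = hub.subscribe((e) => {
    if (live) push(e);
    else buffer.push(e);
  });
  // 2. Replay stored events after `since`.
  for (const e of replay(since)) push(e);
  // 3. Flush the buffer minus what replay already covered, then go live.
  for (const e of buffer) push(e);
  live = true;
  return unsubscribe;
}

// Usage: event 4 is ingested while replay is in progress.
const log: SeqEvent[] = [{ seq: 1 }, { seq: 2 }, { seq: 3 }];
const hub = new Hub();
const ingest = (seq: number): void => {
  const e = { seq };
  log.push(e);
  hub.publish(e);
};
const got: number[] = [];
const stop = attach(
  hub,
  (since) => {
    ingest(4); // lands in both the stored log and the subscriber's buffer
    return log.filter((e) => e.seq > since);
  },
  1,
  (e) => got.push(e.seq),
);
ingest(5);
console.log(got); // [ 2, 3, 4, 5 ]
```

Event 4 reaches the subscriber once even though it is visible through both paths, and nothing between the replay and the live stream is skipped.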
auth, in layers
- session / agent tokens — JWTs bound to a live, non-revoked principal. A session token authenticates and nothing more: reading content needs the CK, mutating needs a proof.
- management proof — per-request Ed25519 signature over method + path + body hash with clock-skew bounds and single-use nonces (persisted in Postgres, so replay protection survives restarts). Required on every management write and on reads that return CK material.
- capabilities — the stored per-membership array is authoritative; roles only seed it.
- rate limits & lockout — per-route ceilings, per-account escalating lockout on failed logins, and the rate-limiter keys on the real client IP behind the proxy chain.
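A minimal sketch of the management proof check, under assumptions the text does not spell out: SHA-256 for the body hash, a newline-joined signing input, a 60-second skew bound, and an in-memory nonce set where the relay uses Postgres. The names `makeProof`, `checkProof`, and `signingInput`, and the route in the usage example, are hypothetical.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface Proof {
  ts: number; // client clock, ms since epoch
  nonce: string; // single-use
  sig: Buffer; // Ed25519 over the signing input
}

type ProofResult = "ok" | "stale" | "bad-signature" | "replayed";

const MAX_SKEW_MS = 60_000; // assumed bound, not taken from the relay
const seenNonces = new Set<string>(); // the relay persists these in Postgres

// Assumed canonical form: method, path, body hash, timestamp, nonce, one per line.
function signingInput(method: string, path: string, body: string, ts: number, nonce: string): Buffer {
  const bodyHash = createHash("sha256").update(body).digest("hex");
  return Buffer.from([method, path, bodyHash, String(ts), nonce].join("\n"));
}

function makeProof(
  key: KeyObject,
  method: string,
  path: string,
  body: string,
  ts: number,
  nonce: string,
): Proof {
  return { ts, nonce, sig: sign(null, signingInput(method, path, body, ts, nonce), key) };
}

function checkProof(
  key: KeyObject,
  method: string,
  path: string,
  body: string,
  proof: Proof,
  now: number,
): ProofResult {
  if (Math.abs(now - proof.ts) > MAX_SKEW_MS) return "stale";
  const input = signingInput(method, path, body, proof.ts, proof.nonce);
  if (!verify(null, input, key, proof.sig)) return "bad-signature";
  // Record the nonce only after the signature checks out,
  // so unauthenticated requests cannot burn nonces.
  if (seenNonces.has(proof.nonce)) return "replayed";
  seenNonces.add(proof.nonce);
  return "ok";
}

// Usage with a hypothetical route and body.
const admin = generateKeyPairSync("ed25519");
const route = "/collabs/c1/members";
const payload = '{"add":"bob"}';
const proof = makeProof(admin.privateKey, "POST", route, payload, 1_000_000, "n-1");

const first = checkProof(admin.publicKey, "POST", route, payload, proof, 1_000_500);
const again = checkProof(admin.publicKey, "POST", route, payload, proof, 1_000_600);
const altered = checkProof(admin.publicKey, "POST", route, '{"add":"eve"}', proof, 1_000_700);
const late = checkProof(admin.publicKey, "POST", route, payload, proof, 1_000_000 + 120_000);
console.log(first, again, altered, late); // ok replayed bad-signature stale
```

Binding the method, path, and body hash into the signed bytes means a captured proof cannot be reused against a different route or payload, and the nonce set stops it from being reused against the same one.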
health
| probe | answers | checks |
|---|---|---|
| GET /healthz | "is the process up?" | nothing else; safe for container healthchecks, never a readiness signal |
| GET /readyz | "can it serve?" | Postgres (SELECT 1) and OpenBao (sys/health); 503 with the failing dependency marked "down" |
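As a rough sketch of how the readiness probe in the table might aggregate its dependency checks: the `probe` and `toResponse` functions, the `Check` type, and the response body shape are assumptions for illustration. In particular, the "up" label for healthy dependencies is a guess; the text only specifies that a failing one is marked "down".

```typescript
type DepState = "up" | "down";

interface Readiness {
  status: 200 | 503;
  body: Record<string, DepState>;
}

// Hypothetical check: resolves true when the dependency answers, false or throws otherwise.
type Check = () => Promise<boolean>;

// Run every check; a thrown error counts as a failed dependency.
async function probe(checks: Record<string, Check>): Promise<Record<string, boolean>> {
  const results: Record<string, boolean> = {};
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        results[name] = await check();
      } catch {
        results[name] = false;
      }
    }),
  );
  return results;
}

// Any failing dependency turns the whole answer into a 503.
function toResponse(results: Record<string, boolean>): Readiness {
  const body: Record<string, DepState> = {};
  let ok = true;
  for (const [name, up] of Object.entries(results)) {
    body[name] = up ? "up" : "down";
    if (!up) ok = false;
  }
  return { status: ok ? 200 : 503, body };
}

// Usage: stand-ins for the Postgres SELECT 1 and the OpenBao sys/health call.
probe({
  postgres: async () => true,
  openbao: async () => {
    throw new Error("sealed");
  },
}).then((results) => {
  console.log(toResponse(results)); // status 503, openbao marked "down"
});
```

Keeping /healthz free of these checks means a dependency outage fails readiness without getting the container restarted.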
Participant liveness (agents, the broker) is separate: it comes from the heartbeat projection
(online / stale / offline / degraded) surfaced on the dashboard.
Logging is structured NDJSON with an allowlist: security telemetry only, never request bodies, ciphertext, tokens, or key material — CI-enforced.