35 · Server, client, MCP, and observability

This is the book's final chapter. We take one last look at the layer where everything you've learned (the storage engine, integrity, confidentiality, the pipeline) actually runs as a live service. We survey server mode, the client, the MCP auditor, and operational observability at a high level, highlighting only the key decisions.

Server mode: a single store over HTTP/JSON

Embedded mode plants the audit log directly inside your process. Server mode wraps that same engine in an HTTP daemon, so services written in any language can send and query audit records with a single bearer token.

quipu-server is that daemon. It holds the store exclusively through a file lock, so no matter how many services connect over HTTP, a single process owns the store.

A few key design points about the server.

Token authentication and RBAC. Every endpoint except healthz requires an Authorization: Bearer <token> header. Tokens are stored in the config file only as SHA-256 hashes, so a leaked config file does not hand out usable credentials. Each token is assigned a role, and each role is granted actions (emit · query · administer). Access is deny-by-default: anything not explicitly permitted is refused. Tokens can be hot-reloaded with zero downtime via SIGHUP.
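The deny-by-default rule can be made concrete with a minimal sketch. It is not quipu-server's actual code: the `Rbac` type, its method names, and the idea of keying the lookup by a hex digest string are assumptions for illustration, and the SHA-256 hashing of the presented token is assumed to have happened before the lookup.

```rust
use std::collections::{HashMap, HashSet};

/// The three actions named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Action {
    Emit,
    Query,
    Administer,
}

/// Hypothetical in-memory view of the token section of the config.
pub struct Rbac {
    /// Hex SHA-256 digest of a token -> role name (assumed layout).
    token_roles: HashMap<String, String>,
    /// Role name -> actions granted to that role.
    role_actions: HashMap<String, HashSet<Action>>,
}

impl Rbac {
    pub fn new() -> Self {
        Rbac {
            token_roles: HashMap::new(),
            role_actions: HashMap::new(),
        }
    }

    /// Grant one action to a role, creating the role if needed.
    pub fn grant(&mut self, role: &str, action: Action) {
        self.role_actions
            .entry(role.to_string())
            .or_default()
            .insert(action);
    }

    /// Register a token by its digest; the raw token is never stored.
    pub fn add_token(&mut self, token_hash_hex: &str, role: &str) {
        self.token_roles
            .insert(token_hash_hex.to_string(), role.to_string());
    }

    /// Deny by default: an unknown token, an unknown role, or a missing
    /// grant all yield `false`. Only an explicit grant yields `true`.
    pub fn is_allowed(&self, token_hash_hex: &str, action: Action) -> bool {
        self.token_roles
            .get(token_hash_hex)
            .and_then(|role| self.role_actions.get(role))
            .map_or(false, |actions| actions.contains(&action))
    }
}
```

There is no "allow" branch other than the final `contains` check, so every failure mode along the way collapses into a refusal.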

Write-only deployment. As covered in detail in Ch. 27 (write-only deployment), the server can start without an RSA private key. In that mode, RSA-protected fields are returned as undecryptable ciphertext, so even if the server is compromised, the plaintext of those fields is never exposed.

Query concurrency cap. Queries are full scans, so a burst of concurrent queries can monopolize the CPU. auth.max_concurrent_queries limits the number of simultaneous queries per token; a query that would exceed the limit is rejected immediately with HTTP 429.
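One way to get "reject immediately" semantics is a non-blocking counter with a guard that releases its slot on drop. The sketch below is an assumption about how such a cap could be built, not the server's implementation; `QueryLimiter`, `QueryPermit`, and `try_acquire` are hypothetical names, and a per-token cap would keep one limiter per token.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Counts queries currently running against a fixed cap.
pub struct QueryLimiter {
    running: AtomicUsize,
    max: usize,
}

/// Held for the duration of one query; frees the slot when dropped.
pub struct QueryPermit {
    limiter: Arc<QueryLimiter>,
}

impl QueryLimiter {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(QueryLimiter {
            running: AtomicUsize::new(0),
            max,
        })
    }

    /// Number of queries currently holding a permit.
    pub fn in_flight(&self) -> usize {
        self.running.load(Ordering::Acquire)
    }
}

/// Try to take a slot without waiting. `None` means the cap is reached
/// and the caller should answer with HTTP 429.
pub fn try_acquire(limiter: &Arc<QueryLimiter>) -> Option<QueryPermit> {
    let max = limiter.max;
    limiter
        .running
        // Atomically increment only while below the cap.
        .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
            if n < max {
                Some(n + 1)
            } else {
                None
            }
        })
        .ok()
        .map(|_| QueryPermit {
            limiter: Arc::clone(limiter),
        })
}

impl Drop for QueryPermit {
    fn drop(&mut self) {
        // Runs on success, on error, and on panic unwinding alike.
        self.limiter.running.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Because the slot is released in `Drop`, a query that fails halfway cannot leak its slot and slowly starve the token.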

NDJSON export. POST /v1/logs/export takes the same LogQuery and responds with application/x-ndjson — intended for SIEM pipelines and auditor handoffs.
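NDJSON suits pipelines because each line is a complete JSON document, so a consumer can process the stream record by record. The sketch below shows only the framing; the `Record` type and its two fields are hypothetical stand-ins, and a real implementation would most likely use a JSON library rather than the hand-written escaping shown here.

```rust
use std::io::{self, Write};

/// Hypothetical, heavily simplified audit record.
pub struct Record {
    pub actor: String,
    pub action: String,
}

/// Escape a string for use inside a JSON string literal. Raw newlines
/// must never reach the output, or one record would span two lines.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Write one JSON object per line, each terminated by '\n'.
pub fn write_ndjson<W: Write>(mut out: W, records: &[Record]) -> io::Result<()> {
    for r in records {
        writeln!(
            out,
            "{{\"actor\":\"{}\",\"action\":\"{}\"}}",
            escape_json(&r.actor),
            escape_json(&r.action)
        )?;
    }
    Ok(())
}
```

A consumer can then read the response with any line-oriented reader and parse each line independently, without buffering the whole export.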

MCP: the LLM as auditor

quipu-mcp is a Model Context Protocol (MCP) server — it lets an LLM agent query and verify audit logs in natural language. Questions like "What documents did Alice delete last week?" can be answered by the agent directly from the log.

Why this design?

Making the MCP server an embedded store would create a file lock conflict with an already-running quipu-server — by design, there is only one writer. So quipu-mcp is an HTTP client, not an embedded store. That choice lets it reuse quipu-server's RBAC, query concurrency cap, key boundaries, and meta-auditing entirely for free. MCP adds zero new security surface.

Three tools are exposed to the agent.

Tool	What it does	Required permission
`query_logs`	Search audit logs with LogQuery	query
`get_entity_history`	Full version history of an entity	query
`verify_store_integrity`	Run chain integrity verification	administer

Tool failures are returned as a normal result with isError: true rather than as a JSON-RPC protocol error. This lets the agent read the error as text and adjust its reasoning — the session stays alive.
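A minimal sketch of that mapping, under stated assumptions: `ToolResult` is a simplified stand-in for the MCP tool-call result (text content plus the isError flag), and the `ToolFailure` variants and their messages are hypothetical examples of what a call to quipu-server might return.

```rust
/// Simplified stand-in for an MCP tool-call result.
#[derive(Debug, PartialEq)]
pub struct ToolResult {
    /// Text the agent will read, whether the call succeeded or not.
    pub text: String,
    /// Serialized as `isError` in the MCP result.
    pub is_error: bool,
}

/// Hypothetical ways a call to quipu-server can fail.
#[derive(Debug)]
pub enum ToolFailure {
    Forbidden,
    TooManyQueries,
    Upstream(String),
}

/// Both arms produce a normal result. Nothing here is turned into a
/// JSON-RPC protocol error, so the session continues either way.
pub fn to_tool_result(outcome: Result<String, ToolFailure>) -> ToolResult {
    match outcome {
        Ok(text) => ToolResult {
            text,
            is_error: false,
        },
        Err(failure) => {
            let text = match failure {
                ToolFailure::Forbidden => {
                    "token lacks the required permission".to_string()
                }
                ToolFailure::TooManyQueries => {
                    "query limit reached (HTTP 429); retry later".to_string()
                }
                ToolFailure::Upstream(msg) => format!("server error: {msg}"),
            };
            ToolResult {
                text,
                is_error: true,
            }
        }
    }
}
```

An agent that reads "query limit reached; retry later" as ordinary text can back off and try again, which it could not do if the transport had reported a protocol failure.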

And an important property: the agent's own queries are recorded in the meta-audit log. Every tool call goes through quipu-server's query path, so when access_log is enabled, "when the AI queried what" is captured automatically. This is a free property the MCP crate gets without any special handling.

crates/quipu-mcp/src/lib.rs — core design decision
// The agent talks MCP to this server; this server talks the
// ordinary token-authenticated HTTP API to quipu-server.
// Access is audited for free. Every tool call is an HTTP query
// against the server, so the agent's own reads land in the ledger
// through the same path — without this crate doing anything special.

Observability: you can't operate what you can't see

A server going down, a disk filling up, events piling into the DLQ — you need to know about these things in time to act. Quipu-Log supports this with Prometheus, healthz, and syslog.

Prometheus: GET /metrics

Metrics are served as lock-free atomic snapshots on a path separate from the writer thread. Even if the writer dies, /metrics stays alive — the most important information available at the most critical moment.

crates/quipu-middleware/src/metrics.rs — lock-free histogram
pub struct PipelineMetrics {
    queue_depth: AtomicI64,
    writes_ok: AtomicU64,
    writes_parked: AtomicU64,   // events sent to DLQ
    events_lost: AtomicU64,     // store + DLQ both failed
    latency_buckets: [AtomicU64; LATENCY_BUCKETS_SECS.len()], // histogram
    latency_sum_micros: AtomicU64,
}
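To show how a histogram built from atomics can be updated and read without a lock, here is a minimal sketch. The bucket bounds are assumed values (the real contents of LATENCY_BUCKETS_SECS are not shown in the excerpt), the `observations` counter and the method names are hypothetical, and storing cumulative bucket counts is one design choice among several.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Assumed upper bounds in seconds; the real values may differ.
const LATENCY_BUCKETS_SECS: [f64; 4] = [0.001, 0.01, 0.1, 1.0];

pub struct LatencyHistogram {
    /// latency_buckets[i] counts observations <= LATENCY_BUCKETS_SECS[i].
    latency_buckets: [AtomicU64; LATENCY_BUCKETS_SECS.len()],
    latency_sum_micros: AtomicU64,
    /// Total observations, including those above the largest bound.
    observations: AtomicU64,
}

impl LatencyHistogram {
    pub fn new() -> Self {
        LatencyHistogram {
            latency_buckets: std::array::from_fn(|_| AtomicU64::new(0)),
            latency_sum_micros: AtomicU64::new(0),
            observations: AtomicU64::new(0),
        }
    }

    /// Record one write latency. Only atomic adds, no lock taken.
    pub fn observe(&self, latency: Duration) {
        let secs = latency.as_secs_f64();
        for (bound, bucket) in LATENCY_BUCKETS_SECS.iter().zip(self.latency_buckets.iter()) {
            if secs <= *bound {
                bucket.fetch_add(1, Ordering::Relaxed);
            }
        }
        self.latency_sum_micros
            .fetch_add(latency.as_micros() as u64, Ordering::Relaxed);
        self.observations.fetch_add(1, Ordering::Relaxed);
    }

    /// Snapshot of the cumulative bucket counts for a scrape.
    pub fn bucket_counts(&self) -> Vec<u64> {
        self.latency_buckets
            .iter()
            .map(|b| b.load(Ordering::Relaxed))
            .collect()
    }

    pub fn sum_micros(&self) -> u64 {
        self.latency_sum_micros.load(Ordering::Relaxed)
    }

    pub fn count(&self) -> u64 {
        self.observations.load(Ordering::Relaxed)
    }
}
```

A scrape reads each counter independently, so a snapshot taken during a write may be off by one observation between fields. That is the usual trade for never blocking the writer.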

Key metrics and recommended alerts.

Metric	Type	Alert threshold
`quipu_writer_alive`	gauge	page immediately if == 0
`quipu_disk_full`	gauge	page immediately if == 1
`quipu_events_lost_total`	counter	page immediately if rate > 0
`quipu_dlq_entries`	gauge	warn if > 0 persists for several minutes
`quipu_disk_low`	gauge	warn if == 1
`quipu_write_latency_seconds`	histogram	monitor for rising p99 trend

GET /v1/healthz: three states

healthz returns JSON of the form {"status": "...", "reasons": [...]}, where status is one of "ok", "degraded", or "unhealthy".

unhealthy (HTTP 503) — writer thread is dead, ENOSPC latch is set, or the store root probe failed. Events are not being persisted.
degraded (HTTP 200) — writes are working but disk free space is below the threshold. The 200 is the point — a load balancer must not pull this instance. Alerts should fire on the status field, not the status code.
ok (HTTP 200) — everything is normal.

crates/quipu-middleware/src/health.rs — HealthState
pub struct HealthState {
    writer_alive: AtomicBool,
    disk_full: AtomicBool,    // ENOSPC latch — auto-clears on successful write
    low_disk: AtomicBool,     // early warning — below disk threshold
    // all atomic — healthz remains readable even if the writer dies
}
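The three-state rule can be derived from these flags roughly as follows. This is a sketch under assumptions rather than the crate's code: the `evaluate` function, the `HealthReport` type, and the reason strings are hypothetical, and the store root probe mentioned above is left out because the excerpt does not show how it is tracked.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub struct HealthState {
    writer_alive: AtomicBool,
    disk_full: AtomicBool,
    low_disk: AtomicBool,
}

impl HealthState {
    pub fn new(writer_alive: bool, disk_full: bool, low_disk: bool) -> Self {
        HealthState {
            writer_alive: AtomicBool::new(writer_alive),
            disk_full: AtomicBool::new(disk_full),
            low_disk: AtomicBool::new(low_disk),
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Degraded,
    Unhealthy,
}

pub struct HealthReport {
    pub status: Status,
    pub http_code: u16,
    /// Hypothetical reason strings for the "reasons" array.
    pub reasons: Vec<&'static str>,
}

pub fn evaluate(state: &HealthState) -> HealthReport {
    let mut reasons = Vec::new();
    // Conditions under which events are not being persisted.
    if !state.writer_alive.load(Ordering::Acquire) {
        reasons.push("writer_dead");
    }
    if state.disk_full.load(Ordering::Acquire) {
        reasons.push("disk_full");
    }
    if !reasons.is_empty() {
        return HealthReport { status: Status::Unhealthy, http_code: 503, reasons };
    }
    // Early warning only: still HTTP 200 so the instance stays in rotation.
    if state.low_disk.load(Ordering::Acquire) {
        reasons.push("low_disk");
        return HealthReport { status: Status::Degraded, http_code: 200, reasons };
    }
    HealthReport { status: Status::Ok, http_code: 200, reasons }
}
```

The function only loads atomics, so it can run on the request thread even when the writer thread is gone.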

syslog / SIEM mirror

Add "sink": {"syslog_udp": "10.0.0.5:514"} to the config and every audited event is mirrored to the SIEM as RFC 5424 syslog over UDP. The mirror is best-effort — if the backlog fills, lines are dropped and the write path is never blocked. The store is the system of record; the SIEM holds a skeleton copy.
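The "drop rather than block" behavior can be sketched with a bounded channel and a non-blocking send. This is an illustration under assumptions, not the actual sink: `Mirror`, `offer`, and the drop counter are hypothetical names, and the backlog size is assumed to be at least 1.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Write-path handle for a best-effort mirror.
pub struct Mirror {
    tx: SyncSender<String>,
    dropped: AtomicU64,
}

impl Mirror {
    /// `backlog` is the number of lines that may wait for the sender.
    /// The returned receiver belongs to the sending side of the mirror.
    pub fn new(backlog: usize) -> (Mirror, Receiver<String>) {
        let (tx, rx) = sync_channel(backlog);
        (
            Mirror {
                tx,
                dropped: AtomicU64::new(0),
            },
            rx,
        )
    }

    /// Never blocks. Returns true if the line was queued and false if
    /// it was dropped because the backlog is full or the sender is gone.
    pub fn offer(&self, line: String) -> bool {
        match self.tx.try_send(line) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                self.dropped.fetch_add(1, Ordering::Relaxed);
                false
            }
        }
    }

    /// Lines dropped so far.
    pub fn dropped_count(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}
```

In a design like this, a background thread would drain the receiver and send each line with `UdpSocket::send_to`. The write path only ever calls `offer`, so a slow or unreachable SIEM costs a counter increment and nothing more.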

Security note

GET /metrics requires an administer token. Metrics expose internal operational details (queue depth, DLQ state, and so on) that could help an attacker if served publicly. Make sure your Prometheus scrape configuration always includes the bearer token.

Closing the book: what it means to build a database on files

This book started with a single question: "How do you hand-build everything a database gives you for free, on top of plain files?" We built a WAL out of an append-only file, detected torn writes with CRC, tuned durability with fsync policy, and enforced a single writer with a file lock. We built indexes directly in memory and made reads non-blocking with snapshots. We layered Merkle trees on top for tamper-evidence, and searched without plaintext using AEAD and blind indexes. And at the end, we scaled this simple single-writer design — without making it complex — using client-side spooling and sharding. At every layer, one principle held without exception: keep it simple, and don't add complexity where you don't have to. That is the deepest lesson Quipu-Log — and every storage engine built on a filesystem — has to teach.

Check yourself

① Why is quipu-mcp an HTTP client rather than an embedded store? How does that connect to the single-writer design?
② Why does /v1/healthz return HTTP 200 in the "degraded" state? What operational problem would returning 503 cause?
③ Looking back at the whole book, pick three examples where something a database "gives you for free" was instead built directly on files, and explain each one.