If you’ve used a database, you already know the word “log” — the transaction log, the WAL (Write-Ahead Log). This chapter has exactly one goal: to plant a shift in perspective. In a database the log is a supporting actor that backs up the main table; in Quipu-Log the log is the lead — it is the database.
Quipu-Log never edits data. It only appends to the end of a file. That one rule buys crash safety, tamper-evidence, and trivial backups almost for free.
First, what you already know: the DB’s WAL
When a relational database runs UPDATE accounts SET balance = 100 WHERE id = 42, it actually writes in two places.
- The WAL (the log) — it first appends the intent (“set account 42’s balance to 100”) sequentially to the end of a file. This is fast (just appending to the end).
- The data pages (the main body) — it then finds the B-tree page where the actual table lives and overwrites the value in place. This touches scattered spots on disk and is slow, so it happens later, lazily.
The reason the WAL comes first (Write-Ahead) is crashes. If the power dies mid-overwrite, the “intent” still sits in the WAL, so after reboot the database can replay it to recover. The WAL is the source of truth; the data pages are closer to a cache that lets you read that truth quickly.
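The log-first, pages-later order can be sketched with a toy in-memory model. This is an illustration only, not how any particular database implements its WAL; `ToyDb`, `Intent`, and the method names are hypothetical.

```rust
use std::collections::HashMap;

/// One logged intent: "set account `id`'s balance to `balance`".
#[derive(Clone, Copy, Debug, PartialEq)]
struct Intent {
    id: u64,
    balance: i64,
}

/// Toy database: an append-only log plus overwrite-in-place "pages".
#[derive(Default)]
struct ToyDb {
    wal: Vec<Intent>,         // source of truth, only ever appended to
    pages: HashMap<u64, i64>, // fast-read copy, updated lazily
}

impl ToyDb {
    /// Step 1: append the intent to the log. No page is touched yet.
    fn log(&mut self, intent: Intent) {
        self.wal.push(intent);
    }

    /// Step 2 (later, or after a crash): apply every logged intent in order.
    /// Setting a value is idempotent, so re-applying old intents is harmless.
    fn replay(&mut self) {
        for intent in &self.wal {
            self.pages.insert(intent.id, intent.balance);
        }
    }

    /// Simulate a crash that loses the pages but leaves the log intact.
    fn crash(&mut self) {
        self.pages.clear();
    }
}
```

After `log`, `crash`, and `replay`, the pages again hold the value the log recorded, which is why the log can be treated as the original and the pages as a rebuildable copy.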
Think of a bank passbook. Transactions are only ever appended, line by line — you never erase or edit a line that’s already printed. The “current balance” is shown on the bottom line, but that’s really just the sum of every transaction. The transaction history (the log) is the original; the balance (the current state) is a summary of it.
The shift: what if we drop the main body entirely?
Here is Quipu-Log’s decisive choice. “What if we don’t build data pages (the main body) at all, and keep only the WAL?”
An audit log exists to record things that happened. Things that happened don’t change later — “Alice deleted the document yesterday” is true forever. So there’s never a value to overwrite. And if there’s nothing to overwrite, you don’t need the heavy B-tree body that exists to support overwriting.
In a DB, the WAL is a helper for the main body (the tables). In Quipu-Log, that single WAL is the database. The equation “log = data” is the first pillar holding up this entire library.
How: appending frames to the end of a file
So what actually piles up in the file? Here is the storage engine’s own module description. Quipu-Log wraps every record in a fixed frame and appends it to the end of the file.
crates/quipu-core/src/storage/mod.rs, module docs:
// A segment file starts with a header (ALOG magic + format version + base index),
// and every record is wrapped in this frame:
[u32 LE payload length][u32 crc32(payload)][u64 timestamp][payload]
// ↑ body length ↑ body checksum ↑ record time ↑ the data
Take it field by field and you can see why these four pieces sit in this order.
- `payload length`: how many bytes the body is. When reading, it tells you “where does this record end?” It’s the signpost to the start of the next record.
- `crc32(payload)`: a checksum (a short fingerprint) of the body. If the disk corrupted a bit, or a write was left half-finished, the value you recompute on read won’t match, and you know “this record is damaged.” (Covered in Ch. 9.)
- `timestamp`: when it was recorded. The key point is that it lives in the frame header, not the body. That way you can skim just the times, without deserializing the body, to decide things like “this file is March’s.” (Used by retention, Ch. 17.)
- `payload`: the actual audit record. (Serialization, Ch. 10.)
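The frame layout can be sketched as follows. This is a minimal illustration under assumptions: all three header fields are little-endian (the module docs state this only for the length), the checksum is the common IEEE CRC-32 variant, and the segment header is left out. `encode_frame`, `skim_timestamps`, and the helper names are hypothetical, not Quipu-Log's API.

```rust
/// Frame header size: u32 length + u32 crc + u64 timestamp.
const HEADER_LEN: usize = 4 + 4 + 8;

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320). Assumed variant.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg(); // all ones if the low bit is set
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn read_u32_le(buf: &[u8], pos: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[pos..pos + 4]);
    u32::from_le_bytes(b)
}

fn read_u64_le(buf: &[u8], pos: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[pos..pos + 8]);
    u64::from_le_bytes(b)
}

/// Build one frame: [length][crc32(payload)][timestamp][payload].
fn encode_frame(timestamp: u64, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(HEADER_LEN + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(&crc32(payload).to_le_bytes());
    frame.extend_from_slice(&timestamp.to_le_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Walk the frames in `buf` and collect only the timestamps.
/// The payload bytes are skipped using the length field, never decoded.
fn skim_timestamps(buf: &[u8]) -> Vec<u64> {
    let mut timestamps = Vec::new();
    let mut pos = 0;
    while buf.len() - pos >= HEADER_LEN {
        let len = read_u32_le(buf, pos) as usize;
        let body_start = pos + HEADER_LEN;
        if buf.len() - body_start < len {
            break; // payload cut short: stop at the last complete frame
        }
        timestamps.push(read_u64_le(buf, pos + 8));
        pos = body_start + len;
    }
    timestamps
}
```

Because the length comes first, a reader can hop from header to header; because the timestamp sits in the header, that hop is enough to answer time-based questions.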
And all of this happens with only std::fs (Rust’s standard file API). No special OS features, no DB engine. Which is why the module docs proudly note:
// Only std::fs / std::io are used, so behaviour is identical on every OS Rust targets.
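Appending with nothing but the standard library might look like the sketch below. `append_bytes` is a hypothetical helper, and calling `sync_data` after every write is an assumption made here for illustration; the source does not say when Quipu-Log flushes.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append `bytes` to the end of the file at `path`, creating it if needed.
/// Returns the file length after the write.
fn append_bytes(path: &Path, bytes: &[u8]) -> io::Result<u64> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true) // every write goes to the current end of the file
        .open(path)?;
    file.write_all(bytes)?;
    file.sync_data()?; // ask the OS to push the data to stable storage
    Ok(file.metadata()?.len())
}
```

Opening in append mode means the code never seeks backwards, so existing bytes are not on any write path.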
Why append-only — a perfect match for audit logs
The append-only rule looks simple, but it hands you the properties an audit log wants, one after another, for free.
| Because it’s append-only… | You get |
|---|---|
| bytes already written are never touched | Tamper-evidence becomes easy. Editing the past is not a “normal operation,” so traces of an edit are themselves evidence of an incident (Part 5). |
| a write is always “append to the end” | It’s fast. Sequential writes — the disk head doesn’t jump around. |
| old data never changes | Backups are safe. You can copy the old files even while the daemon runs (no lock worries). |
| only the last record can be incomplete | Crash recovery is simple. Just trim the one broken piece at the end of the file (Ch. 12). |
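
The last row of the table can be sketched in code: scan frames from the start, stop at the first one that is cut short or fails its checksum, and truncate the file there. This is an illustration under assumptions (little-endian header fields, IEEE CRC-32, no segment header), not Quipu-Log's actual recovery routine from Ch. 12; `valid_prefix_len` and `truncate_torn_tail` are hypothetical names.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;

/// Frame header size: u32 length + u32 crc + u64 timestamp.
const HEADER_LEN: usize = 16;

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320). Assumed variant.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn read_u32_le(buf: &[u8], pos: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[pos..pos + 4]);
    u32::from_le_bytes(b)
}

/// Length in bytes of the longest prefix of `buf` made only of intact frames.
fn valid_prefix_len(buf: &[u8]) -> usize {
    let mut pos = 0;
    while buf.len() - pos >= HEADER_LEN {
        let len = read_u32_le(buf, pos) as usize;
        let stored_crc = read_u32_le(buf, pos + 4);
        let body_start = pos + HEADER_LEN;
        if buf.len() - body_start < len {
            break; // payload was cut short by the crash
        }
        if crc32(&buf[body_start..body_start + len]) != stored_crc {
            break; // payload is present but damaged
        }
        pos = body_start + len;
    }
    pos
}

/// Drop the broken tail of the file, if any. Returns the length kept.
fn truncate_torn_tail(path: &Path) -> io::Result<u64> {
    let bytes = fs::read(path)?;
    let keep = valid_prefix_len(&bytes) as u64;
    if keep < bytes.len() as u64 {
        let file = OpenOptions::new().write(true).open(path)?;
        file.set_len(keep)?;
        file.sync_all()?;
    }
    Ok(keep)
}
```

Truncation is the only place this sketch shrinks a file, and it only ever removes bytes that never formed a complete record.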
The price of dropping the body (the B-tree index) is reads. To find “records about document 42” you’d in principle have to scan the log. Quipu-Log solves this with secondary indexes — how that’s possible is in Ch. 15 (indexing) and Ch. 16 (query execution). For now, just remember: “writes are simple and fast; reads need extra machinery.”
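To make the trade-off concrete, here is a generic sketch of a full scan next to a lookup table built from the same log. It shows the general idea of a secondary index only; it is not Quipu-Log's index design, and `Record`, `scan`, and `build_index` are hypothetical names.

```rust
use std::collections::HashMap;

/// A decoded audit record, reduced to the two fields this sketch needs.
struct Record {
    doc_id: u64,
    action: &'static str,
}

/// Without an index: visit every record and keep the matching positions.
fn scan(log: &[Record], doc_id: u64) -> Vec<usize> {
    log.iter()
        .enumerate()
        .filter(|(_, record)| record.doc_id == doc_id)
        .map(|(position, _)| position)
        .collect()
}

/// With an index: one pass up front maps each document id to the
/// positions of its records, so later lookups skip the scan.
fn build_index(log: &[Record]) -> HashMap<u64, Vec<usize>> {
    let mut index: HashMap<u64, Vec<usize>> = HashMap::new();
    for (position, record) in log.iter().enumerate() {
        index.entry(record.doc_id).or_default().push(position);
    }
    index
}
```

Since records never change after they are written, positions stored in such an index never go stale; the index only has to grow as the log grows.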
“The code doesn’t even support overwriting” is a strong security property — there’s no path for a bug to corrupt past records. But this is only an application-level promise; an attacker with direct disk access can still edit the file. Detecting that is the Merkle tree’s job (Part 5) — append-only doesn’t prevent tampering, it makes tampering clearly abnormal.
Recap
- A DB keeps both the log (original) and the body (fast reads).
- Quipu-Log keeps only the log. Audit records are never edited, so the body is unnecessary.
- Records are appended only to the end as `[length][CRC][time][body]` frames, with nothing but `std::fs`.
- Append-only gives fast writes, easy backups, simple recovery, and tamper-evidence friendliness. The cost is that reads need a separate index.
① Explain “the WAL is the database” in one sentence, using the passbook analogy.
② Why does Quipu-Log not build B-tree data pages? (Hint: one essential property of audit records.)
③ Why is the timestamp in the frame header rather than inside the body?