The Quipu-Log Book
Part 4 · Re-creating what a DB gave you for free, on files

17 · Deletion and retention: segment unlink vs DELETE

You can't keep audit logs forever. When disk fills up, old entries have to go. In a DB you'd run DELETE FROM logs WHERE ts < cutoff and later vacuum to reclaim the space. Quipu-Log doesn't work that way. Instead it unlinks entire segment files. Let's look at why that approach is dramatically simpler and faster, and at the reason behind the choice to never delete the registry.

In one sentence

The retention policy unlinks old sealed segments whole — no row-level deletion or rewriting, the registry is preserved, and the active segment is never touched.

What you already know: DELETE + VACUUM vs. partition drop

In a relational DB such as PostgreSQL, "deleting old rows" is actually two steps. DELETE FROM logs WHERE ts < cutoff only marks rows as dead tuples; it doesn't reclaim disk space. A later VACUUM makes the space held by dead tuples reusable inside the table file, and only VACUUM FULL rewrites the table file itself to return the space to the OS. That rewrite is an O(n) operation proportional to table size, and it holds an exclusive lock on the table while it runs.

The smarter approach is a partition drop. If you set up monthly partitions like CREATE TABLE logs_2024_01 PARTITION OF logs, dropping January means just DROP TABLE logs_2024_01: one statement whose cost does not depend on how many rows the partition holds. The database removes the partition's underlying files, and the OS reclaims their blocks once nothing references them any more.

DB ↔ Filesystem

  • DB DELETE + VACUUM: mark rows dead → separate vacuum process cleans up → slow and heavy.
  • DB partition drop: return the whole table file to the OS → O(1), instant.
  • Quipu-Log segment unlink: same as a partition drop. Each segment file is its own partition. One call to std::fs::remove_file() is the equivalent of DROP TABLE.

RetentionPolicy: age and size, two axes

The retention policy is declared as a RetentionPolicy struct with two optional limits.

crates/quipu-core/src/retention.rs

pub struct RetentionPolicy {
    pub max_age:   Option<Duration>, // "delete anything older than this"
    pub max_bytes: Option<u64>,     // "delete oldest when size exceeds this"
}

impl RetentionPolicy {
    pub fn days(days: u64) -> Self { ... }
    pub const fn with_max_bytes(mut self, n: u64) -> Self { ... }
}

The two conditions combine with OR. The moment either is exceeded, the oldest sealed segment is deleted first. With 90 days + 50 GB: "delete if anything is older than 90 days, and also delete if total size exceeds 50 GB."
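The OR rule can be sketched as a selection over sealed-segment summaries. This is a minimal illustration, not Quipu-Log's code: PolicySketch, SealedInfo, and select_doomed are hypothetical names, and the sketch assumes the byte budget is measured over the sealed segments passed in.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the chapter's RetentionPolicy (same two fields).
#[derive(Debug, Clone, Copy)]
pub struct PolicySketch {
    pub max_age: Option<Duration>,
    pub max_bytes: Option<u64>,
}

/// Hypothetical summary of one sealed segment.
#[derive(Debug, Clone, Copy)]
pub struct SealedInfo {
    pub seq: u64,           // segment sequence number; lower means older
    pub max_timestamp: u64, // newest record in the segment, in microseconds
    pub bytes: u64,         // on-disk size
}

/// Returns the sequence numbers that either rule would drop, sorted ascending.
/// `sealed` must be ordered oldest first.
pub fn select_doomed(policy: &PolicySketch, sealed: &[SealedInfo], now_micros: u64) -> Vec<u64> {
    let mut doomed = Vec::new();
    let mut remaining: u64 = sealed.iter().map(|s| s.bytes).sum();

    // Age rule: a segment expires only when even its newest record is past the cutoff.
    let cutoff = policy
        .max_age
        .map(|age| now_micros.saturating_sub(age.as_micros() as u64));
    for s in sealed {
        if cutoff.map_or(false, |c| s.max_timestamp < c) {
            doomed.push(s.seq);
            remaining -= s.bytes;
        }
    }

    // Size rule: keep dropping the oldest survivor while still over budget.
    if let Some(max) = policy.max_bytes {
        for s in sealed {
            if remaining <= max {
                break;
            }
            if !doomed.contains(&s.seq) {
                doomed.push(s.seq);
                remaining -= s.bytes;
            }
        }
    }

    doomed.sort_unstable();
    doomed
}
```

With both limits set, a segment that is well inside max_age can still be selected by the size rule, which is the practical meaning of "combine with OR".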

How deletion works: the O(1) secret of unlink

The actual deletion is handled by Table::purge_older_than() and Table::purge_oldest_sealed().

crates/quipu-core/src/storage/table.rs

pub fn purge_older_than(&mut self, cutoff_micros: u64) -> Result<usize> {
    let doomed: Vec<u64> = self.sealed.iter()
        .filter(|(_, s)| s.meta.max_timestamp < cutoff_micros) // even the newest row is expired
        .map(|(&seq, _)| seq)
        .collect();
    for seq in &doomed {
        if let Some(s) = self.sealed.remove(seq) {
            std::fs::remove_file(s.path)?;            // file unlink — O(1)
            let _ = std::fs::remove_file(meta_path(&self.dir, *seq));
        }
    }
    Ok(doomed.len())
}

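On Unix, remove_file() calls the unlink(2) system call. It doesn't touch the file's contents; it only removes the directory entry (the name → inode link). Once the inode's link count drops to zero and no process still holds the file open, the filesystem frees the disk blocks. No record is read or rewritten, so the cost does not depend on how many records the segment holds. Freeing the blocks of a 4 GB file can take somewhat longer than for a 64 MB file on some filesystems, but that is metadata work and remains far cheaper than rewriting rows; for retention purposes it is effectively O(1) per segment.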
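The following sketch shows the whole-file drop in isolation, using only the standard library. The helper names and the temp-file naming are hypothetical and do not reflect Quipu-Log's on-disk layout.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Creates a throwaway "segment" file of `len` bytes under the system temp dir.
pub fn make_fake_segment(name: &str, len: usize) -> io::Result<PathBuf> {
    let path = std::env::temp_dir().join(format!("{}-{}.seg", name, std::process::id()));
    let mut f = File::create(&path)?;
    f.write_all(&vec![0u8; len])?;
    f.sync_all()?;
    Ok(path)
}

/// Drops a whole segment with a single call and reports how many bytes the
/// file held. No byte of the file's contents is read or rewritten.
pub fn drop_segment(path: &Path) -> io::Result<u64> {
    let bytes = fs::metadata(path)?.len();
    fs::remove_file(path)?; // corresponds to unlink(2) on Unix
    Ok(bytes)
}
```

Calling drop_segment a second time on the same path returns an error, because the directory entry is already gone. On Unix, a reader that opened the file before the call can keep reading until it closes its handle; the blocks are released only then.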

Analogy

Think of a librarian who, rather than tearing out pages one by one, places a whole book onto the returns cart. There's no need to read a single page — it's done in an instant.

[Figure: two panels comparing DB DELETE + VACUUM with Quipu-Log segment unlink]

DB: DELETE + VACUUM. The table heap interleaves dead and live rows. ① DELETE marks rows dead only (no space reclaimed). ② VACUUM cleans dead rows (takes time). ③ VACUUM FULL rewrites the table (O(n), locking). Cost: O(deleted rows + live row rearrangement). Side effects: blocks concurrent reads, space fragmentation. Integrity: rewriting live rows means Merkle hashes must be recomputed.

Quipu-Log: segment unlink. seg-0 and seg-1 (max_ts expired) are removed with remove_file(); seg-2 (active) is never dropped. ✓ O(1), independent of file size. ✓ Surviving records are never touched. ✓ Merkle spine preserved (incl. deleted prefix). Registry tables are preserved indefinitely.

DB DELETE+VACUUM is O(n) and requires locking. Quipu-Log segment unlink is O(1) regardless of file size, and never touches surviving records.

Two absolute rules: preserve the active segment, preserve the registry

Quipu-Log's retention has two exceptions. Neither is ever deleted, regardless of policy settings.

1. The active segment is never deleted

The code only ever targets sealed segments. The active segment — the one currently being appended to — is not yet closed, so it is simply not a candidate for retention decisions.

As a result, max_bytes is a target, not a hard ceiling. However large the active segment grows, it cannot be deleted, so the store can exceed the configured value by up to one active segment's worth. Once that segment seals at the next rollover, it becomes an ordinary sealed segment, and a later retention run can drop it like any other.

crates/quipu-core/src/retention.rs (explanatory comment)

// Enforcement drops whole sealed segments, so purging never rewrites data
// and costs one unlink per segment.
//
// The active segment is never dropped, so max_bytes is a *target*, not a
// hard ceiling: the store can exceed it by up to one active segment per
// table until the next roll.

2. The registry is never deleted

apply_retention() only purges the logs and relations tables. The registry (registry/<type>/) is untouched. A code comment spells out the reason.

crates/quipu-core/src/retention.rs (explanatory comment)

// Registries (and their meta/checkpoint bookkeeping) are intentionally not
// purged and not counted against max_bytes:
// version history is what lets old logs keep rendering as-recorded values.

For a log record from 90 days ago to display "the user's name at the time," the registry version from that time must still be alive. Even if Alice later changed her name to "Alicia," a 90-day-old log should still render as "Alice." Delete the registry and there is no way to recover the actor/target information in past logs.

In practice, registry records are proportional to the number of entities. 100,000 users at an average of two versions each is 200,000 records — negligible compared to tens of millions of log records.
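The "name at the time" lookup can be sketched as an as-of query over version history. RegistrySketch and its methods are hypothetical, and keying versions by timestamp is an assumption made for illustration; the real store may instead pin a version identifier in each log record.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory registry: for each entity id, versions keyed by the
/// timestamp (microseconds) at which that version became current.
#[derive(Default)]
pub struct RegistrySketch {
    versions: BTreeMap<String, BTreeMap<u64, String>>,
}

impl RegistrySketch {
    /// Appends a version; older versions are kept, never overwritten.
    pub fn record(&mut self, entity: &str, valid_from: u64, name: &str) {
        self.versions
            .entry(entity.to_string())
            .or_default()
            .insert(valid_from, name.to_string());
    }

    /// Returns the name that was current at `ts`: the latest version whose
    /// `valid_from` is not after `ts`.
    pub fn name_as_of(&self, entity: &str, ts: u64) -> Option<&str> {
        self.versions
            .get(entity)?
            .range(..=ts)
            .next_back()
            .map(|(_, name)| name.as_str())
    }
}
```

If "Alice" is recorded at timestamp 100 and "Alicia" at 500, a log stamped 200 resolves to "Alice". Removing the older version would leave that log with nothing to resolve against, which is why the history is kept.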

Why this design?

If the registry were purged too, registry retention would have to be coordinated with log retention so that every surviving log could still find the version it refers to, which adds significant design complexity. The fact that the registry stays small is precisely what makes "preserve the registry forever" a practical rule rather than a burden.

Re-anchoring the checkpoint after purge

One more thing happens after a purge. The previous checkpoint may reference a Merkle root that covered records in the deleted segments. That checkpoint is no longer in a verifiable state. So a new checkpoint is issued automatically right after retention runs.

crates/quipu-core/src/store.rs

pub fn apply_retention(&mut self) -> Result<usize> {
    // ... purge_older_than(), purge_to_byte_budget() ...
    if dropped_main > 0 {
        // re-anchor after the unlink: a fresh checkpoint covers the surviving records
        self.write_checkpoint()?;
    }
    Ok(dropped)
}

The Merkle spine (the list of leaf hashes) is not affected by retention — leaf hashes for deleted segments remain in the spine, allowing proof that "those records once existed." The details of this are covered in Ch. 20 (Merkle History Tree).

Recap

  • DB DELETE+VACUUM deletes rows one by one and reclaims space separately — O(n), heavy.
  • Like a DB partition drop, Quipu-Log unlinks entire segment files — O(1), surviving records untouched.
  • RetentionPolicy combines age (max_age) and size (max_bytes) with OR; when a condition is met, the oldest sealed segment is removed first.
  • The active segment is never deleted → max_bytes is a target, not a hard ceiling.
  • The registry is never deleted → past logs always render with the values that were current when they were recorded.
  • A new checkpoint is issued automatically after a purge to keep the integrity-verification baseline current.

Check yourself

① Explain why segment unlink is faster than row-level DELETE from an OS filesystem perspective. (Hint: inode reference count.)
② With RetentionPolicy::days(90).with_max_bytes(50 * 1024 * 1024 * 1024), can a segment that is only 70 days old be deleted due to a size overrun?
③ Explain why the registry is never deleted using "what data is needed to display an actor's name from a 90-day-old log?" as your starting point.