09 · Record framing: length, CRC32, magic/version

When you string records one after another into a file, one problem immediately surfaces — "where does one record end and the next begin?" Looking at raw bytes, there are no boundaries. On top of that, disks occasionally corrupt bits silently. This chapter explains how Quipu-Log solves the boundary problem with frames and the silent-corruption problem with CRC32.

In one sentence

Each record is wrapped in a [u32 length][u32 CRC32][u64 timestamp][payload] frame. CRC catches accidental corruption only — detecting intentional tampering is the Merkle tree's job (Part 5).

What you already know: DB pages and checksums

Think about a database like PostgreSQL. Data is stored in pages of 8 KB by default. When data checksums are enabled, each page header carries a checksum that is recomputed when the page is read from disk and compared against the stored value. A mismatch means "this page is a torn page (partially written) or has a disk error."

Quipu-Log faces the same problem — but instead of fixed-size pages, it deals with variable-length records. So instead of page checksums, it uses a per-record frame header + CRC.

DB ↔ Filesystem

In a DB, fixed-size page headers hold checksums to detect torn pages. In Quipu-Log, every variable-length record is preceded by a frame header (length + CRC + timestamp) that handles both boundary detection and corruption in one shot.

Inside a file: MAGIC + header + frames

A segment file has this structure.

Segment file binary layout. The first 13 bytes are the header (MAGIC 4 + version 1 + base_index 8), followed by a stream of record frames. The frame header is a fixed 16 bytes (FRAME_HEADER).

Let's verify the exact constants in code.

crates/quipu-core/src/storage/segment.rs
pub const MAGIC: [u8; 4] = *b"ALOG";
pub const FORMAT_VERSION: u8 = 2;
pub const SEGMENT_HEADER: usize = MAGIC.len() + 1 + 8; // = 13

// Frame header: u32 length + u32 CRC + u64 timestamp
pub const FRAME_HEADER: usize = 4 + 4 + 8; // = 16
/// Maximum size of a single record: 64 MiB
pub const MAX_RECORD: u32 = 64 * 1024 * 1024;

MAGIC and FORMAT_VERSION: file identity check

The very first thing that happens when opening a file is an identity check. The code verifies that the first four bytes are ALOG and the next byte is 2 (the current format version).

crates/quipu-core/src/storage/segment.rs (read_header)
if head[0..4] != MAGIC {
    return Err(Error::Corrupt { /* … */
        reason: "bad magic (not an audit segment file)".into(),
    });
}
let version = head[4];
if version != FORMAT_VERSION {
    return Err(Error::Corrupt { /* … */
        reason: format!("unsupported segment format version {version}"),
    });
}

Why two separate checks?

MAGIC (ALOG) — identifies "this is a Quipu-Log segment file." It catches files that accidentally ended up in the segment directory or are a completely different format, and rejects them fast.
FORMAT_VERSION — the same ALOG file can have a changed internal structure. The current implementation only reads version 2. Version 1 had per-record hash chaining, which was removed in v2 when the design moved to a Merkle spine. A version mismatch causes an immediate error rather than attempting a parse that would silently mangle data.
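
The two checks can be tried out in isolation with a minimal, self-contained sketch. The constants are copied from the excerpt above; the HeaderError type and the encode_header and parse_header functions are hypothetical names for illustration, and storing base_index as little-endian is an assumption (the chapter only shows little-endian encoding for the frame fields).

```rust
// Constants copied from the chapter's segment.rs excerpt.
pub const MAGIC: [u8; 4] = *b"ALOG";
pub const FORMAT_VERSION: u8 = 2;
pub const SEGMENT_HEADER: usize = MAGIC.len() + 1 + 8; // = 13

/// Hypothetical error type for this sketch (the real code returns Error::Corrupt).
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u8),
}

/// Builds the 13-byte segment header.
/// ASSUMPTION: base_index is stored little-endian.
pub fn encode_header(base_index: u64) -> [u8; SEGMENT_HEADER] {
    let mut head = [0u8; SEGMENT_HEADER];
    head[0..4].copy_from_slice(&MAGIC);
    head[4] = FORMAT_VERSION;
    head[5..13].copy_from_slice(&base_index.to_le_bytes());
    head
}

/// Checks MAGIC first, then the version, and returns base_index on success.
pub fn parse_header(bytes: &[u8]) -> Result<u64, HeaderError> {
    if bytes.len() < SEGMENT_HEADER {
        return Err(HeaderError::TooShort);
    }
    // Wrong file type entirely: reject before looking at anything else.
    if bytes[0..4] != MAGIC {
        return Err(HeaderError::BadMagic);
    }
    // Right file type, but a layout this reader does not understand.
    let version = bytes[4];
    if version != FORMAT_VERSION {
        return Err(HeaderError::UnsupportedVersion(version));
    }
    let mut idx = [0u8; 8];
    idx.copy_from_slice(&bytes[5..13]);
    Ok(u64::from_le_bytes(idx))
}
```

Keeping the two failures as separate variants mirrors the two separate error messages in read_header: a caller can tell "not our file" apart from "our file, old format."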

Analogy

MAGIC is the "TO:" address on the outside of an envelope; FORMAT_VERSION is the form-version number inside the letter. A mis-addressed envelope gets returned without opening; a letter on an old form gets the reply "sorry, we can't read this format anymore."

The three frame header fields: length, CRC, timestamp

Let's walk through each of the three frame header fields and what they actually do.

① u32 length (payload length)

How many bytes this record's payload is. The reader uses this value to know "I need to read N bytes and then this record is done," and the very next byte after that is the start of the next frame header. This is the most fundamental way to pack variable-length records end to end.

But a length field alone is dangerous — a crash that leaves a garbage value there could cause the reader to try to allocate a "1 GB record" and OOM. That's why the MAX_RECORD guard exists.

crates/quipu-core/src/storage/segment.rs (inside the skim function)
let len = u32::from_le_bytes(header[0..4].try_into().unwrap());
if len > MAX_RECORD || total - valid - (FRAME_HEADER as u64) < len as u64 {
    break; // length is unrealistic or extends past the end of the file → torn tail
}

MAX_RECORD is 64 MiB. Any "record" larger than that is treated as a crashed and corrupted length field, and reading stops there.
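
To see how the length field and the guard work together, here is a rough sketch of a forward scan over an in-memory frame stream. It is not the project's skim function: skim_frames and push_frame are hypothetical helpers, the input is assumed to start right after the 13-byte segment header, and the CRC field is left as zero because this scan never checks it.

```rust
pub const FRAME_HEADER: usize = 4 + 4 + 8; // = 16
pub const MAX_RECORD: u32 = 64 * 1024 * 1024;

/// Walks the frames in `data` using only the length fields.
/// Returns (number of complete frames, byte length of the valid prefix).
pub fn skim_frames(data: &[u8]) -> (usize, usize) {
    let mut valid = 0usize; // bytes known to hold complete frames
    let mut count = 0usize;
    loop {
        let remaining = data.len() - valid;
        if remaining < FRAME_HEADER {
            break; // not even a full frame header left
        }
        let len = u32::from_le_bytes([
            data[valid],
            data[valid + 1],
            data[valid + 2],
            data[valid + 3],
        ]);
        // Same two conditions as the excerpt above: the length is either
        // unrealistic or runs past the end of the data, so this is a torn tail.
        if len > MAX_RECORD || remaining - FRAME_HEADER < len as usize {
            break;
        }
        valid += FRAME_HEADER + len as usize;
        count += 1;
    }
    (count, valid)
}

/// Hypothetical test helper: appends one frame with a zeroed CRC field.
pub fn push_frame(out: &mut Vec<u8>, timestamp: u64, payload: &[u8]) {
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&0u32.to_le_bytes());
    out.extend_from_slice(&timestamp.to_le_bytes());
    out.extend_from_slice(payload);
}
```

The second return value is the offset at which a recovery pass could truncate the file: everything before it consists of whole frames, everything after it is the torn tail.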

② u32 CRC32 (checksum)

A CRC32 checksum of the payload bytes. The header fields themselves (length, timestamp) are not covered by it. After reading a record, the reader recomputes CRC32 over the payload and compares it with the stored value; on a mismatch, the read fails with a corruption error.

crates/quipu-core/src/storage/segment.rs (append)
let crc = crc32fast::hash(payload);
self.writer.write_all(&(payload.len() as u32).to_le_bytes())?;
self.writer.write_all(&crc.to_le_bytes())?;
self.writer.write_all(&timestamp.to_le_bytes())?;
self.writer.write_all(payload)?;

Reading is symmetric.

crates/quipu-core/src/storage/segment.rs (SegmentReader::next_record)
self.reader.read_exact(&mut buf)?;
if crc32fast::hash(&buf) != crc {
    return Err(Error::Corrupt { /* … */ reason: "crc mismatch".into() });
}
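
The write and read paths can be exercised end to end with a small in-memory sketch. To stay dependency-free it uses a plain bitwise CRC-32 (the standard reflected polynomial 0xEDB88320) as a stand-in for crc32fast::hash, which computes the same checksum much faster. The names encode_frame, decode_frame, and FrameError are hypothetical and not taken from the codebase.

```rust
pub const FRAME_HEADER: usize = 4 + 4 + 8; // = 16

/// Bitwise CRC-32 (IEEE), a slow stand-in for crc32fast::hash.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg(); // all ones if the low bit is set
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum FrameError {
    Truncated,
    CrcMismatch,
}

/// Serializes one frame in the order used by append:
/// length, CRC, timestamp, payload (all integers little-endian).
pub fn encode_frame(timestamp: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(FRAME_HEADER + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(&timestamp.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Parses one frame and verifies the payload against the stored CRC.
pub fn decode_frame(frame: &[u8]) -> Result<(u64, &[u8]), FrameError> {
    if frame.len() < FRAME_HEADER {
        return Err(FrameError::Truncated);
    }
    let len = u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
    let stored = u32::from_le_bytes([frame[4], frame[5], frame[6], frame[7]]);
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&frame[8..16]);
    let payload = frame
        .get(FRAME_HEADER..FRAME_HEADER + len)
        .ok_or(FrameError::Truncated)?;
    if crc32(payload) != stored {
        return Err(FrameError::CrcMismatch);
    }
    Ok((u64::from_le_bytes(ts), payload))
}
```

Flipping any single bit of the payload in an encoded frame makes decode_frame return CrcMismatch, which is the accidental-corruption case the checksum is there for.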

③ u64 timestamp

When the record was written (microseconds since Unix epoch). The interesting part is that this lives in the header, not inside the payload. That means the timestamp can be read without deserializing the payload at all. When retention needs to figure out a segment's time range ("delete segments older than 90 days"), it only needs to skim the frame headers — it never has to unwrap individual records.
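
A header-only skim for retention might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's retention code: timestamp_range and segment_expired are hypothetical names, the input is the frame stream after the segment header, and "expired" is assumed to mean that even the newest record is older than the allowed age.

```rust
pub const FRAME_HEADER: usize = 4 + 4 + 8; // = 16
pub const MAX_RECORD: u32 = 64 * 1024 * 1024;

/// Returns (oldest, newest) timestamp in a frame stream, reading only the
/// 16-byte headers and jumping over every payload.
pub fn timestamp_range(data: &[u8]) -> Option<(u64, u64)> {
    let mut offset = 0usize;
    let mut range: Option<(u64, u64)> = None;
    while data.len() - offset >= FRAME_HEADER {
        let h = &data[offset..offset + FRAME_HEADER];
        let len = u32::from_le_bytes([h[0], h[1], h[2], h[3]]);
        let mut ts_bytes = [0u8; 8];
        ts_bytes.copy_from_slice(&h[8..16]);
        let ts = u64::from_le_bytes(ts_bytes);

        // Stop at a torn tail, as in the length guard shown earlier.
        let rest = data.len() - offset - FRAME_HEADER;
        if len > MAX_RECORD || rest < len as usize {
            break;
        }
        range = Some(match range {
            None => (ts, ts),
            Some((lo, hi)) => (lo.min(ts), hi.max(ts)),
        });
        // Skip the payload without looking at a single byte of it.
        offset += FRAME_HEADER + len as usize;
    }
    range
}

/// Hypothetical retention test: true if the newest record is older than
/// `max_age_micros`. An empty segment is never reported as expired.
pub fn segment_expired(data: &[u8], now_micros: u64, max_age_micros: u64) -> bool {
    match timestamp_range(data) {
        Some((_, newest)) => newest.saturating_add(max_age_micros) < now_micros,
        None => false,
    }
}
```

For the 90-day example, max_age_micros would be 90 × 24 × 3600 × 1,000,000 = 7,776,000,000,000.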

CRC32: accidental corruption detection, nothing more

One point about CRC32 needs to be made absolutely clear.

Caution — CRC is not a security tool

CRC32 catches accidental corruption (disk bit flips, transmission errors, torn writes). It does not detect intentional tampering. An attacker with direct disk access can change the payload and then recompute and overwrite the CRC. Detecting intentional tampering is the Merkle history tree's job (Part 5).

The code comments say exactly the same thing.

crates/quipu-core/src/storage/segment.rs (segment docs)
// The segment CRC catches only accidental corruption.
// Tamper-evidence lives in the spine, not the segments.
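
The difference between an accident and an attack is easy to demonstrate. In the sketch below, a random bit flip fails the CRC check, but a deliberate rewrite that also refreshes the length and CRC passes it. All names (encode_frame, frame_is_valid, forge_frame) are hypothetical, and the bitwise crc32 is a dependency-free stand-in for crc32fast::hash.

```rust
pub const FRAME_HEADER: usize = 4 + 4 + 8; // = 16

/// Bitwise CRC-32 (IEEE), a slow stand-in for crc32fast::hash.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Serializes one frame: length, CRC, timestamp, payload.
pub fn encode_frame(timestamp: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(FRAME_HEADER + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(&timestamp.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// The only check a frame-level reader can make: does the payload match the CRC?
pub fn frame_is_valid(frame: &[u8]) -> bool {
    if frame.len() < FRAME_HEADER {
        return false;
    }
    let len = u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
    let stored = u32::from_le_bytes([frame[4], frame[5], frame[6], frame[7]]);
    match frame.get(FRAME_HEADER..FRAME_HEADER + len) {
        Some(payload) => crc32(payload) == stored,
        None => false,
    }
}

/// What someone with write access to the file could do: keep the timestamp,
/// swap the payload, and recompute length and CRC to match.
/// Assumes `original` is a complete frame.
pub fn forge_frame(original: &[u8], new_payload: &[u8]) -> Vec<u8> {
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&original[8..16]);
    encode_frame(u64::from_le_bytes(ts), new_payload)
}
```

CRC32 involves no secret, so anyone who can write the payload can also write a matching checksum. A forged frame is indistinguishable from an honest one at this layer, which is why tamper evidence has to come from the spine.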

CRC and Merkle are tools for different threats.

	CRC32 (segment frame)	Merkle tree (spine)
What it detects	Accidental bit corruption, torn writes	Intentional payload modification
Assumed adversary	None (physical faults only)	Insider with disk access
Cost	Very fast (hardware-accelerated CRC32)	Hash tree computation
Location	Each segment frame	Separate spine file

MAX_RECORD: the safety net that protects recovery

MAX_RECORD = 64 * 1024 * 1024 (64 MiB) is the threshold: "if a record claims to be larger than this, treat it as corrupt."

When a segment is opened right after a crash, the last record may be cut off mid-write. Perhaps only 2 of the 4 length bytes were written, leaving a completely wrong len field. Trusting that value and allocating memory for it can cause an OOM. The MAX_RECORD guard prevents this: any len above the limit is treated as a "torn tail," and recovery stops at the last valid record. Ch. 12 (crash recovery) covers this in detail.
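
A worked example makes the torn-length case concrete. Suppose the writer intended a length of 10,000 (little-endian bytes 10 27 00 00) but only the first two bytes reached disk, and the other two positions still hold stale 0xFF bytes. The stale value is purely an assumption for illustration, and torn_len and passes_guard are hypothetical helpers.

```rust
pub const MAX_RECORD: u32 = 64 * 1024 * 1024; // 67_108_864

/// Length a reader would decode if only the first `written` bytes of the
/// 4-byte little-endian length field were persisted and the remaining
/// positions kept whatever `stale` bytes were there before.
pub fn torn_len(intended: u32, written: usize, stale: [u8; 4]) -> u32 {
    let fresh = intended.to_le_bytes();
    let n = written.min(4);
    let mut on_disk = stale;
    on_disk[..n].copy_from_slice(&fresh[..n]);
    u32::from_le_bytes(on_disk)
}

/// The size half of the guard: anything above MAX_RECORD is rejected.
pub fn passes_guard(len: u32) -> bool {
    len <= MAX_RECORD
}
```

With these inputs, torn_len(10_000, 2, [0xFF; 4]) decodes to 0xFFFF2710 = 4,294,911,760, roughly 4 GiB, which passes_guard rejects. The size limit cannot catch every torn length (stale zero bytes would leave a small, plausible value), which is why the skim code also rejects any length that extends past the end of the file.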

Write order matters

Let's look at the write order in the append function again.

crates/quipu-core/src/storage/segment.rs (Segment::append)
self.writer.write_all(&(payload.len() as u32).to_le_bytes())?; // 1. length
self.writer.write_all(&crc.to_le_bytes())?;                    // 2. CRC
self.writer.write_all(&timestamp.to_le_bytes())?;              // 3. timestamp
self.writer.write_all(payload)?;                                // 4. payload

Because length comes first, a reader scanning forward always knows "where does this record end." Because CRC comes before the payload, validation can happen immediately after the payload is read in. And because the timestamp is in the header, time information is available without parsing the payload at all.

Check yourself

① Why is CRC32 a "corruption detector" rather than a "tamper detector"? What would an attacker need to do to defeat it?
② What goes wrong during a torn write if there's no MAX_RECORD guard?
③ Explain the design decision to put the timestamp in the frame header rather than in the payload, connecting it to the retention logic in Ch. 17.