The Quipu-Log Book
Part 6 · Confidentiality: searching while keeping secrets (Security II)

24 · Field protection in four levels: plaintext/SHA-256/HMAC/RSA

Databases do support "column encryption." But there's a reason it rarely gets used in practice — the moment you encrypt a column, you can no longer search it with a WHERE clause. Quipu-Log sidesteps this dilemma with a different approach: pick the right protection level for each field, based on how it will be used. This chapter compares the four protection modes and lays out which one to choose for which kind of field.

In one sentence

Field protection is not a global switch. Each field picks one of None / Sha256 / Hmac / Rsa, and accepts the trade-off between searchability and key dependency that comes with it.

What databases do — the limits of TDE and column encryption

Relational databases have two main ways to protect sensitive data.

  • TDE (Transparent Data Encryption): encrypts the database's data files at the storage layer. The DB engine decrypts transparently on reads and writes, so queries keep working as usual. The problem: the data is fully visible to any insider who can go through the DB process (a DBA, a log collector, anyone with backup-read access), so TDE offers no protection against insider threats.
  • Column encryption: encrypts specific column values before storing them. An insider who opens the file sees only ciphertext. The trade-off: an encrypted column can't be searched with WHERE col = 'Alice'. Comparing values requires decryption first, which means the DB engine has to hold the key, and the moment it does, insider protection collapses.

DB ↔ Filesystem

DB column encryption can't escape the dilemma of "encrypt it and lose search." Quipu-Log breaks through in two ways: one-way hashing lets you run exact-match queries without ever storing plaintext, and the blind index of Ch. 26 adds partial-match search to encrypted fields. If you need both encryption and searchability, Ch. 25 and Ch. 26 explain how.

The four levels: what gets stored, what you get

Quipu-Log's FieldProtection has four levels. Let's start with the full picture in a comparison table.

| Protection level | What's stored on disk | Searchability | Key required | Primary use |
|---|---|---|---|---|
| None | plaintext as-is | exact, contains, prefix (all of them) | none | non-sensitive general fields |
| Sha256 | SHA-256 digest (64-char hex) | exact match only | none | medium-entropy values (email, medical record number, etc.) |
| Hmac | HMAC-SHA-256 digest (64-char hex) + key_version | exact match only | HMAC key | low-entropy values (SSN, national ID, phone number) |
| Rsa | ciphertext (AES-256-GCM) + wrapped key + nonce | none by default (blind index can be added) | RSA public key (write) + private key (read) | values that need to be decrypted (full name, address, etc.) |
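
The "Primary use" column can be read as a small decision procedure. The sketch below is one way to write it down, not part of Quipu-Log: FieldTraits and recommend_protection are hypothetical names, and the rule that a need for decryption takes priority over entropy is an assumption (a hashed value can never be read back).

```rust
/// The chapter's four protection levels (same names as the table).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldProtection {
    None,
    Sha256,
    Hmac,
    Rsa,
}

/// Hypothetical description of how a field will be used (not a Quipu-Log type).
#[derive(Debug, Clone, Copy)]
pub struct FieldTraits {
    /// Does the field hold sensitive data at all?
    pub sensitive: bool,
    /// Must the original value be readable again later?
    pub must_decrypt: bool,
    /// Is the value space small enough to enumerate (SSN, phone number)?
    pub low_entropy: bool,
}

/// Hypothetical helper that encodes the table's "Primary use" column.
pub fn recommend_protection(t: FieldTraits) -> FieldProtection {
    if !t.sensitive {
        FieldProtection::None
    } else if t.must_decrypt {
        // Assumption: decryptability wins, because digests are one-way.
        FieldProtection::Rsa
    } else if t.low_entropy {
        FieldProtection::Hmac
    } else {
        FieldProtection::Sha256
    }
}
```

Under these assumptions, an SSN-like field (sensitive, low entropy, never decrypted) maps to Hmac, and a full name that must be displayed later maps to Rsa.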

In code, these four levels are expressed as the FieldProtection enum.

crates/quipu-core/src/schema.rs
pub enum FieldProtection {
    None,
    Sha256,  // keyless — works without a key, vulnerable on low-entropy values
    Hmac,    // keyed — offline brute-force impossible without the HMAC key
    Rsa,     // hybrid AES-256-GCM + RSA-OAEP, decryptable
}
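
To make the trade-offs attached to each variant concrete, here is a minimal sketch that answers "which searches work on this level?" and "does it depend on a key?". SearchMode, supported_search, and needs_key are hypothetical names introduced for illustration. They mirror the comparison table and are not taken from the quipu-core source.

```rust
/// Same four levels as the FieldProtection enum in this chapter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldProtection {
    None,
    Sha256,
    Hmac,
    Rsa,
}

/// Hypothetical search-mode names, taken from the comparison table's wording.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchMode {
    Exact,
    Contains,
    Prefix,
}

/// Search modes that work directly on the stored value, without a blind index.
pub fn supported_search(level: FieldProtection) -> &'static [SearchMode] {
    match level {
        FieldProtection::None => &[SearchMode::Exact, SearchMode::Contains, SearchMode::Prefix],
        // Digests only compare equal when the inputs are equal.
        FieldProtection::Sha256 | FieldProtection::Hmac => &[SearchMode::Exact],
        // Ciphertext is not searchable by default.
        FieldProtection::Rsa => &[],
    }
}

/// Whether the level depends on key material (HMAC key or RSA key pair).
pub fn needs_key(level: FieldProtection) -> bool {
    matches!(level, FieldProtection::Hmac | FieldProtection::Rsa)
}
```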

The protected result is persisted on disk as a StoredValue.

crates/quipu-core/src/model.rs
pub enum StoredValue {
    Plain(Value),                              // None → plaintext
    Sha256(String),                            // hex digest
    Hmac { key_version: u32, digest: String }, // keyed digest + version
    Rsa {
        key_version: u32,    // version of the public key used for encryption
        wrapped_key: String, // AES key wrapped with RSA (b64)
        nonce: String,       // AES-GCM nonce
        ciphertext: String,  // AES-256-GCM ciphertext
    },
}

Walking through each level

  • None: plaintext value "Alice Smith" → Plain("Alice Smith"). Search: all modes. No key.
  • Sha256: plaintext value "010-1234-5678" → SHA-256 → "3a9f2b..." (hex). Search: exact only. No key. One-way, no decryption.
  • Hmac: plaintext value "123-45-6789" (SSN) → HMAC + key → keyed digest. Search: exact only. HMAC key. One-way, no decryption.
  • Rsa: plaintext value "Jane Doe" (name) → AES-GCM ciphertext + RSA-wrapped key. Search: none by default. RSA public key. Decryptable with the private key.

Figure: FieldProtection's four levels: how the plaintext is transformed before storage, and what is and isn't possible afterward.

None — plaintext

Stored as-is, with no transformation. Use it for fields that aren't sensitive. All search modes work, and no key configuration is needed.

Sha256 — keyless one-way hash

Only the SHA-256 digest of the plaintext is stored. At search time, the query term is hashed with SHA-256 the same way, so exact-match search works. The original plaintext is not recoverable from the digest by inversion. The catch: with no key involved, an attacker who can read the disk can mount an offline dictionary attack by hashing every candidate value and comparing. That is dangerous for values with a small, enumerable search space, such as phone numbers or national ID numbers.

Hmac — keyed one-way hash

An HMAC-SHA-256 digest is computed, and the key version used to produce it is stored alongside. At search time, the server hashes the query term with the same HMAC key and compares, so exact matching works. An attacker who has the disk but not the key cannot run an offline dictionary attack, because candidate digests cannot be computed without the key. This is exactly why low-entropy values like SSNs and phone numbers must use Hmac instead of Sha256.
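
The exact-match lookup described for Sha256 and Hmac comes down to re-deriving the digest from the query term and comparing it with the stored one. The sketch below shows that shape under stated assumptions: Digests and matches_exact are hypothetical names, the StoredValue here is trimmed to the two digest variants, and the digest functions are injected so that real SHA-256 and HMAC-SHA-256 implementations from a vetted crypto library can be plugged in. How Quipu-Log actually resolves key_version to a key is not shown in this chapter, so the map used here is an assumption.

```rust
use std::collections::HashMap;

/// Trimmed copy of the chapter's StoredValue: only the two digest variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StoredValue {
    Sha256(String),
    Hmac { key_version: u32, digest: String },
}

/// Hypothetical bundle of digest functions and HMAC keys.
/// Real code would supply SHA-256 and HMAC-SHA-256 here.
pub struct Digests<'a> {
    /// plaintext -> hex digest (keyless)
    pub sha256_hex: &'a dyn Fn(&str) -> String,
    /// (key, plaintext) -> hex digest (keyed)
    pub hmac_hex: &'a dyn Fn(&[u8], &str) -> String,
    /// Assumed lookup from key_version to key bytes.
    pub hmac_keys: &'a HashMap<u32, Vec<u8>>,
}

/// Exact match: apply the same one-way transform to the query and compare digests.
/// Returns None when the HMAC key for the record's key_version is unavailable,
/// because without that key the comparison cannot be made at all.
pub fn matches_exact(stored: &StoredValue, query: &str, d: &Digests<'_>) -> Option<bool> {
    match stored {
        StoredValue::Sha256(digest) => Some((d.sha256_hex)(query) == *digest),
        StoredValue::Hmac { key_version, digest } => {
            let key = d.hmac_keys.get(key_version)?;
            Some((d.hmac_hex)(key.as_slice(), query) == *digest)
        }
    }
}
```

The None result for a missing key is the searchability cost of Hmac in miniature: the same key dependency that blocks an attacker also blocks any server that has not been given the key.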

Rsa — decryptable hybrid encryption

The value is encrypted with AES-256-GCM, and the AES key is wrapped with the RSA public key (RSA-OAEP) before storage. Only the holder of the private key can decrypt. Search is not available by default, but adding the blind index from Ch. 26 enables prefix and n-gram search. The reasoning behind this hybrid structure is explained in detail in Ch. 25.

Security note

SHA-256 without a key means anyone in the world can compute the same hash. The SHA-256 of a phone number like 01012345678 is identical everywhere, so an attacker can precompute hashes for every plausible value and build a lookup table (a rainbow table is a space-saving variant of the same idea) to recover the original from the digest alone. HMAC mixes in a server-side secret key, so a table built without that key matches nothing.
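
The precomputation attack can be sketched in a few lines. To stay within the standard library, the block below uses FNV-1a as a stand-in for SHA-256 and a key-prefixed FNV-1a as a stand-in for HMAC. Neither is cryptographically secure, and the second is not a real MAC. They only model "keyless and deterministic" versus "depends on a secret". All function names and the 10,000-number candidate space are hypothetical.

```rust
use std::collections::HashMap;

/// Stand-in keyless digest (FNV-1a, 64-bit). This is NOT SHA-256; the attack
/// has the same structure for any keyless, deterministic hash.
pub fn toy_digest(input: &str) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in input.bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", h)
}

/// Stand-in keyed digest. This is NOT a real MAC; it only models a digest
/// that cannot be computed without a secret.
pub fn toy_keyed_digest(key: &str, input: &str) -> String {
    toy_digest(&format!("{}:{}", key, input))
}

/// Precompute digest -> plaintext for every candidate in an enumerable space.
pub fn build_lookup<I, F>(candidates: I, digest: F) -> HashMap<String, String>
where
    I: IntoIterator<Item = String>,
    F: Fn(&str) -> String,
{
    candidates
        .into_iter()
        .map(|c| (digest(c.as_str()), c))
        .collect()
}

/// Hypothetical candidate space: 010-1234-0000 through 010-1234-9999.
pub fn candidate_numbers() -> impl Iterator<Item = String> {
    (0..10_000).map(|n| format!("010-1234-{:04}", n))
}
```

With a table built by build_lookup(candidate_numbers(), toy_digest), looking up the leaked keyless digest of "010-1234-5678" returns the number itself. Looking up a digest produced by toy_keyed_digest with a secret the attacker does not hold finds no entry, which is the property Hmac relies on.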

Protection only applies to registry fields

This is the point that causes the most confusion, so keep it in mind.

Caution

FieldProtection only applies to registry fields (entity attributes). The method, url, content, and custom columns of an audit log row are always plaintext. If you put sensitive information (PII, passwords, SSNs) into content or the URL, it will not be protected. Design sensitive values as registry fields with a protection level, and keep entity_id as an opaque identifier (a UUID or similar).

For example, if you're recording patient information, design the schema like this.

TypeSchema design example
store.define_type(TypeSchema::new("patient", vec![
    FieldDef::text("ssn").protection(FieldProtection::Hmac).indexed(),
    FieldDef::text("name").protection(FieldProtection::Rsa),
    FieldDef::text("mrn").protection(FieldProtection::Sha256).indexed(),
]))?;
// entity_id should be an opaque id (e.g. "patient-uuid-001") — never use the SSN directly
// do not put PII into log columns like content or url

Recap

  • Protection level is a per-field choice — not a global switch; pick the right level for each field's purpose.
  • None is plaintext, Sha256 is keyless one-way, Hmac is keyed one-way, Rsa is decryptable. Searchability: the first three support exact match (Rsa has none by default).
  • Use Hmac for low-entropy values — Sha256 is vulnerable to offline dictionary attacks.
  • Protection applies only to registry fields — log columns (method/url/content) are always plaintext.
  • If you need both encryption and search → Ch. 25 hybrid encryption and Ch. 26 blind index.

Check yourself

① Why does column encryption in a DB create a "can't search it" problem, and why does TDE, which keeps queries working, still fail against insiders? How do Quipu-Log's Sha256 and Hmac modes get around the search problem?
② Explain in one sentence why protecting an SSN with Sha256 is risky and why Hmac is the right choice instead.
③ Why is putting a patient's name into the log's url field a problem? What's the correct design?