
How Brume works

Brume is a CLI that copies a subset of a source PostgreSQL database to a target (another Postgres instance or a .sql file), transforming personal data on the fly while preserving referential integrity.

┌──────────────┐    ┌─────────────────┐    ┌──────────────┐
│    SOURCE    │───▶│    TRANSFORM    │───▶│    TARGET    │
│   Postgres   │    │   per column    │    │   Postgres   │
│ (read-only)  │    │ (deterministic) │    │   or .sql    │
└──────────────┘    └─────────────────┘    └──────────────┘
       │                     │                    │
  FK traversal     hmac-secret + fpe-key     FK preserved
  row filters       per-column strategy    triggers stay on

Three phases:

  1. Extract — walk the schema starting from the tables in extraction.tables, follow foreign keys up to fk_depth levels (parents and children), apply any filter clauses.
  2. Transform — for each row, apply the per-column strategy declared in anonymization.tables[].columns. Undeclared columns are copied as-is (KEEP).
  3. Write — push to the target database or to a portable .sql file.
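The Extract phase can be pictured as a bounded walk over the foreign-key graph. The sketch below is only an illustration under assumptions: the function name, the `(child, parent)` edge list, and the breadth-first counting of levels are hypothetical and are not taken from Brume's implementation.

```python
from collections import deque

def tables_to_extract(roots, foreign_keys, fk_depth):
    """Return {table: depth} for every table within fk_depth FK hops of a root.

    foreign_keys is a list of (child_table, parent_table) pairs. Edges are
    followed in both directions, towards parents and towards children.
    """
    neighbours = {}
    for child, parent in foreign_keys:
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)

    depth = {table: 0 for table in roots}
    queue = deque(roots)
    while queue:
        table = queue.popleft()
        if depth[table] == fk_depth:
            continue  # do not walk past the configured depth
        for other in sorted(neighbours.get(table, ())):
            if other not in depth:
                depth[other] = depth[table] + 1
                queue.append(other)
    return depth

# Hypothetical schema: orders -> users, order_items -> orders, users -> countries
fks = [("orders", "users"), ("order_items", "orders"), ("users", "countries")]
selected = tables_to_extract(["orders"], fks, fk_depth=1)
print(selected)  # {'orders': 0, 'order_items': 1, 'users': 1}
```

With `fk_depth=1`, `countries` is left out because it is two hops away from `orders`; raising the depth to 2 would pull it in.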

Two concepts you’ll see everywhere:

| Concept | What it controls | Where it lives |
| --- | --- | --- |
| Strategy | How the value is transformed (FAKE, MASK, HASH, NULLIFY, FPE_ID, FPE_UUID, KEEP) | Required on every declared column |
| Type | What kind of value to produce (EMAIL, PHONE, IBAN, …) | Required only for FAKE and MASK |

A column is configured as a (strategy, type?) pair. See strategies and semantic types for the full picture.
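To make the (strategy, type?) rule concrete, here is a small validation sketch. `ColumnRule` is a hypothetical name used for illustration only, not part of Brume; the strategy list and the "type required for FAKE and MASK" rule come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

STRATEGIES = {"FAKE", "MASK", "HASH", "NULLIFY", "FPE_ID", "FPE_UUID", "KEEP"}
NEEDS_TYPE = {"FAKE", "MASK"}  # strategies that must also declare a type

@dataclass(frozen=True)
class ColumnRule:
    """Hypothetical in-memory form of one declared column."""
    strategy: str
    type: Optional[str] = None

    def __post_init__(self):
        if self.strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy}")
        if self.strategy in NEEDS_TYPE and self.type is None:
            raise ValueError(f"{self.strategy} requires a type")

email_rule = ColumnRule("FAKE", "EMAIL")  # valid: FAKE with a type
id_rule = ColumnRule("FPE_ID")            # valid: no type needed

try:
    ColumnRule("MASK")                    # invalid: MASK without a type
except ValueError as err:
    print(err)  # MASK requires a type
```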

Every transformation is keyed by your secrets (hmac-secret and fpe-key) and deterministic:

Same input + same secret = same output, every time.

This is what makes Brume usable in real workflows:

  • Stable tests — your test fixtures don’t change between runs.
  • Joinable datasets — a HASH-ed email in two tables hashes to the same value, so joins still work.
  • Reproducible debug sessions — a bug you reproduced yesterday on users.id = FPE_ID(42) = 7831 is still at 7831 tomorrow.

Determinism is what distinguishes pseudonymization from random-noise anonymization.
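The "same input + same secret = same output" property can be demonstrated with a keyed hash. This is a minimal sketch that assumes HMAC-SHA256 from Python's standard library; the source does not state which construction or output format Brume's HASH strategy uses, and `keyed_hash` is a hypothetical helper.

```python
import hashlib
import hmac

def keyed_hash(value: str, secret: bytes) -> str:
    """Deterministic keyed digest of value (HMAC-SHA256 is an assumption)."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()

secret = b"example-secret-not-for-production"

# The same email appearing in two different tables
users_email = keyed_hash("alice@example.com", secret)
logs_email = keyed_hash("alice@example.com", secret)

# The same email under a different secret
other_env = keyed_hash("alice@example.com", b"a-different-secret")

print(users_email == logs_email)  # True: joins on the hashed column still work
print(users_email == other_env)   # False: the output depends on the secret
```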

When you declare id of users as FPE_ID, Brume detects every foreign key that points to users.id (e.g. orders.user_id) and rewrites it with the same encryption. The result: FK constraints on the target hold without disabling triggers.
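Why the constraints keep holding can be seen with a toy keyed permutation: if the primary key and every column referencing it go through the same one-to-one mapping, matching values stay matching. The Feistel construction below is a stand-in chosen for brevity. It is not Brume's algorithm and not a standardized FPE mode, and `toy_encrypt_id` and `toy_decrypt_id` are hypothetical names.

```python
import hashlib
import hmac

def _round(key: bytes, round_no: int, half: int) -> int:
    """Round function: 16 bits derived from HMAC-SHA256(key, round || half)."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")

def toy_encrypt_id(value: int, key: bytes, rounds: int = 4) -> int:
    """Keyed one-to-one mapping of 0 .. 2**32 - 1 onto itself (toy Feistel)."""
    if not 0 <= value < 2**32:
        raise ValueError("value must fit in 32 bits")
    left, right = value >> 16, value & 0xFFFF
    for r in range(rounds):
        left, right = right, left ^ _round(key, r, right)
    return (left << 16) | right

def toy_decrypt_id(value: int, key: bytes, rounds: int = 4) -> int:
    """Inverse of toy_encrypt_id: undo the rounds in reverse order."""
    left, right = value >> 16, value & 0xFFFF
    for r in reversed(range(rounds)):
        left, right = right ^ _round(key, r, left), left
    return (left << 16) | right

key = b"example-fpe-key-not-for-production"
users = [{"id": 42}, {"id": 43}]
orders = [{"id": 1, "user_id": 42}, {"id": 2, "user_id": 43}]

# Rewrite the primary key and the referencing column with the same mapping
new_users = [{"id": toy_encrypt_id(u["id"], key)} for u in users]
new_orders = [{**o, "user_id": toy_encrypt_id(o["user_id"], key)} for o in orders]

user_ids = {u["id"] for u in new_users}
print(all(o["user_id"] in user_ids for o in new_orders))  # True
```

Because the mapping is a permutation, distinct ids never collide, so primary-key uniqueness is preserved along with the references.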

For values that should match across tables that aren’t joined by a formal FK (e.g. an email shared between users.email and audit_logs.user_email), use linked_columns.
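The idea behind matching values without a formal FK is that both columns derive their replacement from the same real value and the same secret. The sketch below shows that principle only. It does not show how linked_columns is configured, and `fake_email`, the name lists, and the `example.test` domain are all hypothetical.

```python
import hashlib
import hmac

FIRST = ["alex", "sam", "robin", "jo", "kim", "noa"]
LAST = ["martin", "silva", "novak", "haddad", "berg", "ito"]

def fake_email(real_value: str, secret: bytes) -> str:
    """Derive a stable fake email from the real one, keyed by the secret."""
    digest = hmac.new(secret, real_value.encode("utf-8"), hashlib.sha256).digest()
    first = FIRST[digest[0] % len(FIRST)]
    last = LAST[digest[1] % len(LAST)]
    suffix = int.from_bytes(digest[2:5], "big")  # lowers the chance of collisions
    return f"{first}.{last}.{suffix}@example.test"

secret = b"example-secret-not-for-production"
users_row = {"email": "alice@example.com"}
audit_row = {"user_email": "alice@example.com"}

fake_users = fake_email(users_row["email"], secret)
fake_audit = fake_email(audit_row["user_email"], secret)
print(fake_users == fake_audit)  # True: both columns get the same fake value
```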

GDPR Article 4(5) defines pseudonymization as processing such that the data can no longer be attributed to a specific person without the use of additional information kept separately. In Brume, that “additional information” is exactly:

  • BRUME_HMAC_SECRET — seeds FAKE, HASH, and linked_columns.
  • BRUME_FPE_KEY — keys the format-preserving encryption used by FPE_ID and FPE_UUID.

Protect these at the same level as the source data. Leaking them is not “just a config leak”: it invalidates the pseudonymization itself.

Brume can write to two kinds of target:

  • Another PostgreSQL database — fastest path, ideal for dev / staging / debug environments.
  • A .sql file — portable, shippable, importable with psql. Useful for CI artifacts or DPO review.

The choice is configured in .env — see the .env reference.