How Brume works
Brume is a CLI that copies a subset of a source PostgreSQL database to a target (another Postgres instance or a .sql file), transforming personal data on the fly while preserving referential integrity.
The pipeline

    ┌──────────────┐    ┌─────────────────┐    ┌──────────────┐
    │    SOURCE    │───▶│    TRANSFORM    │───▶│    TARGET    │
    │   Postgres   │    │   per column    │    │   Postgres   │
    │ (read-only)  │    │ (deterministic) │    │   or .sql    │
    └──────────────┘    └─────────────────┘    └──────────────┘
           │                     │                    │
      FK traversal     hmac-secret + fpe-key    FK preserved
      row filters       per-column strategy    no triggers off

Three phases:
- Extract: walk the schema starting from the tables in extraction.tables, follow foreign keys up to fk_depth levels (parents and children), and apply any filter clauses.
- Transform: for each row, apply the per-column strategy declared in anonymization.tables[].columns. Undeclared columns are copied as-is (KEEP).
- Write: push to the target database or to a portable .sql file.
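
The Extract phase can be pictured as a bounded breadth-first walk over the foreign-key graph. The sketch below is a hypothetical illustration, not Brume's implementation: the `FOREIGN_KEYS` list, the table names, and the `tables_to_extract` function are invented for the example, and it assumes parent and child links are treated alike at every hop. Only the ideas of seed tables and an fk_depth limit come from the description above.

```python
from collections import deque

# Hypothetical schema: (child_table, parent_table) pairs, one per foreign key.
FOREIGN_KEYS = [
    ("orders", "users"),
    ("order_items", "orders"),
    ("order_items", "products"),
    ("products", "suppliers"),
]


def tables_to_extract(seed_tables, fk_depth):
    """Return {table: depth} for every table within fk_depth FK hops of a seed."""
    # Undirected adjacency map, so the walk follows foreign keys both
    # towards parents and towards children.
    neighbours = {}
    for child, parent in FOREIGN_KEYS:
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)

    depths = {table: 0 for table in seed_tables}
    queue = deque(seed_tables)
    while queue:
        table = queue.popleft()
        if depths[table] == fk_depth:
            continue  # depth limit reached: do not expand this table further
        for other in sorted(neighbours.get(table, ())):
            if other not in depths:
                depths[other] = depths[table] + 1
                queue.append(other)
    return depths


print(tables_to_extract(["orders"], fk_depth=1))
# {'orders': 0, 'order_items': 1, 'users': 1}
```

With a depth of 1 the walk stops at the direct parent (`users`) and the direct child (`order_items`); raising the depth to 2 would also pull in `products`, and 3 would reach `suppliers`.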
Strategy vs Type
Two concepts you’ll see everywhere:
| | What it controls | Where it lives |
|---|---|---|
| Strategy | How the value is transformed (FAKE, MASK, HASH, NULLIFY, FPE_ID, FPE_UUID, KEEP) | Required on every declared column |
| Type | What kind of value to produce (EMAIL, PHONE, IBAN, …) | Required only for FAKE and MASK |
A column is configured as a (strategy, type?) pair. See strategies and semantic types for the full picture.
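
The pairing rule can be made concrete with a small validation sketch. The `ColumnRule` class and its field names are hypothetical and are not part of Brume; the strategy names and the rule that only FAKE and MASK need a type are taken from the table above.

```python
from dataclasses import dataclass
from typing import Optional

STRATEGIES = {"FAKE", "MASK", "HASH", "NULLIFY", "FPE_ID", "FPE_UUID", "KEEP"}
STRATEGIES_REQUIRING_TYPE = {"FAKE", "MASK"}


@dataclass(frozen=True)
class ColumnRule:
    """Hypothetical in-memory form of one declared column."""

    name: str
    strategy: str
    semantic_type: Optional[str] = None

    def __post_init__(self):
        # Reject strategies that the table above does not list.
        if self.strategy not in STRATEGIES:
            raise ValueError(f"{self.name}: unknown strategy {self.strategy!r}")
        # FAKE and MASK need to know what kind of value to produce.
        if self.strategy in STRATEGIES_REQUIRING_TYPE and self.semantic_type is None:
            raise ValueError(f"{self.name}: {self.strategy} requires a type")


email = ColumnRule("email", "FAKE", "EMAIL")  # strategy plus type
user_id = ColumnRule("id", "FPE_ID")          # strategy alone is enough
```

Declaring `ColumnRule("phone", "MASK")` without a type raises `ValueError` in this sketch, mirroring the "required only for FAKE and MASK" rule.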
Determinism — the core property
Every transformation is keyed by your secrets (hmac-secret and fpe-key) and deterministic:
Same input + same secret = same output, every time.
This is what makes Brume usable in real workflows:
- Stable tests: your test fixtures don’t change between runs.
- Joinable datasets: a HASH-ed email in two tables hashes to the same value, so joins still work.
- Reproducible debug sessions: a bug you reproduced yesterday on users.id = FPE_ID(42) = 7831 is still at 7831 tomorrow.
Determinism is what distinguishes pseudonymization from random-noise anonymization.
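
The "same input + same secret = same output" property is what a keyed hash provides. The following is a minimal sketch that assumes HMAC-SHA256 as the keyed function; this page does not specify the construction Brume actually uses for HASH, and the `pseudonymize` helper and the example secret are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret: bytes) -> str:
    """Keyed, deterministic digest of value (HMAC-SHA256, hex-encoded)."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


secret = b"example-secret-not-for-production"

# The same email seen in two tables maps to the same token, so joins still work.
users_email = pseudonymize("alice@example.com", secret)
audit_email = pseudonymize("alice@example.com", secret)
assert users_email == audit_email

# A different secret gives an unrelated token for the same input.
assert pseudonymize("alice@example.com", b"another-secret") != users_email
```

Because the output depends only on the value and the secret, rerunning the copy tomorrow with the same secret reproduces the same tokens.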
Foreign keys are preserved automatically
When you declare id of users as FPE_ID, Brume detects every foreign key that points to users.id (e.g. orders.user_id) and rewrites it with the same encryption. The result: FK constraints on the target hold without disabling triggers.
For values that should match across tables that aren’t joined by a formal FK (e.g. an email shared between users.email and audit_logs.user_email), use linked_columns.
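
Why rewriting both sides with the same keyed function keeps constraints valid can be shown with a toy example. The sketch below uses a small HMAC-based Feistel permutation over 32-bit integers as a stand-in; it is not a standardized format-preserving encryption scheme and not Brume's algorithm. The key, the function names, and the sample rows are all hypothetical.

```python
import hashlib
import hmac

KEY = b"example-fpe-key-not-for-production"
ROUNDS = 4
HALF_BITS = 16
HALF_MASK = (1 << HALF_BITS) - 1


def _round(half: int, round_no: int, key: bytes) -> int:
    """Keyed round function: 16 pseudo-random bits derived from one half."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def encrypt_id(value: int, key: bytes = KEY) -> int:
    """Map an id in [0, 2**32) to another id in the same range (a bijection)."""
    left, right = value >> HALF_BITS, value & HALF_MASK
    for i in range(ROUNDS):
        left, right = right, left ^ _round(right, i, key)
    return (left << HALF_BITS) | right


def decrypt_id(value: int, key: bytes = KEY) -> int:
    """Inverse of encrypt_id: undo the Feistel rounds in reverse order."""
    left, right = value >> HALF_BITS, value & HALF_MASK
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(left, i, key), left
    return (left << HALF_BITS) | right


users = [{"id": 42}, {"id": 43}]
orders = [{"id": 1, "user_id": 42}, {"id": 2, "user_id": 43}, {"id": 3, "user_id": 42}]

# Rewrite the primary key and every FK pointing at it with the same function.
target_users = [{**u, "id": encrypt_id(u["id"])} for u in users]
target_orders = [{**o, "user_id": encrypt_id(o["user_id"])} for o in orders]

# Every rewritten FK still points at an existing rewritten user.
target_user_ids = {u["id"] for u in target_users}
assert all(o["user_id"] in target_user_ids for o in target_orders)
```

Because the mapping is a bijection, two distinct ids never collide, and because it is keyed and deterministic, `users.id` and `orders.user_id` land on the same value without any lookup table.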
Secrets — the additional information
GDPR Article 4(5) defines pseudonymization as processing such that the data can no longer be attributed to a specific person without the use of additional information kept separately. In Brume, that “additional information” is exactly:
- BRUME_HMAC_SECRET: seeds FAKE, HASH, and linked_columns.
- BRUME_FPE_KEY: keys the format-preserving encryption used by FPE_ID and FPE_UUID.
Protect these at the same level as the source data. Their leak is not “just a config leak” — it invalidates the pseudonymization itself.
Outputs
Brume can write to two kinds of target:
- Another PostgreSQL database: fastest path, ideal for dev/staging/debug environments.
- A .sql file: portable, shippable, importable with psql. Useful for CI artifacts or DPO review.
The choice is configured in .env — see the .env reference.
- Pseudonymization strategies: what each strategy produces and when to use it.
- Semantic types: the values FAKE and MASK can generate.
- brume.yml reference: every key, every option.