brume.yml reference

brume.yml is the single declarative file that describes what to extract and how to transform it. It lives next to your .env and is loaded automatically by every Brume subcommand.

extraction:                 # what to copy from the source
  fk_depth: 3
  tables:
    - table: ...
      filter: ...
anonymization:              # how to transform what's copied
  linked_columns:           # optional, for cross-table consistency
    - semantic_key: ...
      columns: [ ... ]
  tables:
    - table: ...
      columns:
        - name: ...
          strategy: ...
          type: ...         # only for FAKE / MASK
          json_paths: ...   # only for JSONB

The extraction section controls what data is read from the source.

fk_depth is the maximum number of foreign-key levels Brume traverses automatically, in both directions: parents (the row a FK points to) and children (rows that point to the current row).

extraction:
  fk_depth: 3

Higher values produce a more complete subset but extract more rows. Typical values: 2–4.
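As a rough illustration of how the depth limit bounds the subset, the sketch below uses a hypothetical four-table chain. The table names other than order_items and orders, and the assumption that each FK hop counts as exactly one level, are illustrative and not confirmed by this reference.

```yaml
# Hypothetical schema: order_items -> orders -> users -> countries
# (each arrow is a foreign key pointing at the parent table).
extraction:
  fk_depth: 2
  tables:
    - table: order_items
# Assuming each FK hop counts as one level:
#   level 1: orders  (parent of order_items)
#   level 2: users   (parent of orders)
#   countries would sit at level 3, so it is not followed with fk_depth: 2
# Child rows (rows pointing at an extracted row) are bounded by the same limit.
```

Under these assumptions, raising fk_depth to 3 would also pull in the referenced countries rows, at the cost of a larger extract.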

tables is the list of root tables to extract. Brume starts from these and walks FKs, up to fk_depth levels.

extraction:
  tables:
    - table: orders
      filter: "created_at >= '2025-01-01'"
    - table: order_items

Each entry accepts:

Key      Required   Description
table    yes        Table name (qualified with schema if not public, e.g. analytics.events)
filter   no         Raw SQL WHERE clause applied to this table only

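The filter is injected as-is into the extraction query, so it must be valid SQL for your Postgres version. Quote literals correctly: wrapping the whole filter in YAML double quotes, as in the example above, leaves single quotes free for SQL string literals.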
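A short sketch of the quoting involved when a filter contains SQL string literals. The column names (event_type, occurred_at, last_name) are hypothetical; only the table and filter keys come from this reference.

```yaml
extraction:
  tables:
    # Schema-qualified table. The YAML value is double-quoted so the
    # SQL string literals inside it can use single quotes.
    - table: analytics.events
      filter: "event_type = 'purchase' AND occurred_at >= '2025-01-01'"
    # A literal apostrophe inside a SQL string is doubled, as usual in Postgres.
    - table: users
      filter: "last_name = 'O''Brien'"
```

Each filter applies only to the table it is declared on, so the two conditions above do not affect each other.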

The anonymization section controls how each column is transformed.

linked_columns declares that several columns across different tables represent the same semantic value and must produce the same fake output, even when they aren't joined by a formal foreign key.

anonymization:
  linked_columns:
    - semantic_key: user_email
      columns:
        - table: users
          column: email
        - table: audit_logs
          column: user_email
        - table: subscriptions
          column: notify_email

Each entry has:

Key                Required   Description
semantic_key       yes        Free-form identifier used in logs (e.g. user_email, client_ssn)
columns[].table    yes        Table name
columns[].column   yes        Column name

All listed columns will receive the same FAKE value for the same source input.

tables is the per-table list of column rules.

anonymization:
  tables:
    - table: users
      columns:
        - name: id
          strategy: FPE_ID
        - name: email
          strategy: FAKE
          type: EMAIL

name: the column name to transform.

strategy: one of FAKE, MASK, HASH, NULLIFY, FPE_ID, FPE_UUID, KEEP. See strategies.

type: required when strategy is FAKE or MASK. One of EMAIL, FIRST_NAME, LAST_NAME, PHONE, IBAN, ADDRESS, IP_ADDRESS, JSONB. See semantic types.

json_paths: required when type is JSONB. List of JSON paths to anonymize inside the document. Other paths are kept as-is.

- name: metadata
  strategy: FAKE
  type: JSONB
  json_paths:
    - path: $.contact.email
      type: EMAIL
    - path: $.contact.phone
      type: PHONE
    - path: $.shipping.address
      type: ADDRESS

Each path entry accepts:

Key    Required   Description
path   yes        JSONPath ($.field.subfield)
type   yes        Semantic type to apply at that path

Brume validates the configuration at startup and refuses to run on:

  • A column with strategy: FAKE or MASK without a type.
  • A column with strategy: NULLIFY declared on a NOT NULL column.
  • A type: JSONB without json_paths.
  • A table referenced in anonymization.tables that isn’t reachable from extraction.tables (you’d be transforming nothing).
  • A linked_columns entry referencing a column also declared with a different strategy.

Run brume dry-run to catch all of these without touching the target.
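To make the rules above concrete, here is a deliberately invalid configuration annotated with the check each entry would trip. The marketing_leads table, the NOT NULL constraint on users.id, and the assumption that users is reachable from orders through a foreign key are all hypothetical.

```yaml
# Hypothetical brume.yml that should be rejected at startup.
extraction:
  tables:
    - table: orders
anonymization:
  tables:
    - table: users                # assumed reachable from orders via a FK
      columns:
        - name: id
          strategy: NULLIFY       # rejected if users.id is NOT NULL
        - name: email
          strategy: FAKE          # rejected: FAKE without a type
        - name: metadata
          strategy: FAKE
          type: JSONB             # rejected: JSONB without json_paths
    - table: marketing_leads      # rejected if not reachable from extraction.tables
      columns:
        - name: phone
          strategy: MASK
          type: PHONE
```

Fixing it would mean adding a type to email, adding json_paths to metadata, choosing a strategy other than NULLIFY for id, and either removing marketing_leads or making it reachable from the extraction roots.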

A complete brume.yml combining the sections above:

extraction:
  fk_depth: 3
  tables:
    - table: orders
      filter: "created_at >= '2025-01-01'"
    - table: order_items
anonymization:
  linked_columns:
    - semantic_key: user_email
      columns:
        - table: users
          column: email
        - table: audit_logs
          column: user_email
  tables:
    - table: users
      columns:
        - name: id
          strategy: FPE_ID
        - name: email
          strategy: FAKE
          type: EMAIL
        - name: phone
          strategy: MASK
          type: PHONE
        - name: notes
          strategy: NULLIFY
        - name: metadata
          strategy: FAKE
          type: JSONB
          json_paths:
            - path: $.shipping.address
              type: ADDRESS
    - table: audit_logs
      columns:
        - name: user_email
          strategy: FAKE
          type: EMAIL
        - name: ip
          strategy: MASK
          type: IP_ADDRESS