brume.yml reference
brume.yml is the single declarative file that describes what to extract and how to transform it. It lives next to your .env and is loaded automatically by every Brume subcommand.
Top-level structure
    extraction:            # what to copy from the source
      fk_depth: 3
      tables:
        - table: ...
          filter: ...

    anonymization:         # how to transform what's copied
      linked_columns:      # optional, for cross-table consistency
        - semantic_key: ...
          columns: [ ... ]
      tables:
        - table: ...
          columns:
            - name: ...
              strategy: ...
              type: ...          # only for FAKE / MASK
              json_paths: ...    # only for JSONB

extraction
Controls what data is read from the source.
extraction.fk_depth
Maximum number of foreign-key levels Brume traverses automatically, in both directions: parents (the row a FK points to) and children (rows that point to the current row).
    extraction:
      fk_depth: 3

Higher values produce a more complete subset but extract more rows. Typical values: 2–4.
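To make the effect of the depth limit concrete, here is a minimal sketch of a depth-bounded walk over a foreign-key graph. It is not Brume's implementation: the `FK_NEIGHBOURS` graph, the table names beyond those used in this page, and the `reachable_tables` helper are all hypothetical, and the sketch works at table level only (Brume follows individual rows).

```python
from collections import deque

# Hypothetical FK graph. Each table lists its neighbours in both directions:
# parents it references and children that reference it.
FK_NEIGHBOURS = {
    "orders": ["users", "order_items"],
    "order_items": ["orders", "products"],
    "users": ["orders", "audit_logs"],
    "products": ["order_items", "suppliers"],
    "audit_logs": ["users"],
    "suppliers": ["products"],
}


def reachable_tables(roots, fk_depth):
    """Return {table: hops} for every table within fk_depth FK hops of a root."""
    depth = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        table = queue.popleft()
        if depth[table] == fk_depth:
            continue  # depth limit reached: do not walk further from here
        for neighbour in FK_NEIGHBOURS.get(table, []):
            if neighbour not in depth:
                depth[neighbour] = depth[table] + 1
                queue.append(neighbour)
    return depth


if __name__ == "__main__":
    for limit in (1, 2, 3):
        print(limit, sorted(reachable_tables(["orders"], limit)))
```

In this toy graph, raising the limit from 1 to 3 pulls in progressively more distant tables, which is the trade-off between completeness and extraction size described above.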
extraction.tables
The list of root tables to extract. Brume starts from these and walks FKs.
    extraction:
      tables:
        - table: orders
          filter: "created_at >= '2025-01-01'"
        - table: order_items

Each entry accepts:

| Key | Required | Description |
|---|---|---|
| `table` | yes | Table name (qualified with schema if not `public`, e.g. `analytics.events`) |
| `filter` | no | Raw SQL WHERE clause applied to this table only |
The filter is injected as-is into the extraction query — it must be valid SQL for your Postgres version. Quote literals correctly.
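Because the filter is passed through unmodified, any string literal it contains has to be quoted by whoever writes the file. If filter values are generated by a script, a small helper along these lines can help. This is only a sketch: `sql_literal` and `build_filter` are hypothetical names, not part of Brume, and the helper only quotes the literal; it does not validate the column name or operator.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filter(column: str, operator: str, value: str) -> str:
    """Assemble a condition suitable for the `filter` key of a table entry."""
    return f"{column} {operator} {sql_literal(value)}"


if __name__ == "__main__":
    print(build_filter("created_at", ">=", "2025-01-01"))
    print(build_filter("last_name", "=", "O'Brien"))
```

The second call shows the case that most often breaks hand-written filters: a value containing a single quote, which must be doubled inside the literal.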
anonymization
Controls how each column is transformed.
anonymization.linked_columns
Declares that several columns across different tables represent the same semantic value and must produce the same fake output, even when they aren’t joined by a formal foreign key.
    anonymization:
      linked_columns:
        - semantic_key: user_email
          columns:
            - table: users
              column: email
            - table: audit_logs
              column: user_email
            - table: subscriptions
              column: notify_email

Each entry has:

| Key | Required | Description |
|---|---|---|
| `semantic_key` | yes | Free-form identifier used in logs (e.g. `user_email`, `client_ssn`) |
| `columns[].table` | yes | Table name |
| `columns[].column` | yes | Column name |
All listed columns will receive the same FAKE value for the same source input.
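One common way to get this kind of cross-table consistency is to derive the fake value deterministically from the source value with a keyed hash. The sketch below illustrates that idea only; this page does not describe how Brume generates its values, and `SECRET` and `fake_email` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical key for illustration; not a Brume setting.
SECRET = b"example-secret"


def fake_email(source_value: str) -> str:
    """Map a source email to a stable fake address: same input, same output."""
    digest = hmac.new(SECRET, source_value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:12]}@example.com"


if __name__ == "__main__":
    users_row = {"email": "alice@corp.test"}
    audit_row = {"user_email": "alice@corp.test"}
    # Both columns hold the same source value, so both get the same fake value
    # and the rows can still be correlated after anonymization.
    print(fake_email(users_row["email"]) == fake_email(audit_row["user_email"]))
```

Because the output depends only on the input and the key, no lookup table has to be shared between tables for the values to line up.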
anonymization.tables[]
Per-table list of column rules.

    anonymization:
      tables:
        - table: users
          columns:
            - name: id
              strategy: FPE_ID
            - name: email
              strategy: FAKE
              type: EMAIL

columns[].name
The column name to transform.
columns[].strategy
One of: `FAKE`, `MASK`, `HASH`, `NULLIFY`, `FPE_ID`, `FPE_UUID`, `KEEP`. See strategies.
columns[].type
Required when `strategy` is `FAKE` or `MASK`. One of: `EMAIL`, `FIRST_NAME`, `LAST_NAME`, `PHONE`, `IBAN`, `ADDRESS`, `IP_ADDRESS`, `JSONB`. See semantic types.
columns[].json_paths
Required when `type` is `JSONB`. List of JSON paths to anonymize inside the document. Other paths are kept as-is.

    - name: metadata
      strategy: FAKE
      type: JSONB
      json_paths:
        - path: $.contact.email
          type: EMAIL
        - path: $.contact.phone
          type: PHONE
        - path: $.shipping.address
          type: ADDRESS

Each path entry accepts:

| Key | Required | Description |
|---|---|---|
| `path` | yes | JSONPath (`$.field.subfield`) |
| `type` | yes | Semantic type to apply at that path |
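The "listed paths are transformed, everything else is kept" behaviour can be pictured with a short sketch. It is an illustration under assumptions, not Brume's code: `anonymize_paths` is a hypothetical helper, it handles only simple dotted paths of the `$.field.subfield` form, and the transforms are placeholder functions rather than real semantic-type generators.

```python
import copy


def anonymize_paths(document, rules):
    """Apply rules ({path: transform}) to a JSON-like dict and return a copy.

    Paths that do not exist in the document are skipped; keys not named by
    any rule are left untouched.
    """
    result = copy.deepcopy(document)
    for path, transform in rules.items():
        keys = path[2:].split(".") if path.startswith("$.") else path.split(".")
        parent = result
        for key in keys[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break  # intermediate key missing: nothing to transform
        else:
            if isinstance(parent, dict) and keys[-1] in parent:
                parent[keys[-1]] = transform(parent[keys[-1]])
    return result


if __name__ == "__main__":
    metadata = {"contact": {"email": "a@corp.test", "phone": "555-0100"}, "plan": "pro"}
    rules = {
        "$.contact.email": lambda _: "fake@example.com",
        "$.shipping.address": lambda _: "1 Example Street",  # absent, so skipped
    }
    print(anonymize_paths(metadata, rules))
```

Working on a deep copy keeps the source document unchanged, which mirrors the read-from-source, write-to-target split described on this page.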
Validation rules
Brume validates the configuration at startup and refuses to run on:

- A column with `strategy: FAKE` or `MASK` without a `type`.
- A column with `strategy: NULLIFY` declared on a `NOT NULL` column.
- A `type: JSONB` without `json_paths`.
- A table referenced in `anonymization.tables` that isn’t reachable from `extraction.tables` (you’d be transforming nothing).
- A `linked_columns` entry referencing a column also declared with a different strategy.
Run brume dry-run to catch all of these without touching the target.
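For a rough idea of what these checks involve, the sketch below re-implements the rules that can be verified from the file alone, working on a configuration already parsed into a Python dict. It is not Brume's validator: `validate_config` is a hypothetical helper, the `NOT NULL` and reachability rules are omitted because they need the database schema, and the linked-column check assumes that "a different strategy" means anything other than `FAKE`.

```python
def validate_config(config):
    """Return a list of human-readable problems found in a parsed brume.yml."""
    problems = []
    anonymization = config.get("anonymization", {})
    declared = {}  # (table, column) -> strategy

    for table in anonymization.get("tables", []):
        for column in table.get("columns", []):
            key = (table["table"], column["name"])
            label = f"{key[0]}.{key[1]}"
            strategy = column.get("strategy")
            declared[key] = strategy
            if strategy in ("FAKE", "MASK") and "type" not in column:
                problems.append(f"{label}: {strategy} requires a type")
            if column.get("type") == "JSONB" and not column.get("json_paths"):
                problems.append(f"{label}: JSONB requires json_paths")

    for link in anonymization.get("linked_columns", []):
        for ref in link.get("columns", []):
            strategy = declared.get((ref["table"], ref["column"]))
            # Assumption: a linked column may only be declared as FAKE.
            if strategy is not None and strategy != "FAKE":
                problems.append(
                    f'{ref["table"]}.{ref["column"]}: linked under '
                    f'{link["semantic_key"]} but declared as {strategy}'
                )
    return problems
```

A check like this can run in CI before `brume dry-run`, but it complements the built-in validation rather than replacing it.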
Full example
    extraction:
      fk_depth: 3
      tables:
        - table: orders
          filter: "created_at >= '2025-01-01'"
        - table: order_items

    anonymization:
      linked_columns:
        - semantic_key: user_email
          columns:
            - table: users
              column: email
            - table: audit_logs
              column: user_email

      tables:
        - table: users
          columns:
            - name: id
              strategy: FPE_ID
            - name: email
              strategy: FAKE
              type: EMAIL
            - name: phone
              strategy: MASK
              type: PHONE
            - name: notes
              strategy: NULLIFY
            - name: metadata
              strategy: FAKE
              type: JSONB
              json_paths:
                - path: $.shipping.address
                  type: ADDRESS

        - table: audit_logs
          columns:
            - name: user_email
              strategy: FAKE
              type: EMAIL
            - name: ip
              strategy: MASK
              type: IP_ADDRESS

- CLI commands: how to run `brume plan`, `execute`, `dry-run`.
- `.env` variables: connections and secrets.