dbt has changed how teams work with data. You write SQL models, dbt compiles and runs them, and versioning, testing, and documentation are all handled as code.
What is dbt
dbt transforms data inside the warehouse using SELECT statements. It takes care of the DDL, the dependencies between models, the tests, and the documentation.
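To make "takes care of the DDL" concrete, here is a rough sketch of the kind of statement dbt might issue for a model named stg_orders that is materialized as a view. This is an illustration under assumptions, not dbt's literal output: the target schema `analytics` is hypothetical, the column list is shortened, and the exact statement varies by adapter and materialization.

```sql
-- Illustrative only: the exact DDL depends on the adapter and materialization.
-- "analytics" is a hypothetical target schema; you never write this yourself.
CREATE VIEW analytics.stg_orders AS
SELECT
    id AS order_id,
    status
FROM raw.orders;
```

The point is that the model file contains only the SELECT; the surrounding CREATE statement is generated at run time.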
dbt models
-- models/staging/stg_orders.sql
WITH source AS (
    SELECT * FROM {{ source('raw', 'orders') }}
)
SELECT
    id AS order_id,
    user_id AS customer_id,
    created_at AS order_date,
    amount_cents / 100.0 AS amount_eur,
    status
FROM source
WHERE status != 'test'
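Dependencies between models come from `ref()`. As a minimal sketch, a downstream model could aggregate the staging model above; the file name `fct_daily_revenue.sql` and its columns are hypothetical, and the cast assumes `order_date` is a timestamp.

```sql
-- models/marts/fct_daily_revenue.sql (hypothetical model, for illustration)
-- ref('stg_orders') tells dbt to build stg_orders before this model.
SELECT
    CAST(order_date AS DATE) AS order_day,
    COUNT(*) AS order_count,
    SUM(amount_eur) AS revenue_eur
FROM {{ ref('stg_orders') }}
GROUP BY 1
```

Because the dependency is declared through `ref()` rather than a hard-coded table name, dbt can work out the build order on its own.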
Tests
# schema.yml
version: 2
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests: [unique, not_null]
      - name: amount_eur
        tests: [not_null]
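Besides the generic tests declared in YAML, dbt also supports singular tests: a SQL file in the `tests/` directory whose query returns the rows that violate a rule. The sketch below assumes a business rule that order amounts are never negative; both the rule and the file name are hypothetical.

```sql
-- tests/assert_no_negative_amounts.sql (hypothetical file name)
-- The test fails if this query returns any rows.
SELECT
    order_id,
    amount_eur
FROM {{ ref('stg_orders') }}
WHERE amount_eur < 0
```

Running `dbt test` executes this query together with the tests defined in schema.yml.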
Incremental models
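-- unique_key lets dbt update existing rows with the same order_id instead of duplicating them.
{{ config(materialized='incremental', unique_key='order_id') }}
SELECT * FROM {{ ref('stg_orders') }}
{% if is_incremental() %}
-- On incremental runs, only process rows newer than what is already in the target table.
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}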
Summary
dbt has become a standard tool for transformations in the warehouse, with SQL models, tests, and documentation all managed as code.
dbt, sql, transformations, analytics engineering