dbt (Data Build Tool)

dbt (Data Build Tool) is a SQL-first transformation framework that brings software engineering practices (version control, testing, documentation, and modular design) to data transformation pipelines. Analytics engineers define data models as SELECT statements, which dbt compiles, executes against the warehouse, and documents automatically. It is widely treated as the standard "T" in ELT pipelines.

What Is dbt?

Why dbt Matters for AI and Data Engineering

dbt Core Concepts

Models (SQL Transformations):

    -- models/staging/stg_orders.sql
    {{ config(materialized='view') }}  -- or 'table', 'incremental'

    SELECT
        order_id,
        customer_id,
        order_total,
        CAST(created_at AS DATE) AS order_date
    FROM {{ source('raw', 'orders') }}  -- references raw source table

    -- models/marts/customer_features.sql
    {{ config(materialized='table') }}

    SELECT
        c.customer_id,
        COUNT(o.order_id) AS order_count,
        SUM(o.order_total) AS lifetime_value,
        AVG(o.order_total) AS avg_order_value,
        MAX(o.order_date) AS last_order_date
    FROM {{ ref('stg_customers') }} c  -- ref() resolves dependency
    LEFT JOIN {{ ref('stg_orders') }} o
        ON c.customer_id = o.customer_id
    GROUP BY 1
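The way ref() turns model files into a dependency graph can be sketched in a few lines of Python. This is only an illustration of the idea, not dbt's actual parser: the MODELS dictionary, the regular expression, and the stg_customers query are assumptions made for the example, and the topological sort comes from Python's standard graphlib module (Python 3.9+).

```python
import re
from graphlib import TopologicalSorter

# Hypothetical in-memory stand-ins for model files; the stg_customers
# query is assumed, since that model is not shown in the article.
MODELS = {
    "stg_orders": "SELECT order_id FROM {{ source('raw', 'orders') }}",
    "stg_customers": "SELECT customer_id FROM {{ source('raw', 'customers') }}",
    "customer_features": (
        "SELECT c.customer_id FROM {{ ref('stg_customers') }} c "
        "LEFT JOIN {{ ref('stg_orders') }} o ON c.customer_id = o.customer_id"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def build_graph(models):
    """Map each model name to the set of models it references via ref()."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}


def run_order(models):
    """Return model names ordered so that dependencies come first."""
    return list(TopologicalSorter(build_graph(models)).static_order())


graph = build_graph(MODELS)
order = run_order(MODELS)
print(order)
```

Because customer_features references both staging models, any valid order places it last; the two staging models have no ref() calls, so their relative order is not constrained.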

Testing:

    # models/staging/stg_orders.yml
    version: 2
    models:
      - name: ...
        columns:
          - name: ...
            tests: ...
          - name: ...
            tests:
              - relationships:
                  to: ref('stg_customers')
                  field: customer_id
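A dbt test is, in essence, a query that selects the rows violating an expectation, and the test passes when that query returns nothing. The sketch below imitates that idea with Python's built-in sqlite3 module. The sample rows, the failing_rows helper, and the choice to check order_id for uniqueness and nulls are assumptions for illustration; only the relationship between customer_id and stg_customers comes from the schema file above, and the SQL is an approximation rather than what dbt generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_customers (customer_id INTEGER);
    CREATE TABLE stg_orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO stg_customers VALUES (1), (2);
    -- Deliberately dirty sample data: a duplicate id, an orphan customer, a NULL id.
    INSERT INTO stg_orders VALUES (10, 1), (11, 2), (11, 2), (12, 99), (NULL, 1);
""")


def failing_rows(sql):
    """Run a test query; an empty result means the test passes."""
    return conn.execute(sql).fetchall()


# Uniqueness check (assumed column: order_id).
unique_failures = failing_rows("""
    SELECT order_id
    FROM stg_orders
    WHERE order_id IS NOT NULL
    GROUP BY order_id
    HAVING COUNT(*) > 1
""")

# Not-null check (assumed column: order_id).
not_null_failures = failing_rows(
    "SELECT order_id, customer_id FROM stg_orders WHERE order_id IS NULL"
)

# Relationship check: every customer_id must exist in stg_customers.
relationship_failures = failing_rows("""
    SELECT o.customer_id
    FROM stg_orders o
    LEFT JOIN stg_customers c ON o.customer_id = c.customer_id
    WHERE o.customer_id IS NOT NULL
      AND c.customer_id IS NULL
""")

print(unique_failures, not_null_failures, relationship_failures)
```

Each list holds the offending rows, which is what makes a failing test actionable: the result points at the specific records to investigate.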

Incremental Models:

    {{ config(materialized='incremental', unique_key='order_id') }}

    SELECT
        order_id,
        customer_id,
        order_total,
        created_at
    FROM {{ source('raw', 'orders') }}

    {% if is_incremental() %}
    WHERE created_at > (SELECT MAX(created_at) FROM {{ this }})
    {% endif %}
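The effect of the incremental filter and unique_key can be imitated with sqlite3 in Python. This is a simplified sketch under stated assumptions, not how dbt materializes the model: the raw_orders and target tables and the run_incremental function are hypothetical, an empty target stands in for the first run (when is_incremental() would be false), and SQLite's INSERT OR REPLACE on a primary key stands in for the upsert keyed on order_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, order_total REAL, created_at TEXT);
    -- PRIMARY KEY plays the role of unique_key='order_id'.
    CREATE TABLE target (order_id INTEGER PRIMARY KEY, order_total REAL, created_at TEXT);
""")


def run_incremental():
    """Load only rows newer than the target's latest created_at; return rows processed."""
    (high_water,) = conn.execute("SELECT MAX(created_at) FROM target").fetchone()
    query = "SELECT order_id, order_total, created_at FROM raw_orders"
    params = ()
    if high_water is not None:  # later runs: apply the incremental filter
        query += " WHERE created_at > ?"
        params = (high_water,)
    rows = conn.execute(query, params).fetchall()
    # Upsert: a row with an existing order_id replaces the stored one.
    conn.executemany("INSERT OR REPLACE INTO target VALUES (?, ?, ?)", rows)
    return len(rows)


conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 20.0, "2024-01-01"), (2, 35.0, "2024-01-02")],
)
first = run_incremental()   # full load of both rows
second = run_incremental()  # nothing new to process

conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(3, 15.0, "2024-01-03"), (2, 40.0, "2024-01-04")],  # new order plus a newer version of order 2
)
third = run_incremental()
final = conn.execute("SELECT order_id, order_total FROM target ORDER BY order_id").fetchall()
print(first, second, third, final)
```

The second run touches no rows at all, which is the point of the pattern: cost scales with the volume of new data rather than with the size of the full history.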

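Macros (Reusable SQL Functions):

    -- macros/cents_to_dollars.sql
    {% macro cents_to_dollars(column_name) %}
        -- Divide by 100.0: on warehouses with integer division (e.g. Postgres),
        -- an integer column divided by 100 would drop the cents.
        ({{ column_name }} / 100.0)::NUMERIC(10,2)
    {% endmacro %}

    -- Usage in model:
    SELECT {{ cents_to_dollars('price_cents') }} AS price_dollars
    FROM orders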
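Macros are expanded into plain SQL text at compile time, before anything reaches the warehouse. The Python sketch below mimics that expansion with a regular expression. It is a toy stand-in for Jinja rendering, not dbt's compiler: the MACROS registry, the CALL pattern, and compile_sql are hypothetical names, and the pattern only handles a single quoted argument.

```python
import re


def cents_to_dollars(column_name):
    """Return the SQL fragment the macro expands to."""
    # 100.0 rather than 100: guards against integer division on e.g. Postgres.
    return f"({column_name} / 100.0)::NUMERIC(10,2)"


# Hypothetical registry mapping macro names to Python functions.
MACROS = {"cents_to_dollars": cents_to_dollars}

# Matches {{ macro_name('argument') }} with exactly one quoted argument.
CALL = re.compile(r"\{\{\s*(\w+)\('([^']*)'\)\s*\}\}")


def compile_sql(template):
    """Replace each macro call in the template with its expanded SQL."""
    return CALL.sub(lambda m: MACROS[m.group(1)](m.group(2)), template)


compiled = compile_sql(
    "SELECT {{ cents_to_dollars('price_cents') }} AS price_dollars FROM orders"
)
print(compiled)
```

The warehouse only ever sees the expanded expression, so a macro adds no runtime cost; it is purely a way to keep repeated SQL logic in one place.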

dbt Commands:

dbt vs Alternatives

Tool         | SQL-first      | Testing  | Docs           | Orchestration      | Best For
dbt          | Yes (only SQL) | Built-in | Auto-generated | External (Airflow) | Analytics engineering
Apache Spark | No             | Custom   | Manual         | Airflow/Prefect    | Big data transforms
Dataform     | Yes (SQL+JS)   | Built-in | Good           | GCP-native         | Google Cloud teams
Pandas       | No (Python)    | Custom   | Manual         | Standalone         | Ad-hoc analysis

dbt is the SQL transformation standard that brought software engineering discipline to the analytics stack. By treating SQL SELECT statements as version-controlled, tested, documented code artifacts rather than one-off scripts, dbt enables data teams to build reliable feature pipelines, training datasets, and business intelligence that maintain quality and reproducibility at enterprise scale.
