DJ provides powerful lineage visualization at both model and column levels, helping you understand data flow and dependencies across your dbt project.
The Data Explorer visualizes your dbt models as an interactive lineage graph, showing how models connect and depend on each other.
Note: The Data Explorer displays model-level lineage. For column-level lineage, see the Column Lineage section below.
From the Command Palette:
- Press `Cmd+Shift+P` / `Ctrl+Shift+P`
- Type "DJ: Data Explorer"
- Press Enter
Keyboard shortcut:
- Mac: `Cmd+Shift+L`
- Windows/Linux: `Ctrl+Shift+L`
From the context menu:
- Right-click a model file (`.model.json`, `.sql`, or `.yml`) in the explorer
- Select "DJ: Data Explorer"
[Demo: opening Data Explorer, navigating and zooming the graph, expanding nodes, and running models from the graph]
The Data Explorer shows your dbt models as an interactive directed acyclic graph (DAG):
Visual Elements:
- Nodes: Each model is represented as a colored node
- Edges: Arrows show dependencies (data flow)
- Layers: Models organized by type (staging → intermediate → mart)
Example from jaffle_shop:
The lineage graph shows how `int__sales__orders__enriched` depends on:
- `stg__sales__orders__standardized` (staging layer)
- `stg__customers__profiles__clean` (staging layer)
- `stg__sales__stores__locations` (staging layer)
And how `mart__sales__reporting__revenue` depends on `int__sales__orders__enriched`.
Pan and Zoom:
- Pan: Click and drag canvas
- Zoom: Scroll wheel or pinch gesture
- Fit View: Click "Fit to View" button
Node Selection:
- Click node to highlight upstream/downstream
- Click to open model file and switch to the model's lineage
- Hover for quick info tooltip
The Data Explorer toolbar provides additional controls for customizing your view:
Auto-Sync Toggle:
- Enable "Auto-sync" to automatically update the graph when you switch between model files
- When enabled, opening a different `.model.json`, `.sql`, or `.yml` file updates the graph to show that model's lineage
- Disabled by default to prevent unwanted graph changes
- Can also be configured via the `dj.dataExplorer.autoRefresh` setting
Split View Mode:
- Click the split view icon to enable side-by-side layout
- Shows the lineage graph alongside query results or compilation logs
- Useful for comparing model structure with query output
- Click again to return to single-panel view
Refresh Button:
- Manually refresh the lineage graph to reflect latest changes
- Useful after running `dbt parse` or modifying model dependencies
- Icon animates while loading
By default, the Data Explorer displays a model with its immediate upstream (parent) and downstream (child) models. You can progressively explore the lineage by expanding individual nodes.
Expanding Nodes:
Each node displays expand buttons (+ icons) when it has additional upstream or downstream models not currently shown:
- Upstream expand button (left side): Shows models that feed into this model
- Downstream expand button (right side): Shows models that depend on this model
- Click the + button to reveal additional levels of lineage
- Expansion happens per-node, allowing you to explore specific branches
Example: Starting from `mart__sales__reporting__revenue`:
- Initially shows immediate upstream: `int__sales__orders__enriched`
- Click upstream expand on `int__sales__orders__enriched` to reveal:
  - `stg__sales__orders__standardized`
  - `stg__customers__profiles__clean`
  - `stg__sales__stores__locations`
- Click upstream expand on staging models to reveal source tables: `raw_orders`, `raw_customers`, `raw_stores`
This progressive expansion lets you focus on relevant parts of the lineage without overwhelming the view.
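The per-node expansion described above can be sketched as a small set operation over a parent map. This is only an illustration, not DJ's implementation: the `PARENTS` dictionary, the function names, and the pairing of each staging model with a specific raw table are assumptions made for the example.

```python
# Hypothetical parent map based on the jaffle_shop example above:
# each key is a model, each value lists its direct upstream models.
# Which staging model reads which raw table is assumed for illustration.
PARENTS = {
    "mart__sales__reporting__revenue": ["int__sales__orders__enriched"],
    "int__sales__orders__enriched": [
        "stg__sales__orders__standardized",
        "stg__customers__profiles__clean",
        "stg__sales__stores__locations",
    ],
    "stg__sales__orders__standardized": ["raw_orders"],
    "stg__customers__profiles__clean": ["raw_customers"],
    "stg__sales__stores__locations": ["raw_stores"],
}


def initial_view(focus):
    """Start with the focus model and its immediate upstream models."""
    return {focus, *PARENTS.get(focus, [])}


def can_expand_upstream(visible, node):
    """A node would offer a + button while any of its parents is still hidden."""
    return any(parent not in visible for parent in PARENTS.get(node, []))


def expand_upstream(visible, node):
    """Reveal one more upstream level for a single node; other branches stay as they are."""
    return visible | set(PARENTS.get(node, []))


view = initial_view("mart__sales__reporting__revenue")
view = expand_upstream(view, "int__sales__orders__enriched")
print(sorted(view))
```

Because each call only adds the parents of one node, the raw tables stay hidden until a staging node is expanded, which mirrors the branch-by-branch exploration described above.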
Each model node in the Data Explorer displays icon buttons for quick actions:
View Columns:
- Click the columns icon (grid icon) on any node
- Opens Column Lineage panel for that model
- Shows column-level dependencies and transformations
Compile Model:
- Click the gear icon on a model node
- Compiles the model and displays compilation logs
- Shows generated SQL in the results panel
- Note: Source nodes don't have compile/run buttons
Run Query:
- Click the play circle icon on a model node
- Executes a query against the compiled model
- If model needs compilation, automatically compiles first
- Shows query results with data preview and SQL view
Expand Lineage:
- Click the + icon on nodes that have additional upstream/downstream models
- Progressively reveals more of the lineage graph
- Left side expands upstream dependencies
- Right side expands downstream dependents
Select Model:
- Click on a node to select it and view details
- Opens the model file in the editor
- Model becomes the active context for the graph
The Data Explorer includes panels for viewing query results and compilation logs:
Query Results Panel:
- Appears when you run a query from a model node
- Data View: Shows query results in a tabular format
  - Displays column names and row data
  - Shows row count and execution time
  - Supports scrolling through results
- SQL View: Displays the compiled SQL that was executed
  - Syntax-highlighted SQL
  - Shows the exact query sent to the database
  - Useful for debugging and understanding generated SQL
- Toggle between Data and SQL views using the tab buttons
- Maximize/restore the panel for better visibility
Compilation Logs Panel:
- Shows real-time compilation output when compiling a model
- Displays dbt compilation messages and any errors
- Automatically appears when compilation starts
- Shows success/failure status with visual indicators
- Can be closed after reviewing the output
From the editor toolbar:
- Open any `.model.json` file
- Click the "Column Lineage" button in the editor toolbar
From the Command Palette:
- With a `.model.json` file active
- Run "DJ: Column Lineage"
From the Data Explorer:
- Click the columns icon button on a model node
- Column Lineage opens for that model
[Demo: opening Column Lineage, understanding transformation types (raw/passthrough/renamed/derived), and tracing column origins]
Column Lineage traces individual columns through transformations:
Transformation Types:
- Raw (⬜ Gray)
  - Source columns from raw tables
  - Not yet transformed
  - Example: Columns in `raw_orders`, `raw_customers`
- Passthrough (🟢 Green)
  - Column passes unchanged through transformation
  - Example: `customer_id` → `customer_id`
- Renamed (🟡 Yellow)
  - Column renamed but values unchanged
  - Example: `id` → `customer_id`
- Derived (🟠 Orange)
  - New column from expression, calculation, or aggregation
  - Examples:
    - Calculation: `first_name || ' ' || last_name` → `full_name`
    - Aggregation: `SUM(amount)` → `total_revenue`
    - Date extraction: `EXTRACT(YEAR FROM order_date)` → `order_year`
Example from jaffle_shop:
In `mart__sales__reporting__revenue`:
- `order_id` is passthrough from `int__sales__orders__enriched`
- `order_year` is derived from `EXTRACT(YEAR FROM order_date)`
- `total_dollars` is passthrough (but originally derived in intermediate layer)
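One way to read the four transformation types is as a small decision rule applied to a single column mapping. The function below is a sketch of that reading, not DJ's actual classifier; `classify_column` and its parameters are hypothetical names chosen for the example.

```python
def classify_column(source_name, target_name, expression=None):
    """Return the transformation type for one column mapping.

    source_name is None for a column that comes straight from a raw table.
    expression is the SQL text when the column is computed, otherwise None.
    """
    if source_name is None:
        return "raw"
    if expression is not None:
        return "derived"
    if source_name != target_name:
        return "renamed"
    return "passthrough"


print(classify_column(None, "order_total"))           # raw
print(classify_column("customer_id", "customer_id"))  # passthrough
print(classify_column("id", "customer_id"))           # renamed
print(classify_column("order_date", "order_year",
                      "EXTRACT(YEAR FROM order_date)"))  # derived
```

The type describes a single hop. A column such as `total_dollars` can be passthrough in the mart while still being derived one layer upstream.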
Backward Tracing:
- Click a column in the target model
- Graph highlights the column and its immediate upstream connections
- For derived columns: Hover over the derived badge area to see the transformation
- A tooltip displays the SQL expression or transformation logic
- Follow the highlighted path backwards to source
Viewing Transformations:
- Derived columns with expressions show a tooltip when you hover over the badge area
- The tooltip displays:
- SQL expression used to create the column
- Aggregation functions (SUM, COUNT, AVG, etc.)
- Calculations and formulas
- Date extractions and transformations
- Simple mappings (raw, passthrough, renamed) don't show transformation tooltips since the relationship is direct
Example: Tracing `order_total_dollars` in `int__sales__orders__enriched`:
- Shows it's derived from: `CAST(stg__sales__orders__standardized.order_total_cents AS DECIMAL(10,2)) / 100.0`
- Which comes from `order_total_cents` in staging
- Which comes from `order_total` in the source table `raw_orders`
Forward Tracing:
- Select column
- See all downstream uses
- Identify impacted models
Example: Tracing `order_date` forward shows it's used in:
- `mart__sales__reporting__revenue` (for year/month/quarter extraction)
- `int__sales__analytics__monthly_revenue` (for rollup)
- `int__sales__analytics__rolling_30day` (for lookback window)
Visual Elements:
- Column Nodes: Individual columns displayed as separate nodes with color-coded badges
- Model Information: Each node shows the model name that contains the column
- Transformation Edges: Lines connecting columns across models
- Color Coding: Badges show transformation type (raw/passthrough/renamed/derived)
- Expression Tooltips: Hover over derived column badges to see transformation details
Viewing Details:
- Hover column node: Highlights the column and its immediate connections
- Hover derived badge: Shows SQL expression for derived columns in a tooltip
- Hover edge: Highlights immediately connected nodes only
DJ automatically expands bulk column selects:
`all_from_model`
```json
{ "type": "all_from_model", "model": "stg__sales__orders__standardized" }
```
Expands to all columns from specified model
`dims_from_model`
```json
{ "type": "dims_from_model", "model": "stg__sales__orders__standardized" }
```
Expands to dimension columns only
`fcts_from_model`
```json
{ "type": "fcts_from_model", "model": "stg__sales__orders__standardized" }
```
Expands to fact/measure columns only
Example from jaffle_shop:
In `int__products__analytics__product_popularity`:
```json
{
  "type": "dims_from_model",
  "model": "stg__sales__items__order_details",
  "include": ["product_sku", "product_type", "item_id"]
}
```
Column Lineage shows this expands to three dimension columns and traces each back to its source.
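The expansion of a bulk select can be pictured as a filter over the referenced model's columns. The sketch below makes several assumptions: the `CATALOG` structure, the `dim`/`fct` tags, the `order_id` and `item_price_cents` columns, and the reading of `include` as an allow-list are all made up for illustration and may not match how DJ stores or resolves column metadata.

```python
# Hypothetical column catalog: model name -> {column name: "dim" or "fct"}.
# The column kinds and the two extra columns are invented for this example.
CATALOG = {
    "stg__sales__items__order_details": {
        "item_id": "dim",
        "product_sku": "dim",
        "product_type": "dim",
        "order_id": "dim",
        "item_price_cents": "fct",
    }
}

# Which column kind each bulk select type keeps (None means keep everything).
KIND_FILTER = {
    "all_from_model": None,
    "dims_from_model": "dim",
    "fcts_from_model": "fct",
}


def expand_select(select, catalog=CATALOG):
    """Expand one bulk select entry into an ordered list of column names."""
    wanted_kind = KIND_FILTER[select["type"]]
    columns = catalog[select["model"]]
    names = [name for name, kind in columns.items() if wanted_kind in (None, kind)]
    include = select.get("include")
    if include is not None:
        # Assumed semantics: keep only the listed columns, in the listed order.
        names = [name for name in include if name in names]
    return names


print(expand_select({
    "type": "dims_from_model",
    "model": "stg__sales__items__order_details",
    "include": ["product_sku", "product_type", "item_id"],
}))
```

With the example entry above, the result is the three listed dimension columns, each of which can then be traced individually.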
From any file:
- Place cursor on column name in SQL or `.model.json`
- Run "DJ: Find Column Origin"
- Column Lineage opens with column highlighted
- Trace back to source table
Impact Analysis:
- "If I change this source column, what breaks?"
- Trace downstream to find affected models
Example: Changing `ordered_at` in `raw_orders`:
- Affects `order_date` in `stg__sales__orders__standardized`
- Which affects `order_year`, `order_month`, `order_quarter` in `mart__sales__reporting__revenue`
- Which affects multiple dashboard queries
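Impact analysis of this kind amounts to a forward walk over column-level edges. The following sketch uses a breadth-first search over a hand-written edge map that copies the example above; the `DOWNSTREAM` structure and the function names are hypothetical and do not reflect how DJ stores lineage internally.

```python
from collections import deque

# Hypothetical column-level edges taken from the example above:
# (model, column) -> list of (model, column) entries that read from it.
DOWNSTREAM = {
    ("raw_orders", "ordered_at"): [
        ("stg__sales__orders__standardized", "order_date"),
    ],
    ("stg__sales__orders__standardized", "order_date"): [
        ("mart__sales__reporting__revenue", "order_year"),
        ("mart__sales__reporting__revenue", "order_month"),
        ("mart__sales__reporting__revenue", "order_quarter"),
    ],
}


def impacted_columns(start, edges=DOWNSTREAM):
    """Breadth-first walk returning every column reachable downstream of start."""
    visited = {start}
    found = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


def impacted_models(start, edges=DOWNSTREAM):
    """Collapse the impacted columns into a sorted list of distinct model names."""
    return sorted({model for model, _ in impacted_columns(start, edges)})


print(impacted_models(("raw_orders", "ordered_at")))
```

The visited set keeps the walk from revisiting a column that is reachable through more than one path, which matters once several intermediate models read the same upstream column.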
Debugging:
- "Where does this value come from?"
- Trace upstream to source table
Example: Investigating incorrect `revenue_tier` classification:
- Trace to expression: `CASE WHEN order_total_dollars >= 15.00 THEN 'High'...`
- Trace `order_total_dollars` to intermediate layer
- Trace to cents-to-dollars conversion in staging
- Trace to `order_total` in source
Documentation:
- "What transformations are applied?"
- Export lineage graph for documentation
Optimization:
- "Is this column used downstream?"
- Identify unused columns to remove
Example: Check if `tax_paid_cents` is used:
- Trace forward through all models
- Find it's only used in `mart__sales__reporting__profitability`
- Safe to remove from other intermediate models
Both Data Explorer and Column Lineage support auto-refresh on editor tab changes.
How it works:
- Switch to a `.model.json`, `.sql`, or `.yml` file
- The active lineage view (Data Explorer or Column Lineage) detects the file change
- Graph updates to show the current model's context
Configuration:
```jsonc
{
  // Auto-refresh Data Explorer on tab change
  "dj.dataExplorer.autoRefresh": true,
  // Auto-refresh Column Lineage on tab change
  "dj.columnLineage.autoRefresh": true
}
```
Defaults:
- `dj.dataExplorer.autoRefresh`: `false` (disabled by default)
- `dj.columnLineage.autoRefresh`: `true` (enabled by default)
Large Projects (100+ models):
- Use progressive node expansion to explore specific branches
- Avoid expanding all nodes at once
- Disable auto-refresh if graph updates are slow
Deep Lineages (10+ levels):
- Expand nodes progressively rather than all at once
- Zoom in on specific sections of interest
- Focus on critical paths through your pipeline
Check `manifest.json`:
- Ensure dbt has been run: `dbt parse` in terminal
- Verify `target/manifest.json` exists in your dbt project
- Check for compilation errors in Output panel
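The manifest checks above can also be scripted. This is a minimal sketch that only assumes the `target/manifest.json` location mentioned above and a top-level `nodes` object in the file; `check_manifest` is a hypothetical helper and is not part of DJ or dbt.

```python
import json
from pathlib import Path


def check_manifest(project_dir):
    """Return (ok, message) describing whether target/manifest.json looks usable."""
    manifest_path = Path(project_dir) / "target" / "manifest.json"
    if not manifest_path.is_file():
        return False, f"{manifest_path} not found; run `dbt parse` first"
    try:
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return False, f"{manifest_path} is not valid JSON: {exc}"
    if not isinstance(manifest, dict):
        return False, f"{manifest_path} does not contain a JSON object"
    node_count = len(manifest.get("nodes", {}))
    if node_count == 0:
        return False, f"{manifest_path} contains no nodes"
    return True, f"{manifest_path} loaded with {node_count} nodes"


ok, message = check_manifest(".")
print(message)
```

Run it from the dbt project root. A "not found" message points to the first step (running `dbt parse`), while a JSON error suggests the file was truncated or is still being written.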
Ensure model is compiled:
- Open `.model.json` file
- Run "DJ: Compile Model" from Command Palette
- Check for validation errors in editor
Check dependencies:
- Verify upstream models exist
- Ensure manifest includes dependencies
- Run `dbt parse` in terminal to rebuild manifest
DJ lineage graphs use dbt's `manifest.json`:
- Parses `target/manifest.json` for dependencies
- Respects dbt's `ref()` and `source()` relationships
- Updates automatically when manifest changes
- Compatible with all dbt features (packages, macros, etc.)
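To make the manifest dependency concrete, here is a rough sketch of how model-level edges could be read from it. It relies on the `nodes` → `depends_on` → `nodes` layout of dbt's manifest; whether DJ reads the file this way is an assumption, and the sample manifest below (including the `jaffle_shop` package name and the `raw` source name in the unique IDs) is made up for illustration.

```python
def model_edges(manifest):
    """Return (upstream_id, downstream_id) pairs from each node's depends_on.nodes list."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        for parent_id in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent_id, unique_id))
    return edges


# Hypothetical, heavily trimmed manifest; a real one carries many more fields.
sample_manifest = {
    "nodes": {
        "model.jaffle_shop.stg__sales__orders__standardized": {
            "depends_on": {"nodes": ["source.jaffle_shop.raw.raw_orders"]},
        },
        "model.jaffle_shop.int__sales__orders__enriched": {
            "depends_on": {
                "nodes": ["model.jaffle_shop.stg__sales__orders__standardized"],
            },
        },
    }
}

for upstream, downstream in model_edges(sample_manifest):
    print(f"{upstream} -> {downstream}")
```

Each `ref()` or `source()` call in a model ends up as one entry in that model's `depends_on` list, so the edge list is enough to draw the model-level graph.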
- Visual Editor Guide - Create models visually
- Tutorial - Build a complete pipeline
- Model Types - Understand model dependencies