Overview

Transforming data with dbt

Data transformation in Y42 builds on dbt (data build tool), an open-source tool for transforming data inside your data warehouse. With dbt, teams define, maintain, and test data assets as modular SQL queries. This approach supports high-quality data models and simplifies the management of data transformations.

Y42 uses its own asset execution strategy, Virtual Data Builds, which provides a streamlined, cost-efficient way to run dbt models. By integrating Git into its internals and version-controlling code and data together, Y42's build execution engine keeps code and data in sync over time.

There are several types of transform assets:

  • dbt Models: Models are SQL queries that define your transformed data. Models can be chained by referencing other models or source assets.
  • Snapshots: Snapshots capture and track changes in your data over time, addressing the challenge of managing Slowly Changing Dimensions (type 2).
  • Exposures: Exposures provide a structured way to reference downstream data usage, such as in dashboards, applications, or data science projects. Exposures link together relevant upstream assets.
  • Analyses: Analyses offer a flexible way to execute SQL statements without materializing the result set in your warehouse.
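
To make the idea of chaining models more concrete, here is a minimal Python sketch of how references between models imply a build order. It is an illustration only, not how dbt or Y42 resolves dependencies internally. The model names (`stg_orders`, `stg_customers`, `customer_revenue`), the `raw` schema, and the helper functions are hypothetical. The only dbt construct assumed is the `{{ ref('model_name') }}` syntax used to reference another model.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL text. Downstream models point at
# upstream ones with dbt's {{ ref('...') }} syntax.
MODELS = {
    "stg_orders": "select id, customer_id, amount from raw.orders",
    "stg_customers": "select id, name from raw.customers",
    "customer_revenue": (
        "select c.name, sum(o.amount) as revenue "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on c.id = o.customer_id "
        "group by c.name"
    ),
}

# Matches {{ ref('name') }} or {{ ref("name") }} and captures the name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def upstream_models(sql: str) -> set[str]:
    """Return the names of all models referenced in a SQL string."""
    return set(REF_PATTERN.findall(sql))


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so that every model comes after the models it references."""
    graph = {name: upstream_models(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in build_order(MODELS):
        print(name)
```

In this sketch, `customer_revenue` always comes after both staging models because it references them. The two staging models do not depend on each other, so their relative order is not significant.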