Source data freshness

Monitoring source data freshness.

Overview

Source freshness tests enable you to assess the timeliness of data in source tables. These checks are valuable for verifying the health of your upstream data before running your DAGs.

Setting up source freshness checks

In Y42, you configure source freshness tests with the error_after parameter inside the freshness block of a source YAML file. You must also provide a loaded_at_field, the column used to calculate freshness.

These settings are hierarchical, meaning a configuration set at the source level will apply to all tables within that source unless overridden.
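To make the roles of loaded_at_field and error_after concrete, here is a minimal Python sketch of the comparison a freshness check performs. It is an illustration, not Y42's implementation: it assumes freshness is measured as the current time minus the newest loaded_at_field value, and the helper is_stale, the PERIODS table, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup (not a Y42 API): converts a period name to a timedelta.
PERIODS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def is_stale(max_loaded_at, error_after, now):
    """Return True when the newest row is older than the error_after threshold."""
    threshold = error_after["count"] * PERIODS[error_after["period"]]
    return now - max_loaded_at > threshold


# Sample values chosen for illustration: the newest row is 18 hours old.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
latest_row = datetime(2024, 1, 1, 18, 0, tzinfo=timezone.utc)

stale_24h = is_stale(latest_row, {"count": 24, "period": "hour"}, now)
stale_12h = is_stale(latest_row, {"count": 12, "period": "hour"}, now)
print(stale_24h, stale_12h)
```

With an 18-hour-old newest row, a 24-hour threshold passes while a 12-hour threshold fails, which is why a stricter table-level setting can flag a table that the source-level default would accept.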

sources/pizza_shop.yml

version: 2

sources:
  - name: pizza_shop
    database: raw

    freshness: # default freshness
      error_after: {count: 24, period: hour}

    loaded_at_field: _etl_loaded_at

    tables:
      - name: customers # this will use the freshness defined above

      - name: orders # this will use the more specific freshness below
        freshness: # make this a little more strict
          error_after: {count: 12, period: hour}
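The override rule in this example can be sketched as a simple fallback lookup. The Python below is a hedged illustration rather than Y42's resolution logic: the source dict is a hand-written copy of sources/pizza_shop.yml, effective_settings is a hypothetical helper, and the sketch assumes a table-level freshness block replaces the source-level block as a whole.

```python
# Hand-written copy of sources/pizza_shop.yml as a dict, so the example
# needs no YAML parser.
source = {
    "name": "pizza_shop",
    "freshness": {"error_after": {"count": 24, "period": "hour"}},
    "loaded_at_field": "_etl_loaded_at",
    "tables": [
        {"name": "customers"},
        {
            "name": "orders",
            "freshness": {"error_after": {"count": 12, "period": "hour"}},
        },
    ],
}


def effective_settings(source, table):
    """Hypothetical helper: table-level keys win, missing keys fall back to the source."""
    return {
        key: table.get(key, source.get(key))
        for key in ("freshness", "loaded_at_field")
    }


resolved = {t["name"]: effective_settings(source, t) for t in source["tables"]}

for name, settings in resolved.items():
    after = settings["freshness"]["error_after"]
    print(
        f"{name}: error after {after['count']} {after['period']}(s), "
        f"loaded_at_field={settings['loaded_at_field']}"
    )
```

Under these assumptions, customers resolves to the 24-hour default, orders resolves to its own 12-hour threshold, and both tables inherit _etl_loaded_at from the source level.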