Data Checker Rule Result Capture & Context

Updated by Bryce Avery

Purpose

This document outlines the requirements and expected outcomes for capturing detailed, contextual metadata alongside data validation results. The goal is to allow Data Stewards to quickly identify the root cause and ownership of data issues, improving resolution time and data quality governance.

Audience

  • Data Engineers and Developers implementing the Data Checker solution.
  • Data Stewards responsible for data quality management and issue resolution.

Prerequisites

  • Access to the Data Checker validation framework configuration.
  • Understanding of the data source systems and entities being validated.
  • Reference to the tracking spreadsheets for context.

Process Steps: Data Result Implementation

The following steps define the technical and functional requirements for implementing the new result capture process, based on the Acceptance Criteria (AC).

Each step below gives the step number, the action, and the details or expected outcome.

Step 1: Capture Core Metadata

For every validation result (a record flagged by a rule), the system must capture the following metadata:

- Rule and Rule Description (the check that was triggered).

- Entity (e.g., Student, Staff) and Affected Records (key identifier of the record).

- Source System (the system of origin for the data).

- Timestamp (when the validation occurred).
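
The metadata fields listed above could be carried on each result as a small record type. The following is a minimal sketch only: the class name `ValidationResult`, the helper `capture_result`, the field names, and the sample values (`R-001`, `STU-10042`, `SIS`) are assumptions for illustration and are not part of the Data Checker framework.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ValidationResult:
    """One record flagged by one rule, with its core metadata (hypothetical shape)."""
    rule: str               # rule name or ID (the format is an assumption)
    rule_description: str   # the check that was triggered
    entity: str             # e.g. "Student", "Staff"
    affected_record: str    # key identifier of the flagged record
    source_system: str      # system of origin for the data
    validated_at: datetime  # when the validation occurred


def capture_result(rule, rule_description, entity, affected_record, source_system):
    """Build a result and stamp it with the current UTC time."""
    return ValidationResult(
        rule=rule,
        rule_description=rule_description,
        entity=entity,
        affected_record=affected_record,
        source_system=source_system,
        validated_at=datetime.now(timezone.utc),
    )


# Illustrative values only; real rule IDs and record keys will differ.
result = capture_result(
    rule="R-001",
    rule_description="Date of birth must not be in the future",
    entity="Student",
    affected_record="STU-10042",
    source_system="SIS",
)
```

Making the record immutable (`frozen=True`) is one way to keep captured results from being altered after the fact; storing the timestamp in UTC avoids ambiguity when source systems sit in different time zones.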

Step 2: Capture Severity Context

The result capture process must categorize and group all results based on Severity. The three required categories are:

- Error (Critical issues requiring immediate fix).

- Warning (Potential issues that need review but aren't blocking).

- Info (Informational messages or minor deviations).
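
As an illustration of the grouping requirement, the three severity categories can be modeled as an enumeration and results bucketed by it. This is a sketch under assumptions: the names `Severity` and `group_by_severity`, and the dictionary shape of each result, are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "Error"      # critical issues requiring immediate fix
    WARNING = "Warning"  # potential issues that need review but are not blocking
    INFO = "Info"        # informational messages or minor deviations


def group_by_severity(results):
    """Return {Severity: [result, ...]} with every category present, even if empty."""
    groups = {severity: [] for severity in Severity}
    for result in results:
        # Severity(...) raises ValueError for any label outside the three categories.
        groups[Severity(result["severity"])].append(result)
    return groups


# Illustrative results; the keys "rule", "record", and "severity" are assumptions.
sample_results = [
    {"rule": "R-001", "record": "STU-1", "severity": "Error"},
    {"rule": "R-002", "record": "STU-2", "severity": "Warning"},
    {"rule": "R-001", "record": "STU-3", "severity": "Error"},
]
grouped = group_by_severity(sample_results)
```

Pre-populating all three groups means a view built on this output can always show Error, Warning, and Info sections, including empty ones.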

Step 3: Enable Root Cause Linking

The system must provide a mechanism to link multiple individual validation instances to a single root cause.

Example: If 500 records fail because of one bad value in a shared code table, all 500 results should be linked to that single code table issue.
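
One possible shape for this many-to-one link is a mapping from each result ID to a single root cause ID. The sketch below replays the 500-record example; the class `RootCauseLinks`, its method names, and the identifiers (`RC-1`, `RES-0`, and so on) are assumptions made for illustration.

```python
class RootCauseLinks:
    """Many-to-one links: each result points to at most one root cause."""

    def __init__(self):
        self.descriptions = {}  # root_cause_id -> human-readable description
        self.cause_of = {}      # result_id -> root_cause_id

    def register(self, root_cause_id, description):
        """Record a root cause so that results can be linked to it."""
        self.descriptions[root_cause_id] = description

    def link(self, result_id, root_cause_id):
        """Link a result to a registered root cause, replacing any earlier link."""
        if root_cause_id not in self.descriptions:
            raise KeyError(f"unknown root cause: {root_cause_id}")
        self.cause_of[result_id] = root_cause_id

    def results_for(self, root_cause_id):
        """Return every result ID linked to the given root cause."""
        return [r for r, c in self.cause_of.items() if c == root_cause_id]


links = RootCauseLinks()
links.register("RC-1", "Invalid value in shared code table")

# 500 flagged records, all traced back to the same code table issue.
for i in range(500):
    links.link(f"RES-{i}", "RC-1")

linked = links.results_for("RC-1")
```

Because the link is stored on the result side, a result can never point to two root causes at once, which keeps the relationship strictly many-to-one.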

Step 4: Data Steward Handoff

Once results are captured with context, the information must be presented in a format that lets the Data Steward:

a) Clearly understand the root cause (Rule + Context).

b) Quickly assign ownership to the relevant system or team.
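
A handoff view along these lines could summarize results per rule and source system and suggest an owner for each group. This is a hedged sketch, not a prescribed report format: the function `build_handoff_summary`, the `OWNERS` lookup, the team names, and the sample data are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from source system to owning team.
OWNERS = {"SIS": "Student Records Team", "HR": "HR Systems Team"}


def build_handoff_summary(results, owners):
    """Summarize flagged records per (rule, source system), largest group first."""
    counts = Counter((r["rule"], r["source_system"]) for r in results)
    summary = []
    for (rule, source_system), count in counts.most_common():
        summary.append({
            "rule": rule,
            "source_system": source_system,
            "flagged_records": count,
            # Fall back to "Unassigned" when no owner is known for the system.
            "suggested_owner": owners.get(source_system, "Unassigned"),
        })
    return summary


# Illustrative results; keys and values are assumptions.
sample_results = [
    {"rule": "R-001", "source_system": "SIS", "record": "STU-1"},
    {"rule": "R-001", "source_system": "SIS", "record": "STU-2"},
    {"rule": "R-007", "source_system": "LMS", "record": "STU-9"},
]
summary = build_handoff_summary(sample_results, OWNERS)
```

Sorting by group size puts the issues affecting the most records at the top, and the explicit "Unassigned" fallback makes gaps in the ownership mapping visible instead of hiding them.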

Verification and Acceptance Criteria

A successful implementation meets the following criteria for Data Stewards:

  • Validation results are clearly displayed alongside metadata (Rule ID, description, affected records, system of origin).
  • The results are grouped and filterable by severity (Error, Warning, Info).
  • There is a function or field that allows for a many-to-one relationship between flagged records and a common root cause, such as a bad code table value.
