Create a Validation Definition
A Validation Definition is a fixed reference that links a Batch of data to an Expectation Suite. It can be run by itself to validate the referenced data against the associated Expectations for testing or data exploration. Multiple Validation Definitions can also be provided to a Checkpoint which, when run, executes Actions based on the Validation Results for each provided Validation Definition.
Prerequisites
- Python version 3.9 to 3.12.
- An installation of GX Core.
- A preconfigured Data Context. In this guide the variable context is assumed to contain your Data Context.
- A preconfigured Data Source, Data Asset, and Batch Definition connected to your data.
- A preconfigured Expectation Suite populated with Expectations.
Procedure
- Retrieve an Expectation Suite with Expectations.
  Update the value of expectation_suite_name in the following code with the name of your Expectation Suite. Then execute the code to retrieve that Expectation Suite:
  Python
  expectation_suite_name = "my_expectation_suite"
  expectation_suite = context.suites.get(name=expectation_suite_name)
- Retrieve the Batch Definition that describes the data to associate with the Expectation Suite.
  Update the values of data_source_name, data_asset_name, and batch_definition_name in the following code with the names of your previously defined Data Source, one of its Data Assets, and a Batch Definition for that Data Asset. Then execute the code to retrieve the Batch Definition:
  Python
  data_source_name = "my_data_source"
  data_asset_name = "my_data_asset"
  batch_definition_name = "my_batch_definition"
  batch_definition = (
      context.data_sources.get(data_source_name)
      .get_asset(data_asset_name)
      .get_batch_definition(batch_definition_name)
  )
- Create a ValidationDefinition instance using the Batch Definition, Expectation Suite, and a unique name.
  Update the value of definition_name with a descriptive name that indicates the purpose of the Validation Definition. Then execute the code to create your Validation Definition:
  Python
  import great_expectations as gx

  definition_name = "my_validation_definition"
  validation_definition = gx.ValidationDefinition(
      data=batch_definition, suite=expectation_suite, name=definition_name
  )
- Optional. Save the Validation Definition to your Data Context.
  Python
  validation_definition = context.validation_definitions.add(validation_definition)
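Because a Validation Definition is saved under a unique name, adding a second one with the same name to the same Data Context is expected to raise an error, for example when a setup script is re-run. The following is a minimal sketch of a get-or-add helper that avoids this. The function name get_or_add_validation_definition is hypothetical and not part of GX Core. It assumes only that context.validation_definitions provides all() and add(), and that each Validation Definition has a name attribute.

```python
def get_or_add_validation_definition(context, validation_definition):
    """Return the saved Validation Definition with the same name, or save this one.

    Hypothetical helper, not part of GX Core. It relies only on
    context.validation_definitions.all() and .add(), and on .name.
    """
    for existing in context.validation_definitions.all():
        if existing.name == validation_definition.name:
            # A definition with this name is already saved; reuse it unchanged.
            return existing
    # No saved definition uses this name yet, so add the new one.
    return context.validation_definitions.add(validation_definition)
```

With the variables from the procedure, the optional save step could then read validation_definition = get_or_add_validation_definition(context, validation_definition). Note the design choice: the helper returns the saved definition as it is, even if the new one references a different Batch Definition or Expectation Suite.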
Sample code
Python
import great_expectations as gx

context = gx.get_context()

# Retrieve an Expectation Suite
expectation_suite_name = "my_expectation_suite"
expectation_suite = context.suites.get(name=expectation_suite_name)

# Retrieve a Batch Definition
data_source_name = "my_data_source"
data_asset_name = "my_data_asset"
batch_definition_name = "my_batch_definition"
batch_definition = (
    context.data_sources.get(data_source_name)
    .get_asset(data_asset_name)
    .get_batch_definition(batch_definition_name)
)

# Create a Validation Definition
definition_name = "my_validation_definition"
validation_definition = gx.ValidationDefinition(
    data=batch_definition, suite=expectation_suite, name=definition_name
)

# Add the Validation Definition to the Data Context
validation_definition = context.validation_definitions.add(validation_definition)
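
As noted at the top of this guide, a Validation Definition can be run by itself. The sketch below wraps that call in a small helper that reduces the outcome to a few counts. The helper name summarize_validation is hypothetical and not part of GX Core. It assumes that run() accepts an optional batch_parameters keyword and returns a result object with a boolean success attribute and a results list whose items each carry their own success attribute. Check these names against the GX Core version you have installed.

```python
def summarize_validation(validation_definition, batch_parameters=None):
    """Run a Validation Definition and return a small summary dictionary.

    Hypothetical helper, not part of GX Core. batch_parameters is only needed
    when the Batch Definition expects runtime input, such as an in-memory
    dataframe or a specific partition.
    """
    result = validation_definition.run(batch_parameters=batch_parameters)
    # Count the individual Expectation results that did not pass.
    failed = sum(1 for item in result.results if not item.success)
    return {
        "success": result.success,
        "evaluated": len(result.results),
        "failed": failed,
    }
```

With the validation_definition from the sample code, a call such as summary = summarize_validation(validation_definition) would give a dictionary that can be printed or logged during testing or data exploration.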