Version: 0.18.9

Add input validation and type checking for a Custom Expectation

Expectations will typically be configured using input parameters. These parameters are required to provide your Custom Expectation with the context it needs to Validate your data. Ensuring that these requirements are fulfilled is the purpose of type checking and validating your input parameters.

For example, we might describe the expected fraction of null values as mostly=.05. Because mostly is a fraction of a single whole, any value above 1 is impossible and should throw an error. Another example would be if we want to indicate that the mean of a column adheres to a minimum value bound, such as min_value=5. In this case, attempting to pass in a non-numerical value should clearly throw an error.
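As a rough illustration of these two checks, here is a minimal standalone sketch. The helper names check_mostly and check_min_value are hypothetical and are not part of Great Expectations; they only show the kind of condition that input validation enforces.

```python
def _is_number(value) -> bool:
    """True for int and float values; bool is excluded on purpose."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_mostly(mostly):
    """Hypothetical helper: a fraction of a whole must lie in [0, 1]."""
    if not _is_number(mostly):
        raise TypeError("mostly must be a number")
    if not 0 <= mostly <= 1:
        raise ValueError("mostly must be between 0 and 1")
    return mostly


def check_min_value(min_value):
    """Hypothetical helper: a lower bound must be numeric."""
    if not _is_number(min_value):
        raise TypeError("min_value must be a number")
    return min_value
```

With these helpers, check_mostly(1.5) raises a ValueError and check_min_value("five") raises a TypeError, while check_mostly(0.05) and check_min_value(5) pass through unchanged.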

This guide walks you through adding validation and type checking to the input parameters of the Custom Expectation built in the guide for how to create a Custom Column Aggregate Expectation. When you have completed this guide, you will have implemented a method that validates that the input parameters provided to this Custom Expectation satisfy the requirements of the Custom Expectation's code.

Decide what to validate

As a general rule, we want to validate any of our input parameters and success keys that are explicitly used by our Expectation class. In the case of our example Expectation expect_column_max_to_be_between_custom, we've defined four parameters to validate:

  • min_value: An integer or float defining the lowest acceptable bound for our column max
  • max_value: An integer or float defining the highest acceptable bound for our column max
  • strict_min: A boolean value defining whether our column max is (strict_min=False) or is not (strict_min=True) allowed to equal the min_value
  • strict_max: A boolean value defining whether our column max is (strict_max=False) or is not (strict_max=True) allowed to equal the max_value

What don't we need to validate? You may have noticed we're not validating whether the column parameter has been set. Great Expectations implicitly handles the validation of certain parameters universal to each class of Expectation, so you don't have to!

Define the Validation method

We define the validate_configuration(...) method of our Custom Expectation class to ensure that the input parameters constitute a valid configuration and do not contain illogical or incorrect values. For example, if min_value is greater than max_value, max_value=True, or strict_min="Joe", we want to throw an exception. To do this, we're going to write a series of assert statements to catch invalid values for our parameters.

To begin with, we want to create our validate_configuration(...) method and ensure that a configuration is set:

Python
# Imports needed at the top of your Custom Expectation module
from typing import Optional

from great_expectations.core.expectation_configuration import ExpectationConfiguration
from great_expectations.exceptions import InvalidExpectationConfigurationError

# Method of your Custom Expectation class
def validate_configuration(
    self, configuration: Optional[ExpectationConfiguration] = None
) -> None:
    """
    Validates that a configuration has been set, and sets a configuration if it has yet to be set. Ensures that
    necessary configuration arguments have been provided for the validation of the expectation.
    Args:
        configuration (Optional[ExpectationConfiguration]): \
            An optional Expectation Configuration entry that will be used to configure the expectation
    Returns:
        None. Raises InvalidExpectationConfigurationError if the config is not validated successfully
    """

    # Setting up a configuration
    super().validate_configuration(configuration)
    configuration = configuration or self.configuration

Next, we're going to implement the logic for validating the four parameters we identified above.

Access parameters and write assertions

First we need to access the parameters to be evaluated:

Python
    # .get() returns None for any parameter that was not provided
    min_value = configuration.kwargs.get("min_value")
    max_value = configuration.kwargs.get("max_value")
    strict_min = configuration.kwargs.get("strict_min")
    strict_max = configuration.kwargs.get("strict_max")
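Whether to index kwargs directly or call .get() matters when a parameter may be omitted. The sketch below uses a plain dict as a stand-in for configuration.kwargs (an assumption made so the example runs without Great Expectations) to show the difference.

```python
# A plain dict standing in for configuration.kwargs (assumption for illustration).
kwargs = {"column": "x", "min_value": 0, "max_value": 10}

# .get() returns None for a parameter the user did not supply...
strict_min = kwargs.get("strict_min")

# ...whereas direct indexing raises KeyError for the same missing key.
try:
    kwargs["strict_min"]
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```

Since the assertions that follow treat None as "not provided", reading optional parameters with .get() lets them reach those checks instead of failing earlier with a KeyError.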

Now we can begin writing the assertions to validate these parameters.

We're going to ensure that at least one of min_value or max_value is set:

Python
    try:
        assert (
            min_value is not None or max_value is not None
        ), "min_value and max_value cannot both be none"

Check that min_value and max_value are of the correct type:

Python
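        # bool is a subclass of int, so booleans are excluded explicitly
        assert min_value is None or (
            isinstance(min_value, (float, int)) and not isinstance(min_value, bool)
        ), "Provided min threshold must be a number"
        assert max_value is None or (
            isinstance(max_value, (float, int)) and not isinstance(max_value, bool)
        ), "Provided max threshold must be a number"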
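One subtlety when checking numeric types: in Python, bool is a subclass of int, so isinstance(True, (float, int)) evaluates to True. If a value such as max_value=True should be rejected, the check needs to exclude booleans explicitly. A minimal sketch, where is_number is a hypothetical helper name:

```python
def is_number(value) -> bool:
    """Hypothetical helper: True for int or float values, False for bool."""
    return isinstance(value, (float, int)) and not isinstance(value, bool)


print(isinstance(True, (float, int)))  # True, because bool subclasses int
print(is_number(True))                 # False
print(is_number(5), is_number(2.5))    # True True
```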

Verify that, if both min_value and max_value are set, min_value does not exceed max_value:

Python
        # Compare against None so that a bound of 0 is still checked
        if min_value is not None and max_value is not None:
            assert (
                min_value <= max_value
            ), "Provided min threshold must be less than or equal to max threshold"
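Note that 0 is falsy in Python, so a guard written as `if min_value and max_value:` is skipped whenever either bound is 0, while a comparison against None is not. The sketch below contrasts the two guards; bounds_checked_truthy and bounds_checked_explicit are hypothetical names used only for this illustration.

```python
def bounds_checked_truthy(min_value, max_value) -> bool:
    """Guard based on truthiness: skipped when either bound is 0."""
    return bool(min_value and max_value)


def bounds_checked_explicit(min_value, max_value) -> bool:
    """Guard based on None: runs whenever both bounds were provided."""
    return min_value is not None and max_value is not None


# With min_value=0 and max_value=-5, the bounds are inverted.
print(bounds_checked_truthy(0, -5))    # False: the ordering check would be skipped
print(bounds_checked_explicit(0, -5))  # True: the ordering check would run
```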

And assert that strict_min and strict_max, if provided, are of the correct type:

Python
        assert strict_min is None or isinstance(
            strict_min, bool
        ), "strict_min must be a boolean value"
        assert strict_max is None or isinstance(
            strict_max, bool
        ), "strict_max must be a boolean value"

If any of these fail, we raise an exception:

Python
    except AssertionError as e:
        raise InvalidExpectationConfigurationError(str(e))

Putting this all together, our validate_configuration(...) method verifies that all necessary inputs have been provided, that all inputs are of the correct types, and that they relate to each other correctly. If any of these conditions isn't met, it raises an exception.
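To see the whole flow in one place without installing Great Expectations, the sketch below reproduces the same checks as a standalone function. ConfigurationError and validate_kwargs are hypothetical stand-ins for InvalidExpectationConfigurationError and validate_configuration(...), and a plain dict stands in for configuration.kwargs. It illustrates the pattern and is not the library's implementation.

```python
from typing import Any, Dict


class ConfigurationError(Exception):
    """Hypothetical stand-in for InvalidExpectationConfigurationError."""


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(value, (float, int)) and not isinstance(value, bool)


def validate_kwargs(kwargs: Dict[str, Any]) -> None:
    """Raise ConfigurationError if the bounds or strictness flags are invalid."""
    min_value = kwargs.get("min_value")
    max_value = kwargs.get("max_value")
    strict_min = kwargs.get("strict_min")
    strict_max = kwargs.get("strict_max")

    try:
        assert (
            min_value is not None or max_value is not None
        ), "min_value and max_value cannot both be None"
        assert min_value is None or _is_number(
            min_value
        ), "Provided min threshold must be a number"
        assert max_value is None or _is_number(
            max_value
        ), "Provided max threshold must be a number"
        if min_value is not None and max_value is not None:
            assert (
                min_value <= max_value
            ), "Provided min threshold must be less than or equal to max threshold"
        assert strict_min is None or isinstance(
            strict_min, bool
        ), "strict_min must be a boolean value"
        assert strict_max is None or isinstance(
            strict_max, bool
        ), "strict_max must be a boolean value"
    except AssertionError as e:
        # Translate assertion failures into a configuration-specific error.
        raise ConfigurationError(str(e))


def is_valid(kwargs: Dict[str, Any]) -> bool:
    """Convenience wrapper: True if validate_kwargs accepts the kwargs."""
    try:
        validate_kwargs(kwargs)
    except ConfigurationError:
        return False
    return True
```

For instance, is_valid({"min_value": 0, "max_value": 10}) returns True, while a min_value greater than max_value, max_value=True, or strict_min="Joe" each make it return False. Because the checks rely on assert statements, they are skipped if Python is run with the -O flag.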

Verify your method

If you now run your file, print_diagnostic_checklist() will attempt to execute validate_configuration(...) using the input provided in your Example Cases.

If your input is successfully validated, and the rest of the logic in your Custom Expectation is already complete, you will see the following in your Diagnostic Checklist:

 ✔ Has basic input validation and type checking
    ✔ Custom 'assert' statements in validate_configuration

Congratulations!
🎉 You've successfully added input validation & type checking to a Custom Expectation! 🎉

Contribution (Optional)

The method implemented in this guide is an optional feature for Experimental Expectations, and a requirement for contribution to Great Expectations at Beta and Production levels.

If you would like to contribute your Custom Expectation to the Great Expectations codebase, please submit a Pull Request.

note

For more information on our code standards and contribution, see our guide on Levels of Maturity for Expectations.

To view the full script used in this page, see it on GitHub: