Version: 1.11.0

Create an Expectation

An Expectation is a verifiable assertion about your data. Expectations make implicit assumptions about your data explicit, and they provide a flexible, declarative language for describing expected behavior. They can help you better understand your data and help you improve data quality.

Prerequisites

Procedure


Choose an Expectation to create.

GX comes with many built-in Expectations to cover your data quality needs. You can find a catalog of these Expectations in the Expectation Gallery. When browsing the Expectation Gallery, you can filter the available Expectations by the data quality issue they address and by the Data Sources they support. There is also a search bar that lets you filter Expectations by matching text in their name or description.

In your code, you will find the classes for Expectations in the expectations module:
Python
```
from great_expectations import expectations as gxe
```
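
As a rough complement to browsing the Expectation Gallery, the names of the classes in a module can be listed from code. The sketch below is only an illustration: list_expectation_names is a hypothetical helper (not part of GX), it assumes the Expectation classes are exposed as ordinary module attributes whose names start with "Expect", and it uses a stand-in namespace so that it runs without GX installed.

```python
import inspect
from types import SimpleNamespace


def list_expectation_names(module):
    """Return the sorted names of classes in `module` that start with 'Expect'."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if inspect.isclass(obj) and name.startswith("Expect")
    )


# Stand-in for the real module so the sketch runs without GX installed.
fake_module = SimpleNamespace(
    ExpectColumnMaxToBeBetween=type("ExpectColumnMaxToBeBetween", (), {}),
    ExpectColumnValuesToNotBeNull=type("ExpectColumnValuesToNotBeNull", (), {}),
    helper_function=len,  # not a class, so it is filtered out
)

print(list_expectation_names(fake_module))
```

With GX installed, passing the gxe module imported above to the same helper should give a comparable list, possibly including base classes. The Expectation Gallery remains the authoritative catalog.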

Determine the Expectation's required parameters.

To determine the parameters your Expectation uses to evaluate data, reference the Expectation's entry in the Expectation Gallery. Under the Args section you will find a list of the parameters that are necessary for the Expectation to be evaluated, along with a description of the value that should be provided for each.

Parameters that indicate a column, list of columns, table, Data Source, or severity must be provided when the Expectation is created. All other parameters can be set when the Expectation is created or be assigned a dictionary lookup that will allow them to be set at runtime.

Optional. Determine the Expectation's other parameters.

In addition to the parameters that are required for an Expectation to evaluate data, Expectations also support some optional parameters. In the Expectation Gallery these are found under each Expectation's Other Parameters section.

These parameters are:
- meta: A dictionary of user-supplied metadata to store with an Expectation. This dictionary can be used to add notes about the purpose and intended use of an Expectation.
- mostly: A special argument that allows for fuzzy validation based on the proportion of rows that validate successfully. The value is a number between 0 and 1. If the proportion of successfully validated rows is at least the value set in the mostly parameter, the Expectation will return a success value of true.
- severity: Indicates the impact of the Expectation failing. Accepted values are critical, warning, or info. Defaults to critical if not explicitly set. You can trigger Actions based on severity levels or you can condition your data pipeline with the get_maximum_severity_failure helper method in the ExpectationSuiteValidationResult class. Note that if an Expectation fails to execute, the failure will be recorded as critical, regardless of the Expectation configuration, to bring your attention to the fact that your data is not being tested as intended.
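
The effect of mostly and severity can be sketched in plain Python. This is a simplified illustration of the ideas described above, not GX's implementation: passes_with_mostly, maximum_failure_severity, and SEVERITY_RANK are hypothetical names, and the ordering info < warning < critical is an assumption based on the descriptions of the severity levels.

```python
# Assumed ordering of severity levels, lowest impact first.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def passes_with_mostly(row_results, mostly=1.0):
    """Return True if the fraction of passing rows is at least `mostly`."""
    if not row_results:
        return True  # assumption: with no rows, there is nothing to fail
    return sum(row_results) / len(row_results) >= mostly


def maximum_failure_severity(results):
    """Return the highest severity among failed results, or None if all passed.

    `results` is a list of (success, severity) pairs.
    """
    failed = [severity for success, severity in results if not success]
    if not failed:
        return None
    return max(failed, key=SEVERITY_RANK.__getitem__)


rows = [True] * 19 + [False]  # 19 of 20 rows pass (95%)
print(passes_with_mostly(rows, mostly=0.9))   # True
print(passes_with_mostly(rows, mostly=0.99))  # False

suite_results = [(True, "critical"), (False, "info"), (False, "warning")]
print(maximum_failure_severity(suite_results))  # warning
```

A pipeline could, for example, stop only when the maximum failure severity is critical and log the result otherwise. In GX itself, that decision would be based on the validation result rather than on these stand-in functions.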
Create the Expectation.

Using the Expectation class you picked and the parameters you determined when referencing the Expectation Gallery, you can create your Expectation. Parameters can either be preset when the Expectation is created or provided at runtime.
In this example, the parameters are preset: the ExpectColumnMaxToBeBetween Expectation is created with its column, min_value, max_value, and severity defined in advance, while strict_min and strict_max keep their default values:
Python
preset_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="passenger_count", min_value=1, max_value=6, severity="warning"
)
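
To make the role of strict_min and strict_max more concrete, here is a minimal plain-Python sketch of a "column maximum is between" check. It is not GX's implementation: column_max_is_between is a hypothetical function, and it assumes that both strict flags default to False (inclusive bounds), which is worth confirming against the Expectation's entry in the Expectation Gallery.

```python
def column_max_is_between(values, min_value=None, max_value=None,
                          strict_min=False, strict_max=False):
    """Check whether max(values) lies within the given bounds.

    With strict_min/strict_max set to False (the assumed defaults),
    the bounds are inclusive; set to True, they become exclusive.
    """
    observed_max = max(values)
    if min_value is not None:
        too_low = (observed_max <= min_value) if strict_min else (observed_max < min_value)
        if too_low:
            return False
    if max_value is not None:
        too_high = (observed_max >= max_value) if strict_max else (observed_max > max_value)
        if too_high:
            return False
    return True


passenger_counts = [1, 2, 4, 6]  # illustrative data; the maximum is 6
print(column_max_is_between(passenger_counts, min_value=1, max_value=6))  # True
print(column_max_is_between(passenger_counts, min_value=1, max_value=6,
                            strict_max=True))  # False
```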
Runtime parameters are provided by passing a dictionary to the expectation_parameters argument of a Checkpoint's run() method.
To indicate which key in the expectation_parameters dictionary corresponds to a given parameter in an Expectation, define a lookup as the value of that parameter when the Expectation is created. To do this, pass a dictionary with the key $PARAMETER as the parameter's value. The value associated with the $PARAMETER key is the lookup used to find the parameter in the runtime dictionary.
In this example, ExpectColumnMaxToBeBetween is created for both the passenger_count and the fare fields, and the values for min_value and max_value in each Expectation will be passed in at runtime. To differentiate between the parameters for each Expectation, a more specific key is set for finding each parameter in the runtime expectation_parameters dictionary:
Python
passenger_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="passenger_count",
    min_value={"$PARAMETER": "expect_passenger_max_to_be_above"},
    max_value={"$PARAMETER": "expect_passenger_max_to_be_below"},
)
fare_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="fare",
    min_value={"$PARAMETER": "expect_fare_max_to_be_above"},
    max_value={"$PARAMETER": "expect_fare_max_to_be_below"},
)
The runtime expectation_parameters dictionary for the above example would look like:
Python
runtime_expectation_parameters = {
    "expect_passenger_max_to_be_above": 4,
    "expect_passenger_max_to_be_below": 6,
    "expect_fare_max_to_be_above": 10.00,
    "expect_fare_max_to_be_below": 500.00,
}
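
The lookup mechanism can be pictured as a simple substitution. The following sketch is a plain-Python illustration under that assumption and does not reproduce GX's internals: resolve_parameters is a hypothetical helper, and the Expectation's parameters are represented as an ordinary dictionary.

```python
def resolve_parameters(expectation_kwargs, runtime_parameters):
    """Replace each {"$PARAMETER": key} lookup with its runtime value.

    Raises KeyError if a lookup key is missing from `runtime_parameters`.
    """
    resolved = {}
    for name, value in expectation_kwargs.items():
        if isinstance(value, dict) and "$PARAMETER" in value:
            resolved[name] = runtime_parameters[value["$PARAMETER"]]
        else:
            resolved[name] = value  # preset values pass through unchanged
    return resolved


# Parameters of the passenger Expectation, written as a plain dictionary.
passenger_kwargs = {
    "column": "passenger_count",
    "min_value": {"$PARAMETER": "expect_passenger_max_to_be_above"},
    "max_value": {"$PARAMETER": "expect_passenger_max_to_be_below"},
}
runtime_parameters = {
    "expect_passenger_max_to_be_above": 4,
    "expect_passenger_max_to_be_below": 6,
    "expect_fare_max_to_be_above": 10.00,
    "expect_fare_max_to_be_below": 500.00,
}

print(resolve_parameters(passenger_kwargs, runtime_parameters))
# {'column': 'passenger_count', 'min_value': 4, 'max_value': 6}
```

Because each Expectation uses its own lookup keys, a single runtime dictionary can serve both Expectations without collisions. In GX, that dictionary is the one passed to the expectation_parameters argument of a Checkpoint's run() method.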

Sample code

Python
import great_expectations as gx

context = gx.get_context()
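# NOTE: set_up_context_for_example is not defined in this snippet. It is
# presumably a setup helper from the documentation's examples; define it
# or remove this call before running the code.
set_up_context_for_example(context)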

# All Expectations are found in the `gx.expectations` module.
# This Expectation has all values set in advance:
preset_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="passenger_count", min_value=1, max_value=6, severity="warning"
)

# In this case, two Expectations are created that will be passed
#  parameters at runtime, and unique lookups are defined for each
#  Expectation's parameters.

passenger_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="passenger_count",
    min_value={"$PARAMETER": "expect_passenger_max_to_be_above"},
    max_value={"$PARAMETER": "expect_passenger_max_to_be_below"},
)
fare_expectation = gx.expectations.ExpectColumnMaxToBeBetween(
    column="fare",
    min_value={"$PARAMETER": "expect_fare_max_to_be_above"},
    max_value={"$PARAMETER": "expect_fare_max_to_be_below"},
)

# A dictionary containing the parameters for both of the above
#   Expectations would look like:
runtime_expectation_parameters = {
    "expect_passenger_max_to_be_above": 4,
    "expect_passenger_max_to_be_below": 6,
    "expect_fare_max_to_be_above": 10.00,
    "expect_fare_max_to_be_below": 500.00,
}
