Version: 1.3.0

Create an Expectation

An Expectation is a verifiable assertion about your data. Expectations make implicit assumptions about your data explicit, and they provide a flexible, declarative language for describing expected behavior. They can help you better understand your data and help you improve data quality.

Prerequisites

Procedure

  1. Choose an Expectation to create.

    GX comes with many built-in Expectations to cover your data quality needs. You can find a catalog of these Expectations in the Expectation Gallery. When browsing the Expectation Gallery, you can filter the available Expectations by the data quality issue they address and by the Data Sources they support. A search bar also lets you filter Expectations by matching text in their name or description.

    In your code, you will find the classes for Expectations in the expectations module:

    Python
    from great_expectations import expectations as gxe
  2. Determine the Expectation's required parameters.

    To determine the parameters your Expectation uses to evaluate data, reference the Expectation's entry in the Expectation Gallery. Under the Args section you will find a list of the parameters that are necessary for the Expectation to be evaluated, along with a description of the value that should be provided for each.

    Parameters that indicate a column, list of columns, or a table must be provided when the Expectation is created. The values of these parameters are used to differentiate instances of the same Expectation class. All other parameters can either be set when the Expectation is created or be assigned a dictionary lookup that allows them to be set at runtime.

  3. Optional. Determine the Expectation's other parameters.

    In addition to the parameters that are required for an Expectation to evaluate data, all Expectations also support some standard parameters that determine how strictly Expectations are evaluated and permit the addition of metadata. In the Expectation Gallery these are found under each Expectation's Other Parameters section.

    These parameters are:

    Parameter    Purpose
    meta         A dictionary of user-supplied metadata to store with an Expectation. This dictionary can be used to add notes about the purpose and intended use of an Expectation.
    mostly       A special argument that allows for fuzzy validation of ColumnMapExpectations and MultiColumnMapExpectations based on the fraction of successfully validated rows, given as a value between 0 and 1. If the fraction of rows that pass is at least the mostly value, the Expectation returns a success value of true.
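
To make the effect of mostly concrete, here is a minimal plain-Python sketch of the threshold idea. It does not call GX and is not how GX implements the check; the helper name passes_with_mostly and the sample values are hypothetical, and details such as null handling are ignored.

```python
def passes_with_mostly(row_results, mostly=1.0):
    """Return True when the fraction of passing rows is at least `mostly`.

    Hypothetical helper for illustration only; not part of GX.
    """
    if not row_results:
        # Assumption for this sketch: nothing to check counts as a pass.
        return True
    success_fraction = sum(row_results) / len(row_results)
    return success_fraction >= mostly


# Made-up sample data: 9 of the 10 values fall inside the range 1..6.
passenger_counts = [1, 2, 3, 4, 5, 6, 2, 1, 3, 9]
row_results = [1 <= n <= 6 for n in passenger_counts]

strict_result = passes_with_mostly(row_results)               # every row must pass
lenient_result = passes_with_mostly(row_results, mostly=0.9)  # 90% of rows must pass

print(strict_result, lenient_result)
```

With a single out-of-range row, the strict check fails while the check with mostly=0.9 succeeds, which is the kind of tolerance mostly is meant to provide.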
  4. Create the Expectation.

    Using the Expectation class you picked and the parameters you determined when referencing the Expectation Gallery, you can create your Expectation.

    In this example, the ExpectColumnMaxToBeBetween Expectation is created with column, min_value, and max_value defined in advance, while strict_min and strict_max are left at their default values:

    Python
    # Uses the gxe alias imported in step 1.
    preset_expectation = gxe.ExpectColumnMaxToBeBetween(
        column="passenger_count", min_value=1, max_value=6
    )
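
Step 2 mentions that parameters other than the column or table can be assigned a dictionary lookup and filled in at runtime. The sketch below illustrates that idea in plain Python, assuming the lookup takes the form {"$PARAMETER": "<name>"}. The resolve_runtime_parameters helper and the lookup names are hypothetical and only mimic the substitution; they are not GX's implementation or API.

```python
PARAMETER_KEY = "$PARAMETER"  # assumed key that marks a runtime lookup


def resolve_runtime_parameters(expectation_kwargs, runtime_values):
    """Replace {"$PARAMETER": name} placeholders with runtime values.

    Hypothetical helper for illustration only; not part of GX.
    """
    resolved = {}
    for name, value in expectation_kwargs.items():
        if isinstance(value, dict) and PARAMETER_KEY in value:
            lookup_name = value[PARAMETER_KEY]
            if lookup_name not in runtime_values:
                raise KeyError(f"No runtime value supplied for {lookup_name!r}")
            resolved[name] = runtime_values[lookup_name]
        else:
            # Values such as the column name are fixed at creation time.
            resolved[name] = value
    return resolved


# The column is set up front; the bounds are deferred to runtime.
expectation_kwargs = {
    "column": "passenger_count",
    "min_value": {"$PARAMETER": "expect_passenger_max_to_be_above"},
    "max_value": {"$PARAMETER": "expect_passenger_max_to_be_below"},
}

# Values supplied later, when the data is actually validated.
runtime_values = {
    "expect_passenger_max_to_be_above": 1,
    "expect_passenger_max_to_be_below": 6,
}

resolved = resolve_runtime_parameters(expectation_kwargs, runtime_values)
print(resolved)
```

Because the column is what differentiates instances of the same Expectation class, it stays a literal value here, while the bounds can change from one run to the next without redefining the Expectation.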